This paper proposes a ship detection method based on a contour extraction network and a minimum bounding box generation algorithm. The contour extraction network extracts the contour of the ship target and generates a set of key points; the minimum bounding box generation algorithm then converts these key points into a rotated bounding box for the target. The rotated box describes the shape and orientation of a ship more accurately, improving the accuracy and reliability of detection. Experimental results show that the proposed method effectively detects ships in various orientations and postures and outperforms traditional axis-aligned rectangular box detection methods. The paper also compares rotated-bounding-box detection methods based on traditional machine learning with those based on deep learning, concluding that deep learning methods better handle ship detection in complex scenes, but require a large amount of training data and may incur a higher computational cost. Finally, the paper introduces the minimum bounding rectangle algorithm used in the instance segmentation process.
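The minimum bounding rectangle step named above can be sketched with the classical rotating-edge observation: the minimum-area rectangle enclosing a point set has one side collinear with an edge of the convex hull. This is a minimal, self-contained illustration, not the paper's implementation; function names are our own.

```python
import numpy as np

def _cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain over an (N, 2) point set."""
    pts = sorted(map(tuple, points))
    def build(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull
    lower, upper = build(pts), build(reversed(pts))
    return np.array(lower[:-1] + upper[:-1])

def min_area_rect(points):
    """Test each hull-edge orientation; the minimum-area enclosing
    rectangle is aligned with one of them."""
    hull = convex_hull(points)
    edges = np.roll(hull, -1, axis=0) - hull
    best_area, best_angle = np.inf, 0.0
    for a in np.arctan2(edges[:, 1], edges[:, 0]):
        c, s = np.cos(-a), np.sin(-a)
        proj = hull @ np.array([[c, -s], [s, c]]).T  # rotate edge onto x-axis
        extent = proj.max(axis=0) - proj.min(axis=0)
        area = extent[0] * extent[1]
        if area < best_area:
            best_area, best_angle = area, a
    return best_area, best_angle
```

Given the key points output by a contour network, the returned angle directly parameterizes the rotated detection box; production code would typically use `cv2.minAreaRect` instead.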
As a gradient-guided search method, differentiable architecture search greatly reduces computational cost and improves search speed compared with traditional reinforcement learning and evolutionary methods that search for network structures in discrete spaces. However, when the number of search epochs is too large, the searched architecture accumulates many skip connections, causing a sharp decline in network performance. To address this, this paper designs a staged search process: as the search progresses through its stages, different early-stopping rules are applied, so that cells at different positions in the network can take on different structures, effectively solving the performance collapse caused by skip connections. In addition, edge normalization is introduced on the connections between nodes, and the loss function is modified to enlarge the gap between the weight parameters of different operations, which effectively improves the stability of the architecture search. Experimental results show that this method trades a small increase in parameters and search time for an improvement in accuracy; validation accuracy on the CIFAR-10 and CIFAR-100 datasets increases by 0.15% and 1.36%, respectively.
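One plausible form of a stage-dependent early-stopping rule is to discretize the architecture parameters and count how many edges select the skip connection, with later stages tolerating fewer. This is a speculative sketch under assumed operation names and thresholds, not the paper's actual rule.

```python
import numpy as np

# Assumed candidate operation set; index 1 is the skip connection.
OPS = ["none", "skip_connect", "sep_conv_3x3", "dil_conv_3x3", "max_pool_3x3"]

def derived_op_counts(alpha):
    """alpha: (num_edges, num_ops) architecture parameters. Softmax is
    monotonic, so argmax over raw alpha picks each edge's top operation."""
    choices = alpha.argmax(axis=1)
    return {op: int((choices == i).sum()) for i, op in enumerate(OPS)}

def should_stop(alpha, stage, max_skips_per_stage=(3, 2, 1)):
    """Stop the current search stage once the derived cell contains
    more skip connections than the stage's budget allows."""
    skips = derived_op_counts(alpha)["skip_connect"]
    return skips > max_skips_per_stage[stage]
```

The thresholds `(3, 2, 1)` are illustrative; tightening the budget per stage is what lets cells at different depths settle into different structures.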
To address the low reliability of ship model recognition from remote sensing images, we propose a ship model recognition method based on multi-view learning. First, multi-view feature data is constructed with different feature operators. Then an SVM classifier is trained on each single-view feature set, while the multi-view data is fused and used to train a classifier with CPM-Nets. Finally, the outputs of the single-view classifiers and the multi-view classifier are fused by classifier aggregation to improve accuracy. Experiments show that the proposed method improves the accuracy of ship type recognition.
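The final aggregation step can be illustrated as a weighted soft vote over the class-probability outputs of the single-view and multi-view classifiers. This is a generic sketch of classifier aggregation, not the paper's specific fusion scheme.

```python
import numpy as np

def aggregate(prob_list, weights=None):
    """Fuse per-classifier class-probability arrays, each of shape
    (n_samples, n_classes), by a weighted average; predict by argmax."""
    probs = np.stack(prob_list)                 # (n_clf, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list))
    w = np.asarray(weights, float)
    w = w / w.sum()                             # normalize classifier weights
    fused = np.tensordot(w, probs, axes=1)      # (n_samples, n_classes)
    return fused.argmax(axis=1), fused
```

Weights could, for example, be set from each classifier's validation accuracy so that the CPM-Nets multi-view classifier and the stronger single-view SVMs dominate the vote.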
Representation learning for knowledge graphs aims to encode the semantic information of entities and relations as dense, low-dimensional real-valued vectors and to map them into the same low-dimensional space. Existing methods often focus on single-modal textual information and ignore the image modality, so entity feature information in images goes unused. Moreover, most knowledge graphs contain entity-related descriptive text that current multimodal knowledge representation learning methods do not exploit well. To this end, a multimodal knowledge representation learning method that incorporates description information is proposed. The method builds a knowledge representation learning model from multimodal (image and text) data and combines each entity's brief description to improve the representation of the multimodal data. Experimental results show that the method performs well on triple classification and link prediction tasks on the constructed WI-D dataset.
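To make the idea concrete, one common way to combine modality-specific entity embeddings is to fuse them into a single vector and score triples with a translation-based energy such as TransE. The abstract does not specify the scoring model, so the following is purely an assumed illustration; the fixed fusion weights stand in for what would normally be a learned gate.

```python
import numpy as np

def fuse(structure, image, description, weights=(0.5, 0.25, 0.25)):
    """Fuse per-modality entity embeddings into one vector.
    The fixed weights here are hypothetical placeholders."""
    return (weights[0] * structure + weights[1] * image
            + weights[2] * description)

def transe_score(h, r, t):
    """TransE energy: relations act as translations, h + r ≈ t,
    so a lower L1 distance means a more plausible triple."""
    return np.linalg.norm(h + r - t, ord=1)
```

Triple classification then reduces to thresholding this energy, and link prediction to ranking candidate tails `t` by it.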
Cross-modal retrieval has been widely applied in the vision-language field with considerable success, but research in the trajectory-text field is lacking. Moreover, current popular cross-modal retrieval models not only lack fine-grained semantic alignment between modalities but also ignore the influence of the text's grammatical structure on retrieval. To solve these problems, this paper proposes a dual-stream trajectory-text retrieval model combined with a graph neural network, uniting two cross-modal interaction mechanisms: (1) local alignment, in which trajectory points and words are encoded separately after passing through a masking module and then semantically aligned; and (2) global alignment, which introduces momentum contrastive learning to learn trajectory-text retrieval. Experimental results show that this hierarchical matching approach retains the efficiency of the dual-stream model while achieving higher accuracy than other cross-modal retrieval models, improving R@1 on the dataset by 3.2%-4.7%.
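The global-alignment component typically pairs a symmetric InfoNCE objective with a momentum (EMA) update of the key encoder, as in MoCo-style training. The sketch below shows those two pieces in isolation, under assumed shapes and hyperparameters; it is not the paper's training loop.

```python
import numpy as np

def momentum_update(q_params, k_params, m=0.999):
    """EMA update of the key (momentum) encoder's parameters."""
    return [m * k + (1.0 - m) * q for q, k in zip(q_params, k_params)]

def info_nce(traj_emb, text_emb, tau=0.07):
    """Contrastive loss over L2-normalized embeddings; matched
    trajectory-text pairs sit on the diagonal of the logit matrix."""
    a = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    b = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ b.T / tau
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))                      # -log P(match | row)
```

Minimizing this loss pulls matched trajectory and text embeddings together and pushes mismatched pairs apart, which is what the reported R@1 gains depend on.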
Event extraction is a key research direction in the field of information extraction. To improve extraction quality and address the inability of general event extraction methods to fully exploit textual feature information, an event extraction method that integrates trigger-word features is proposed. A remote trigger thesaurus is built to provide additional feature information for the event type classification model, enhancing its ability to discover event trigger words. The event argument extraction model then integrates event type and trigger-distance features to improve its representation learning. Finally, the event type classification model and the event argument extraction model are connected in series to complete event extraction. Experiments on the DuEE dataset show that our model outperforms the other models compared.
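The thesaurus and trigger-distance features can be illustrated with a simple lexicon lookup that, for each token, flags trigger hits and records the signed distance to the nearest trigger. This is a minimal sketch of the feature shape only; the paper's actual feature encoding is not specified in the abstract.

```python
def trigger_features(tokens, trigger_lexicon):
    """For each token: a trigger-hit flag from the thesaurus, and the
    signed distance to the nearest trigger (a common feature for
    argument extraction)."""
    hits = [i for i, t in enumerate(tokens) if t in trigger_lexicon]
    is_trigger = [t in trigger_lexicon for t in tokens]
    dist = [min((i - h for h in hits), key=abs) if hits else 0
            for i in range(len(tokens))]
    return is_trigger, dist
```

These per-token features would then be concatenated with the token embeddings fed to the downstream argument extraction model.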