Paper
31 January 2020
Research for image caption based on global attention mechanism
Tong Wu, Tao Ku, Hao Zhang
Proceedings Volume 11427, Second Target Recognition and Artificial Intelligence Summit Forum; 114272U (2020) https://doi.org/10.1117/12.2552711
Event: Second Target Recognition and Artificial Intelligence Summit Forum, 2019, Changchun, China
Abstract
Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), the main methods used in image captioning research, have developed rapidly. Nevertheless, the lack of global awareness in image captioning has not been fully resolved. The combination of bottom-up and top-down visual attention mechanisms has been widely used in image captioning and visual question answering. In this article, we propose an image captioning method based on a global attention mechanism. A global attention prior channel is added to the base architecture to extract global information features while local features are being learned. Attention to objects and other salient image regions is computed, and the global image features are enhanced accordingly.
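The abstract does not give the exact formulation, but the core idea (a global prior that re-weights local region features) can be sketched as follows. This is a minimal illustration, assuming mean-pooling for the global prior and dot-product salience scores; the function and variable names are hypothetical, not from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def global_attention(regions):
    """regions: (k, d) array of local CNN region features.

    Returns the attention distribution over regions and a
    globally-enhanced feature vector (illustrative sketch only).
    """
    g = regions.mean(axis=0)        # global prior: mean-pooled feature
    scores = regions @ g            # salience of each region w.r.t. global context
    alpha = softmax(scores)         # attention weights over the k regions
    context = alpha @ regions       # attention-weighted sum of local features
    enhanced = context + g          # fuse attended local info with the global prior
    return alpha, enhanced

rng = np.random.default_rng(0)
V = rng.normal(size=(5, 8))         # e.g. 5 image regions, 8-dim features
alpha, feat = global_attention(V)
```

In a full captioning model, `feat` would condition the RNN decoder at each step; here the pooled global feature stands in for the paper's learned global attention prior channel.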
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tong Wu, Tao Ku, and Hao Zhang "Research for image caption based on global attention mechanism", Proc. SPIE 11427, Second Target Recognition and Artificial Intelligence Summit Forum, 114272U (31 January 2020); https://doi.org/10.1117/12.2552711
KEYWORDS
Neural networks
Performance modeling
Feature extraction
Computer programming
Convolutional neural networks
Image compression
Visualization
