10 April 2018 Generating description with multi-feature fusion and saliency maps of image
Lisha Liu, Yuxuan Ding, Chunna Tian, Bo Yuan
Proceedings Volume 10615, Ninth International Conference on Graphic and Image Processing (ICGIP 2017); 106151D (2018) https://doi.org/10.1117/12.2304845
Event: Ninth International Conference on Graphic and Image Processing, 2017, Qingdao, China
Abstract
Generating a description for an image can be regarded as a form of visual understanding. The task spans artificial intelligence, machine learning, natural language processing, and many other areas. In this paper, we present a model that generates image descriptions using a recurrent neural network (RNN) with object attention and multiple image features. Deep recurrent neural networks perform very well in machine translation, so we use one to generate natural-sentence descriptions of images. The baseline form of the method uses a single convolutional neural network (CNN) trained on ImageNet to extract image features. However, we believe this feature cannot adequately capture the content of an image, since it may focus only on the object regions. We therefore add scene information to the image feature using a second CNN trained on Places205. Experiments show that the model with multiple features extracted by two CNNs performs better than the model with a single feature. In addition, we apply saliency weights to images to emphasize their salient objects. We evaluate our model on MSCOCO using public metrics, and the results show that it performs better than several state-of-the-art methods.
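As a rough illustration of the two ideas in the abstract (combining an object feature with a scene feature, and weighting an image by a saliency map), a minimal Python sketch is given here. It is not the authors' implementation. The abstract does not state the fusion rule or how the saliency weights are applied, so concatenation of L2-normalized vectors and element-wise pixel weighting are assumptions, and the names l2_normalize, fuse_features, and apply_saliency are hypothetical. Short lists stand in for the outputs of the ImageNet-trained and Places205-trained CNNs.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm; an all-zero vector is returned as zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse_features(object_feat: Sequence[float],
                  scene_feat: Sequence[float]) -> List[float]:
    """Fuse two feature vectors by concatenation (ASSUMED fusion rule).

    Each vector is normalized first so that neither source dominates
    purely because of its scale.
    """
    return l2_normalize(object_feat) + l2_normalize(scene_feat)


def apply_saliency(image: Sequence[Sequence[float]],
                   saliency: Sequence[Sequence[float]]) -> List[List[float]]:
    """Weight each pixel by its saliency value (ASSUMED to lie in [0, 1])."""
    if len(image) != len(saliency) or any(
        len(row) != len(srow) for row, srow in zip(image, saliency)
    ):
        raise ValueError("image and saliency map must have the same shape")
    return [
        [pixel * weight for pixel, weight in zip(row, srow)]
        for row, srow in zip(image, saliency)
    ]


# Toy stand-ins for the two CNN outputs (real features would be much longer).
object_feat = [3.0, 4.0]   # hypothetical ImageNet-CNN feature
scene_feat = [0.0, 5.0]    # hypothetical Places205-CNN feature
fused = fuse_features(object_feat, scene_feat)

# Toy 2x2 grayscale image and saliency map.
image = [[10.0, 20.0], [30.0, 40.0]]
saliency = [[1.0, 0.5], [0.0, 0.25]]
weighted = apply_saliency(image, saliency)
```

In this sketch the fused vector has the combined length of both inputs and would be what the RNN decoder receives, while the saliency-weighted image suppresses low-saliency pixels before feature extraction. Other fusion rules (for example, a weighted sum or a learned projection) would fit the abstract equally well.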
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lisha Liu, Yuxuan Ding, Chunna Tian, and Bo Yuan "Generating description with multi-feature fusion and saliency maps of image", Proc. SPIE 10615, Ninth International Conference on Graphic and Image Processing (ICGIP 2017), 106151D (10 April 2018); https://doi.org/10.1117/12.2304845
KEYWORDS
Image fusion
Content addressable memory
Feature extraction
Neural networks
Picosecond phenomena
Control systems
Image processing