Paper
1 March 2023 Semantic enhancement methods for image captioning
Luming Cui, Lin Li
Author Affiliations +
Proceedings Volume 12588, International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022); 1258807 (2023) https://doi.org/10.1117/12.2667270
Event: International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022), 2022, Chongqing, China
Abstract
Image captioning, a cross-modal study, aims to generating a description for a given image, which plays an important role in many fields like image retrieval and computer-assisted instruction. Currently, the challenge in image captioning is the limited quality of generated descriptions including insufficient utilization of image feature information and the limited language learning ability of the decoder. In this paper, we address the above problems by constructing a semantic enhancement module and a multi-round decoding mechanism to enhance the decoding ability of the model, which uses the Transformer model as the primary structure. To validate the efficacy of the model, we conducted intensive experiments on the MSCOCO2014 benchmark and evaluated its performance using five evaluation metrics. The experimental results show that the proposed method in this paper has improved to varying degrees on all five-evaluation metrics.
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Luming Cui and Lin Li "Semantic enhancement methods for image captioning", Proc. SPIE 12588, International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022), 1258807 (1 March 2023); https://doi.org/10.1117/12.2667270
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Visualization

Semantics

Information visualization

Image enhancement

Visual process modeling

Performance modeling

Data modeling

Back to Top