Presentation + Paper
TextCycleGAN: cyclical-generative adversarial networks for image captioning
12 April 2021
Abstract
In this study, we approach the problem of image captioning with cycle-consistent generative adversarial networks (CycleGANs). Because CycleGANs learn mappings between multiple domains and use a cycle consistency loss to let each mapping strengthen its dual, they show great promise for jointly learning image captioning and image synthesis, and thereby for producing a stronger image captioning framework. Historically, cycle consistency loss rested on the premise that an input should change little when mapped to another domain and back to its original; image captioning challenges this premise because the mapping between images and captions is many-to-many in both directions. TextCycleGAN overcomes this obstacle by enforcing cycle consistency in the feature space and is thereby able to perform well on both image captioning and image synthesis. We demonstrate its capability as an image captioning framework and discuss how its model architecture makes this possible.
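The abstract's key idea, enforcing cycle consistency on feature representations rather than on the raw inputs, can be sketched as follows. This is a hypothetical illustration, not the paper's actual loss: the abstract does not specify the distance metric or the encoders, so the function name `feature_cycle_loss`, the mean-squared-error choice, and the toy feature vectors below are all assumptions.

```python
import numpy as np

def feature_cycle_loss(feat_original, feat_cycled):
    """Cycle-consistency penalty computed in feature space.

    Illustrative sketch only: instead of demanding that an image mapped to a
    caption and back reproduce the exact pixels (impossible to require when
    the image<->caption mapping is many-to-many), the penalty compares the
    feature embedding of the original input with the embedding of the
    round-tripped result. Mean squared distance is one common choice.
    """
    diff = np.asarray(feat_original, dtype=float) - np.asarray(feat_cycled, dtype=float)
    return float(np.mean(diff ** 2))

# Toy example: stand-in feature vectors for an image before and after the
# image -> caption -> image cycle (real encoders would produce these).
rng = np.random.default_rng(0)
f_x = rng.standard_normal(128)                  # features of original image
f_cyc = f_x + 0.01 * rng.standard_normal(128)   # features after the cycle
loss = feature_cycle_loss(f_x, f_cyc)
```

Because the loss lives in feature space, two different but equally valid captions for the same image can both yield a near-zero penalty, which is what makes the cycle objective compatible with a many-to-many mapping.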
Conference Presentation
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mohammad R. Alam, Nicole A. Isoda, Mitch C. Manzanares, Anthony C. Delgado, and Antonius F. Panggabean "TextCycleGAN: cyclical-generative adversarial networks for image captioning", Proc. SPIE 11746, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III, 117460Z (12 April 2021); https://doi.org/10.1117/12.2585549