Presentation + Paper
TextCycleGAN: cyclical-generative adversarial networks for image captioning
12 April 2021
Abstract
In this study, we approach the problem of image captioning with cycle-consistent generative adversarial networks (CycleGANs). Because CycleGANs learn mappings between multiple domains and use a cycle consistency loss to let each mapping strengthen its dual, they show great promise for jointly learning image captioning and image synthesis, and thereby for producing a stronger image captioning framework. Historically, cycle consistency loss rested on the premise that an input should change little when mapped to another domain and back to its original; image captioning challenges this premise because the mapping between images and captions is many-to-many in both directions. TextCycleGAN overcomes this obstacle by enforcing cycle consistency in the feature space and is thereby able to perform well on both image captioning and image synthesis. We demonstrate its capability as an image captioning framework and discuss how its model architecture makes this possible.
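The abstract's key idea, enforcing cycle consistency on feature representations rather than on the raw inputs, can be sketched as follows. This is a hypothetical illustration, not the paper's actual loss: the abstract does not specify the distance metric or the encoders, so the function name `feature_cycle_loss`, the mean-squared-error choice, and the toy feature vectors below are all assumptions.

```python
import numpy as np

def feature_cycle_loss(feat_original, feat_cycled):
    """Cycle-consistency penalty computed in feature space.

    Illustrative sketch only: instead of demanding that an image mapped to a
    caption and back reproduce the exact pixels (impossible to require when
    the image<->caption mapping is many-to-many), the penalty compares the
    feature embedding of the original input with the embedding of the
    round-tripped result. Mean squared distance is one common choice.
    """
    diff = np.asarray(feat_original, dtype=float) - np.asarray(feat_cycled, dtype=float)
    return float(np.mean(diff ** 2))

# Toy example: stand-in feature vectors for an image before and after the
# image -> caption -> image cycle (real encoders would produce these).
rng = np.random.default_rng(0)
f_x = rng.standard_normal(128)                  # features of original image
f_cyc = f_x + 0.01 * rng.standard_normal(128)   # features after the cycle
loss = feature_cycle_loss(f_x, f_cyc)
```

Because the loss lives in feature space, two different but equally valid captions for the same image can both yield a near-zero penalty, which is what makes the cycle objective compatible with a many-to-many mapping.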
Conference Presentation
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mohammad R. Alam, Nicole A. Isoda, Mitch C. Manzanares, Anthony C. Delgado, and Antonius F. Panggabean "TextCycleGAN: cyclical-generative adversarial networks for image captioning", Proc. SPIE 11746, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III, 117460Z (12 April 2021); https://doi.org/10.1117/12.2585549