In this work, a video synthesis model based on Generative Adversarial Networks (Human GAN) is proposed; its objective is to generate photorealistic output by learning the mapping from an input source to an output video. While image-to-image generation is a well-studied problem, video synthesis remains comparatively unexplored. Directly applying existing image generation methods without accounting for temporal dynamics frequently yields temporally incoherent output with low visual quality. The proposed approach addresses this problem by carefully designing generators and discriminators combined with a spatio-temporal adversarial objective. Compared against strong baselines on public benchmarks, the proposed model generates temporally coherent videos with very few artifacts, and its results are more realistic on both quantitative and qualitative measures than those of existing baseline techniques.
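To make the idea of a spatio-temporal adversarial objective concrete, the sketch below shows one plausible formulation: a spatial discriminator scores individual frames, a temporal discriminator scores the coherence of a stack of consecutive frames, and the generator minimizes a weighted sum of both adversarial terms. The discriminators here are toy stand-ins (simple sigmoid scores rather than learned networks), and the function names and the weighting parameter `lam` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_disc(frame):
    # Toy per-frame realism score in (0, 1); a stand-in for a
    # learned CNN frame discriminator.
    return 1.0 / (1.0 + np.exp(-frame.mean()))

def temporal_disc(clip):
    # Toy clip-level score in (0, 1) that decreases as consecutive
    # frames differ more, i.e. it penalizes temporal incoherence.
    diffs = np.diff(clip, axis=0)
    return 1.0 / (1.0 + np.exp(np.abs(diffs).mean()))

def generator_loss(fake_clip, lam=1.0):
    # Non-saturating GAN generator loss combining a spatial term
    # (averaged over frames) and a temporal term (over the clip);
    # lam is an assumed weighting hyperparameter.
    l_spatial = -np.mean([np.log(spatial_disc(f) + 1e-8) for f in fake_clip])
    l_temporal = -np.log(temporal_disc(fake_clip) + 1e-8)
    return float(l_spatial + lam * l_temporal)

# A fake generated clip: 8 frames of 16x16 "video".
fake_clip = rng.normal(size=(8, 16, 16))
loss = generator_loss(fake_clip)
```

In a real training loop, both discriminators would be trained adversarially against the generator; the point of the combined objective is that the temporal term pushes the generator toward frame sequences that are coherent over time, not merely plausible frame by frame.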