Paper
3 January 2020
Speech enhancement based on spectrogram conditional generative adversarial networks
Proceedings Volume 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019); 113732S (2020) https://doi.org/10.1117/12.2557256
Event: Eleventh International Conference on Graphics and Image Processing, 2019, Hangzhou, China
Abstract
Voice is the main means of communicating and sharing information with others, and it brings great convenience to human life. Existing speech recognition classifiers suffer considerable performance degradation in the presence of environmental noise and accents. Most of these problems can be mitigated by training on large amounts of data. However, collecting large numbers of high-quality datasets in real life is time-consuming and expensive. To solve this problem, this paper proposes a data enhancement method suited to expanding small-sample speech image datasets. S-GAN is used to generate datasets that conform to the real distribution of the samples, and the GAN-train and GAN-test methods are used to evaluate the quality and diversity of the images generated by the network. In addition, a spatial transformer network (STN) is combined with a CNN framework to extract the informative part of the data for classification. The results show that this method can significantly improve the classification accuracy of speech recognition and lays a foundation for small-sample data enhancement.
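The abstract names GAN-train and GAN-test without defining them. As the terms are commonly used in the GAN evaluation literature, GAN-train fits a classifier on generated samples and scores it on real samples, while GAN-test fits on real samples and scores on generated ones. The sketch below illustrates only that bookkeeping. Everything in it is an assumption made for illustration and is not taken from the paper: the nearest-centroid classifier stands in for the paper's CNN, and the hypothetical `toy_data` helper produces two-dimensional Gaussian points in place of spectrogram images and S-GAN output.

```python
import numpy as np


def fit_centroids(x, y):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    labels = np.unique(y)
    centroids = np.stack([x[y == c].mean(axis=0) for c in labels])
    return labels, centroids


def accuracy(model, x, y):
    """Fraction of samples whose nearest centroid carries the true label."""
    labels, centroids = model
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[dists.argmin(axis=1)]
    return float((pred == y).mean())


def gan_train_score(x_gen, y_gen, x_real, y_real):
    """GAN-train: fit on generated samples, evaluate on real samples."""
    return accuracy(fit_centroids(x_gen, y_gen), x_real, y_real)


def gan_test_score(x_gen, y_gen, x_real, y_real):
    """GAN-test: fit on real samples, evaluate on generated samples."""
    return accuracy(fit_centroids(x_real, y_real), x_gen, y_gen)


def toy_data(rng, n_per_class, spread):
    """Hypothetical stand-in for image features: two Gaussian classes."""
    centers = np.array([[-3.0, 0.0], [3.0, 0.0]])
    x = np.concatenate(
        [c + spread * rng.standard_normal((n_per_class, 2)) for c in centers]
    )
    y = np.repeat(np.arange(2), n_per_class)
    return x, y


rng = np.random.default_rng(0)
x_real, y_real = toy_data(rng, 200, spread=1.0)  # plays the role of real data
x_gen, y_gen = toy_data(rng, 200, spread=0.5)    # plays the role of generator output

train_score = gan_train_score(x_gen, y_gen, x_real, y_real)
test_score = gan_test_score(x_gen, y_gen, x_real, y_real)
print(f"GAN-train accuracy: {train_score:.3f}")
print(f"GAN-test accuracy:  {test_score:.3f}")
```

Under this reading, a low GAN-train score suggests the generated set misses part of the real distribution or is unrealistic, while a low GAN-test score suggests the generated samples do not look like any real class. The exact protocol and classifier used in the paper are not stated in this abstract.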
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ru Han, Jianming Liu, and Mingwen Wang "Speech enhancement based on spectrogram conditional generative adversarial networks", Proc. SPIE 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019), 113732S (3 January 2020); https://doi.org/10.1117/12.2557256
PROCEEDINGS
10 PAGES

