Paper | 15 July 2022
W2V-ATT: research on text-dependent MDD method based on wav2vec2.0
Ruitao Li, Xiaochen Lai
Proceedings Volume 12258, International Conference on Neural Networks, Information, and Communication Engineering (NNICE 2022); 1225804 (2022) https://doi.org/10.1117/12.2639245
Event: International Conference on Neural Networks, Information, and Communication Engineering (NNICE 2022), 2022, Qingdao, China
Abstract
Mispronunciation Detection and Diagnosis (MDD) is a key component of Computer-Assisted Pronunciation Training (CAPT) systems. Mainstream MDD systems are built as automatic speech recognition (ASR) systems based on DNN-HMM, which require a large amount of labeled data for training. In this paper, the self-supervised pre-trained model wav2vec2.0 is applied to the MDD task: self-supervised pre-training learns general features from a large amount of unlabeled data, so only a small amount of labeled data is needed for fine-tuning in downstream applications. To exploit the prior text information, audio features are combined with text features through an attention mechanism, and both sources of information are used during decoding. Experiments on the publicly available L2-Arctic and TIMIT datasets yield satisfactory results.
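The abstract does not give implementation details, but the described fusion (audio frames attending to prior-text features so decoding sees both) can be illustrated with a minimal scaled dot-product cross-attention sketch. This is a hypothetical NumPy illustration, not the authors' implementation; the function name, the concatenation-based fusion, and all dimensions are assumptions for the example.

```python
import numpy as np

def cross_attention_fuse(audio_feats, text_feats):
    """Hypothetical sketch: audio frames (queries) attend to text/phoneme
    embeddings (keys and values), then the attended text context is fused
    with the audio features by concatenation.

    audio_feats: (T_audio, d) frame-level features, e.g. from wav2vec2.0
    text_feats:  (T_text, d)  embeddings of the prior (canonical) text
    returns:     (T_audio, 2*d) fused features for the decoder
    """
    d = text_feats.shape[-1]
    # Scaled dot-product attention scores: (T_audio, T_text)
    scores = audio_feats @ text_feats.T / np.sqrt(d)
    # Softmax over text positions (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Text context aligned to each audio frame: (T_audio, d)
    context = weights @ text_feats
    # Fuse audio and attended text information for decoding
    return np.concatenate([audio_feats, context], axis=-1)

rng = np.random.default_rng(0)
audio = rng.standard_normal((50, 8))   # 50 audio frames, 8-dim features
text = rng.standard_normal((12, 8))    # 12 text tokens, 8-dim embeddings
fused = cross_attention_fuse(audio, text)
print(fused.shape)  # (50, 16)
```

In practice the queries, keys, and values would pass through learned projections (e.g. multi-head attention in PyTorch), but the shape flow above captures the idea of conditioning each audio frame on the canonical text.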
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ruitao Li and Xiaochen Lai "W2V-ATT: research on text-dependent MDD method based on wav2vec2.0", Proc. SPIE 12258, International Conference on Neural Networks, Information, and Communication Engineering (NNICE 2022), 1225804 (15 July 2022); https://doi.org/10.1117/12.2639245
KEYWORDS
Data modeling, Computer programming, Feature extraction, Performance modeling, Network architectures, Speech recognition, Neural networks