Paper
1 November 2016
An exploratory study on the driving method of speech synthesis based on the human eye reading imaging data
Proceedings Volume 10157, Infrared Technology and Applications, and Robot Sensing and Advanced Control; 101573O (2016) https://doi.org/10.1117/12.2248205
Event: International Symposium on Optoelectronic Technology and Application 2016, 2016, Beijing, China
Abstract
With the development of information technology and artificial intelligence, speech synthesis plays a significant role in human-computer interaction. However, the main problem of current speech synthesis techniques is a lack of naturalness and expressiveness, so synthesized speech is not yet close to the standard of natural language. Another problem is that human-computer interaction based on speech synthesis is too monotonous to support a mechanism driven by the user's own subjective intent. This paper introduces the historical development of speech synthesis and summarizes the general process of the technique. It points out that the prosody generation module is an important part of the speech synthesis process. Building on this, it introduces the use of eye activity patterns during reading to control and drive prosody generation, as a new human-computer interaction method that enriches the forms of synthesis. The present state of speech synthesis technology is also reviewed in detail. On the premise that eye gaze data can be extracted, and using the eye movement signal as a real-time drive, a speech synthesis method that can express the real speech rhythm of the speaker is proposed. That is, while the reader silently reads a corpus, reading information such as the eye gaze duration per prosodic unit is captured, and a hierarchical prosodic duration model is established to determine the duration parameters of the synthesized speech. Finally, an analysis verifies the feasibility of the method.
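The abstract does not specify how gaze durations are mapped to speech durations, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes fixations arrive as hypothetical `(unit_index, duration_ms)` pairs, one prosodic unit per index, and it substitutes a simple proportional allocation for the hierarchical prosodic duration model the authors describe. The function names and the `total_speech_ms` parameter are illustrative and do not come from the paper.

```python
from collections import defaultdict


def gaze_time_per_unit(fixations):
    """Sum fixation durations (ms) for each prosodic unit index.

    `fixations` is an assumed input format: an iterable of
    (unit_index, duration_ms) pairs. A unit may be fixated more than once.
    """
    totals = defaultdict(float)
    for unit_index, duration_ms in fixations:
        totals[unit_index] += duration_ms
    return dict(totals)


def speech_durations(fixations, n_units, total_speech_ms):
    """Split total_speech_ms across units in proportion to gaze time.

    This proportional rule is an illustrative stand-in, not the
    hierarchical duration model from the paper.
    """
    totals = gaze_time_per_unit(fixations)
    gaze = [totals.get(i, 0.0) for i in range(n_units)]
    gaze_sum = sum(gaze)
    if gaze_sum == 0:
        # No gaze data at all: fall back to equal durations.
        return [total_speech_ms / n_units] * n_units
    return [total_speech_ms * g / gaze_sum for g in gaze]


if __name__ == "__main__":
    # Three prosodic units; unit 0 is fixated twice.
    sample = [(0, 200), (1, 100), (0, 100), (2, 100)]
    print(speech_durations(sample, n_units=3, total_speech_ms=1000))
```

Note that under this rule a unit the reader skips entirely receives zero duration, which a practical system would need to handle, for example with a per-unit minimum.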
© (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Pei-pei Gao and Feng Liu "An exploratory study on the driving method of speech synthesis based on the human eye reading imaging data", Proc. SPIE 10157, Infrared Technology and Applications, and Robot Sensing and Advanced Control, 101573O (1 November 2016); https://doi.org/10.1117/12.2248205