We propose an articulation awareness system using a three-dimensional (3D) tongue model in virtual reality (VR). Human speech sounds are produced through a combination of vocal fold vibration (the voice source) and articulation, i.e., the movement of the jaw, tongue, and lips. Speakers should be aware of the importance of these movements. In this study, the tongue shape is visualized to raise awareness of the speech organs. Subjects observed the inside and outside of the mouth as if they themselves were the size of a thumb. Models of the oral region were created from magnetic resonance imaging data collected during vowel production. Subjects reported that they became aware of the articulators after experiencing the 3D tongue in the VR system.
Magnetic resonance imaging (MRI) was used to analyze speech articulation. A synchronized sampling method, in which a subject repeats the same pseudo-word at a fixed tempo cued by click sounds, was used to capture articulator motion despite the relatively slow imaging. For the Japanese syllable sequences /a-e-i-u-e-o-a-o/, /ka-ke-ki-ku-ke-ko-ka-ko/, and /ga-ge-gi-gu-ge-go-ga-go/, the positions and motions of the speech articulators were compared to analyze detailed differences in articulation depending on the consonant and vowel patterns. To analyze the time series of vocal tract constriction positions, caused mainly by tongue motion, a kymograph method based on MRI images is proposed. A thin slice is taken from each MRI image perpendicular to the vocal tract; this slice contains the upper and lower boundaries of the vocal tract, that is, the hard or soft palate and the upper surface of the tongue. The same slice is taken from every frame, and the slices are arranged in temporal order. The resulting image resembles a kymograph and is a useful tool for observing how the width of one section of the vocal tract changes over time due to tongue motion.
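The kymograph construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name, the fixed-column slice position, and the toy frame data are all assumptions made for the example.

```python
import numpy as np

def build_kymograph(frames: np.ndarray, col: int, width: int = 3) -> np.ndarray:
    """Cut the same thin slice from every MRI frame and stack the slices in time.

    frames : (T, H, W) grayscale frame sequence (T frames of H x W pixels)
    col    : column index where the slice crosses the vocal tract
    width  : slice thickness in pixels (averaged down to one column)

    Returns an (H, T) image whose rows are spatial positions along the slice
    and whose columns are successive frames -- the kymograph.
    """
    strip = frames[:, :, col:col + width]   # (T, H, width) thin slice per frame
    return strip.mean(axis=2).T             # average over thickness, then (H, T)

# Toy demo: a bright "tongue surface" boundary that rises one row per frame,
# mimicking a narrowing vocal tract.
T, H, W = 8, 16, 16
frames = np.zeros((T, H, W))
for t in range(T):
    frames[t, H - 2 - t, :] = 1.0

kymo = build_kymograph(frames, col=8)
print(kymo.shape)                           # (16, 8)
```

In the resulting image, the bright boundary traces a diagonal line, so the distance between the palate and tongue boundaries at that slice can be read off directly as a function of time.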
We propose a speech motion feedback system for improving articulation by displaying a 3D-CG jaw and an abstracted lip model. The subject's jaw motion is captured by a 3D position-and-rotation sensor, and lip motion is measured by four infrared 3D position sensors. Subjects observe their own face on an LCD screen, and the 3D-CG jaw and abstracted lip motion on a semi-transparent screen. Subjects reported that they noticed the importance of the motion of the speech organs after the experiment.