Computer-aided segmentation and 3D analysis of in vivo MRI examinations of the human vocal tract during phonation
12 March 2008
We developed, tested, and evaluated a 3D segmentation and analysis system for in vivo MRI examinations of the human
vocal tract during phonation. For this purpose, six professionally trained speakers, aged 22–34 years, were examined using a standardized MRI protocol (1.5 T, T1-weighted FLASH, slice thickness 4 mm, 23 slices, acquisition time 21 s). The volunteers performed prolonged (≥21 s) emissions of sounds from the German phonemic inventory. Simultaneous audio recording was obtained to verify correct utterance. Scans were acquired in each of the axial, coronal, and sagittal planes. Computer-aided
quantitative 3D evaluation included (i) automated registration of the phoneme-specific data acquired in different slice
orientations, (ii) semi-automated segmentation of oropharyngeal structures, (iii) computation of a curvilinear vocal tract
midline in 3D by nonlinear PCA, (iv) computation of cross-sectional areas of the vocal tract perpendicular to this
midline. For the vowels /a/,/e/,/i/,/o/,/ø/,/u/,/y/, the extracted area functions were used to synthesize phoneme sounds
based on an articulatory-acoustic model. For quantitative analysis, recorded and synthesized phonemes were compared,
where area functions extracted from 2D midsagittal slices were used as a reference. All vowels could be identified
correctly based on the synthesized phoneme sounds. The comparison between synthesized and recorded vowel
phonemes revealed that the quality of phoneme sound synthesis improved for the phonemes /a/ and /y/ when 3D rather than 2D data were used, as measured by the average relative frequency shift between recorded and synthesized vowel formants (p<0.05, one-sided Wilcoxon rank-sum test). In summary, the combination of fast MRI with subsequent
3D segmentation and analysis is a novel approach to examine human phonation in vivo. It unveils functional anatomical
findings that may be essential for realistic modelling of the human vocal tract during speech production.
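The abstract does not detail the nonlinear-PCA midline computation of step (iii). As a rough illustration of the general idea of fitting a curvilinear midline to a segmented 3D point cloud, the sketch below uses a simple principal-curve-style approach: project the voxel coordinates onto the first principal axis, bin them along it, and average each bin. This is a common initialization for principal-curve / nonlinear PCA fitting, not the authors' actual algorithm; the function name and parameters are hypothetical.

```python
import numpy as np

def curvilinear_midline(points, n_bins=20):
    """Crude curvilinear midline through a 3D point cloud (shape (N, 3)),
    e.g. segmented vocal-tract voxel coordinates. Projects points onto the
    first principal axis, bins them along it, and averages each bin.
    Illustrative only -- not the nonlinear-PCA method used in the paper."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    # First principal axis via SVD of the centered cloud
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    t = (points - center) @ vt[0]                  # 1D coordinate along the axis
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    idx = np.clip(np.digitize(t, edges) - 1, 0, n_bins - 1)
    # Midline = mean position of the points in each nonempty bin
    return np.array([points[idx == b].mean(axis=0)
                     for b in range(n_bins) if np.any(idx == b)])
```

Cross-sectional areas (step iv) would then be measured in planes orthogonal to the local tangent of such a midline.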
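The evaluation metric, an average relative frequency shift between recorded and synthesized vowel formants, can be written down directly. The following is a minimal sketch of such a metric, assuming formant frequencies (e.g. F1–F3) have already been measured for both signals; the function name and the example formant values are hypothetical.

```python
import numpy as np

def mean_relative_formant_shift(recorded, synthesized):
    """Average relative frequency shift between corresponding formants.
    `recorded` and `synthesized` are sequences of formant frequencies (Hz)
    for the same vowel, e.g. [F1, F2, F3]."""
    recorded = np.asarray(recorded, dtype=float)
    synthesized = np.asarray(synthesized, dtype=float)
    return float(np.mean(np.abs(synthesized - recorded) / recorded))

# Hypothetical formant values (Hz) for one vowel: [F1, F2, F3]
rec = [700.0, 1200.0, 2600.0]
syn = [680.0, 1150.0, 2700.0]
shift = mean_relative_formant_shift(rec, syn)
```

A lower shift indicates that the synthesized vowel's formants lie closer to the recorded ones, which is how the 2D and 3D area functions were compared in the study.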
Axel Wismüller, Johannes Behrends, Phil Hoole, Gerda L. Leinsinger, Anke Meyer-Baese, Maximilian F. Reiser, "Computer-aided segmentation and 3D analysis of in vivo MRI examinations of the human vocal tract during phonation," Proc. SPIE 6916, Medical Imaging 2008: Physiology, Function, and Structure from Medical Images, 69160T (12 March 2008); https://doi.org/10.1117/12.770836