Paper
Alternative linear predictive analysis techniques with applications to speaker identification
25 October 1994
Ravi P. Ramachandran, M. S. Zilovic, Richard J. Mammone
Abstract
In this paper, various linear predictive (LP) analysis methods are studied and compared in terms of robustness to noise and application to speaker identification. The key to the success of LP techniques lies in separating the vocal tract information from the pitch information present in a speech signal, even under noisy conditions. In addition to the conventional one-shot weighted least-squares methods, we propose three other approaches motivated by this point. The first is an iterative approach that leads to the weighted least absolute value solution. The second extends the one-shot least-squares approach by iteratively updating the weights; the update is a function of the residual and is based on minimizing a Mahalanobis distance. The third is the weighted total least-squares formulation. The deviations in the LP parameters were studied when noise (white Gaussian and impulsive) was added to the speech, and the most robust method was found to depend on the type of noise. A closed-set speaker identification experiment with 20 speakers was conducted using a vector quantizer classifier trained on clean speech. For a modest codebook size of 32, all of the approaches are comparable when the test speech is clean or degraded by white Gaussian noise. When the test speech is degraded by impulse noise, the proposed approach based on minimizing a Mahalanobis distance, which was found to be the most robust, is also the best for speaker identification.
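As an illustration of the first idea in the abstract (iterating weighted least-squares solves until they approach a least absolute value solution), here is a minimal sketch in plain Python. It is not the authors' algorithm: it uses generic iteratively reweighted least squares with weights 1/max(|e|, eps), which is an assumption, and the names `solve_linear`, `weighted_lp`, `lav_lp`, the AR(2) coefficients, and the pulse-train test signal are all hypothetical choices made for the example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / M[i][i]
    return x


def weighted_lp(s, order, w):
    """Weighted least-squares LP coefficients from the normal equations.

    Predicts s[n] from s[n-1], ..., s[n-order] for n = order, ..., len(s)-1;
    w holds one weight per predicted sample.
    """
    A = [[0.0] * order for _ in range(order)]
    b = [0.0] * order
    for n, wn in zip(range(order, len(s)), w):
        x = [s[n - k] for k in range(1, order + 1)]
        for i in range(order):
            b[i] += wn * x[i] * s[n]
            for j in range(order):
                A[i][j] += wn * x[i] * x[j]
    return solve_linear(A, b)


def residual(s, order, a):
    """Prediction error e[n] = s[n] - sum_k a[k] * s[n-1-k]."""
    return [s[n] - sum(a[k] * s[n - 1 - k] for k in range(order))
            for n in range(order, len(s))]


def lav_lp(s, order, iters=20, eps=1e-6):
    """Approximate least absolute value LP solution by reweighting.

    Assumption: weights are 1 / max(|e[n]|, eps), a common generic choice.
    """
    a = weighted_lp(s, order, [1.0] * (len(s) - order))  # one-shot LS start
    for _ in range(iters):
        w = [1.0 / max(abs(e), eps) for e in residual(s, order, a)]
        a = weighted_lp(s, order, w)
    return a


# Hypothetical test signal: an AR(2) "vocal tract" driven by a pulse train
# that stands in for pitch excitation (one unit pulse every 20 samples).
true_a = [1.5, -0.7]
s = []
for n in range(200):
    excitation = 1.0 if n % 20 == 0 else 0.0
    past1 = s[n - 1] if n >= 1 else 0.0
    past2 = s[n - 2] if n >= 2 else 0.0
    s.append(true_a[0] * past1 + true_a[1] * past2 + excitation)

a_ls = weighted_lp(s, 2, [1.0] * (len(s) - 2))  # one-shot, unit weights
a_lav = lav_lp(s, 2)                            # iteratively reweighted
print("one-shot LS:", a_ls)
print("reweighted :", a_lav)
```

On this synthetic signal the one-shot estimate is pulled slightly away from the generating coefficients by the samples where the pulses occur, while the reweighted estimate down-weights those samples and lands much closer. Real speech, noise, and the paper's specific weighting schemes would behave differently.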
© (1994) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ravi P. Ramachandran, M. S. Zilovic, and Richard J. Mammone "Alternative linear predictive analysis techniques with applications to speaker identification", Proc. SPIE 2277, Automatic Systems for the Identification and Inspection of Humans, (25 October 1994); https://doi.org/10.1117/12.191870
KEYWORDS
Signal to noise ratio
Error analysis
Mahalanobis distance
Statistical analysis
Chemical elements
Smoothing
Speaker recognition