Paper
29 April 2013 Defining properties of speech spectrogram images to allow effective pre-processing prior to pattern recognition
Abstract
The speech signal of a word is a combination of frequencies that can produce specific transition frequency shapes. These shapes can be regarded as text written in some unknown ‘script’. Before attempting to read the speech spectrogram image using image processing techniques, we first need to define the properties of the image, reduce its clutter, and select the methods to be used for image matching. Thus the speech signal is first converted to a spectrogram image, and noise in the signal is then reduced by capturing the energy associated with the formants of the speech signal. The size of the image and its resolution along both the frequency and time axes are then normalised. Finally, template matching methods are employed to recognise portions of text and isolated words. The paper describes the pre-processing methods employed and outlines the use of normalised grey-level correlation for the recognition of words.
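As an illustration of the final matching step, the following is a minimal sketch of normalised grey-level correlation applied to a word template slid along the time axis of a spectrogram image. It is not the authors' implementation, which the abstract does not give. The function names `ncc` and `best_match`, the list-of-rows image representation, and the assumption that image and template already share the same number of frequency rows (that is, size normalisation has been done) are all hypothetical choices made for this example.

```python
import math


def ncc(patch, template):
    """Zero-mean normalised correlation of two equal-sized grey-level images.

    Both arguments are lists of rows (frequency bins) of pixel values.
    The result lies in [-1, 1]; 1 means identical up to gain and offset.
    """
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A flat image has no structure to correlate; treat it as no match.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template along the time axis (columns) of the image.

    Assumes image and template have the same number of frequency rows.
    Returns (time_offset, score) for the highest-scoring position.
    """
    if len(image) != len(template):
        raise ValueError("image and template need the same number of rows")
    width = len(template[0])
    best_offset, best_score = 0, -1.0
    for t in range(len(image[0]) - width + 1):
        patch = [row[t:t + width] for row in image]
        score = ncc(patch, template)
        if score > best_score:
            best_offset, best_score = t, score
    return best_offset, best_score


# Toy example: the template appears at time offset 2 with a different
# gain and brightness offset (pixel = 2 * template + 3).
template = [[0, 1, 0],
            [1, 2, 1]]
image = [[3, 3, 3, 5, 3, 3],
         [3, 3, 5, 7, 5, 3]]
offset, score = best_match(image, template)
```

Because the mean is subtracted and the result is divided by the energy of both images, the score is insensitive to overall brightness and contrast changes. This is one reason normalised correlation is commonly chosen for template matching on grey-level images.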
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mohammed Al-Darkazali, Rupert Young, Chris Chatwin, and Philip Birch "Defining properties of speech spectrogram images to allow effective pre-processing prior to pattern recognition", Proc. SPIE 8748, Optical Pattern Recognition XXIV, 87480G (29 April 2013); https://doi.org/10.1117/12.2014511
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Fourier transforms

Image segmentation

Neodymium

Image processing

Analytical research

Pattern recognition

Image resolution

