2 October 1998 Content-based classification and retrieval of audio
Abstract
An on-line audio classification and segmentation system is presented in this research, in which audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of the feature extraction, classification, and segmentation procedures, and experimental results are described. It is shown that, with the proposed system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and of a query-by-example audio retrieval system built on top of the coarse classification are also introduced.
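As a rough illustration of two of the features named in the abstract, the sketch below computes a short-time energy value and a zero-crossing rate per frame and applies a single energy threshold to flag silence. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract gives no formulas, so the mean-square energy definition, the frame length, the hop size, and the `silence_energy` threshold are all hypothetical choices, and the fundamental-frequency feature is not covered.

```python
import math


def frames(signal, frame_len, hop):
    """Yield successive frames of frame_len samples, advancing by hop samples."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield signal[start:start + frame_len]


def short_time_energy(frame):
    """Mean squared amplitude of one frame (assumed energy definition)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def label_frame(frame, silence_energy=1e-4):
    """Hypothetical one-threshold rule: low-energy frames count as silence."""
    if short_time_energy(frame) < silence_energy:
        return "silence"
    return "non-silence"


# Synthetic test signal: 0.1 s of a 440 Hz tone followed by 0.1 s of silence.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]
quiet = [0.0] * 800
signal = tone + quiet

# Non-overlapping 50 ms frames (400 samples each at 8 kHz).
labels = [label_frame(f) for f in frames(signal, 400, 400)]
print(labels)
```

On this synthetic signal the two tone frames are labeled "non-silence" and the two zero-valued frames "silence". A fuller heuristic would combine several thresholds over the energy, zero-crossing, and fundamental-frequency curves, which the abstract mentions but does not specify.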
© (1998) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tong Zhang and C.-C. Jay Kuo "Content-based classification and retrieval of audio", Proc. SPIE 3461, Advanced Signal Processing Algorithms, Architectures, and Implementations VIII, (2 October 1998); https://doi.org/10.1117/12.325703
CITATIONS
Cited by 54 scholarly publications and 6 patents.
KEYWORDS
Feature extraction
Classification systems
Databases
Analytical research
Statistical analysis
Video
Signal detection