6 October 1997 Keyword spotting for multimedia document indexing
Philippe Gelin, Christian J. Wellekens
Proceedings Volume 3229, Multimedia Storage and Archiving Systems II; (1997) https://doi.org/10.1117/12.290357
Event: Voice, Video, and Data Communications, 1997, Dallas, TX, United States
Abstract
We tackle the problem of multimedia indexing using keyword spotting on the spoken part of the data. Word spotting systems for indexing have to meet very hard specifications: short response times to queries, speaker-independent operation, and an open vocabulary so that any keyword can be tracked. To meet these constraints, keyword models should be built according to their phonetic spelling, and the process should be divided into two parts: preprocessing of the speech signal and query over a lattice of hypotheses. Different classification criteria have been studied for hypothesis generation: frame labeling, maximum likelihood, and maximum a posteriori (MAP). The hypothesis probability is computed either through a standard Gaussian model or through a hybrid Hidden Markov Model-Neural Network. The training of the phonemic models is based either on Viterbi alignment or on recursive estimation and maximization of a posteriori probabilities. In the latter, discriminant properties between phonemes are enforced. Tests have been conducted on the TIMIT database as well as on TV news soundtracks. Interesting results have been obtained in time saving for the documentalist. The ultimate goal is to couple the soundtrack indexing with tools for video indexing in order to enhance the robustness of the system.
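The difference between the maximum likelihood and MAP criteria named in the abstract can be illustrated with a small frame-labeling sketch. This is not the authors' system: it assumes toy one-dimensional features, one Gaussian per phoneme, and hand-picked priors, and the names `gaussian_log_likelihood`, `label_frames`, `models`, and `priors` are hypothetical, introduced only for illustration.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def label_frames(frames, models, priors=None):
    """Assign each frame the best-scoring phoneme label.

    models: dict mapping phoneme -> (mean, variance)   (toy assumption)
    priors: dict mapping phoneme -> prior probability.
            None   -> maximum likelihood: argmax p(x | phoneme)
            given  -> MAP: argmax p(x | phoneme) * P(phoneme)
    """
    labels = []
    for x in frames:
        best_label, best_score = None, -math.inf
        for phoneme, (mean, var) in models.items():
            score = gaussian_log_likelihood(x, mean, var)
            if priors is not None:
                # Adding the log prior turns the ML score into a MAP score
                # (the evidence p(x) is common to all phonemes and is dropped).
                score += math.log(priors[phoneme])
            if score > best_score:
                best_label, best_score = phoneme, score
        labels.append(best_label)
    return labels


# Toy data: two phoneme classes with unit variance; "a" is far more frequent.
models = {"a": (0.0, 1.0), "b": (2.0, 1.0)}
priors = {"a": 0.9, "b": 0.1}
frames = [-0.5, 1.2, 2.5]

ml_labels = label_frames(frames, models)           # likelihood only
map_labels = label_frames(frames, models, priors)  # likelihood times prior
print(ml_labels, map_labels)
```

In this toy setup the middle frame lies slightly closer to the mean of "b", so the likelihood-only criterion picks "b", while the strong prior on "a" makes the MAP criterion pick "a". A real system would score multi-dimensional acoustic vectors and combine frame scores along phoneme paths rather than labeling frames in isolation.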
© (1997) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Philippe Gelin and Christian J. Wellekens "Keyword spotting for multimedia document indexing", Proc. SPIE 3229, Multimedia Storage and Archiving Systems II, (6 October 1997); https://doi.org/10.1117/12.290357
KEYWORDS
Acoustics
Neural networks
Databases
Multimedia
Signal processing
Video
Precision measurement