Video classification using speaker identification
15 January 1997
Nilesh V. Patel, Ishwar K. Sethi
Abstract
Video content characterization is a challenging problem in video databases. The aim of such characterization is to generate indices that describe a video clip in terms of the objects in it and their actions. Generally, such indices are extracted by performing image analysis on the video clips. Many such indices can also be generated by analyzing the audio embedded in the clips. Indices pertaining to context, scene emotion, and the actors or characters present in a clip appear especially suitable for generation via audio analysis techniques such as keyword spotting, speech recognition, and speaker recognition. In this paper, we examine the potential of speaker identification techniques for characterizing video clips in terms of the actors present in them. We describe a three-stage processing system, consisting of a shot boundary detection stage, an audio classification stage, and a speaker identification stage, that determines the presence of different actors in isolated shots. Experimental results using the movie A Few Good Men are presented to show the efficacy of speaker identification for labeling video clips in terms of the persons present in them.
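The three-stage flow described in the abstract can be sketched as follows. This is only an illustration of the control flow, not the authors' implementation: the abstract does not say how any stage works, so each stage is passed in as a callable. The names `ShotLabel` and `label_shots`, the `"speech"`/`"other"` labels, the rule that only speech segments reach the speaker identification stage, and the one-audio-value-per-frame alignment are all assumptions made for the example. The `toy_*` functions are trivial stand-ins that exist only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class ShotLabel:
    start: int               # first frame of the shot (inclusive)
    end: int                 # one past the last frame of the shot
    audio_class: str         # e.g. "speech" or "other" (labels are assumed)
    speaker: Optional[str]   # None when speaker identification was not run


def label_shots(
    frames: Sequence[float],
    audio: Sequence[float],
    detect_shots: Callable[[Sequence[float]], List[Tuple[int, int]]],
    classify_audio: Callable[[Sequence[float]], str],
    identify_speaker: Callable[[Sequence[float]], str],
) -> List[ShotLabel]:
    """Run the three stages in order; each stage is a pluggable callable."""
    labels = []
    for start, end in detect_shots(frames):        # stage 1: shot boundaries
        segment = audio[start:end]                 # audio aligned to the shot
        audio_class = classify_audio(segment)      # stage 2: audio class
        # Assumption: only segments classified as speech go to stage 3.
        speaker = identify_speaker(segment) if audio_class == "speech" else None
        labels.append(ShotLabel(start, end, audio_class, speaker))
    return labels


# Toy stand-ins, for illustration only.
def toy_detect_shots(frames, threshold=0.5):
    """Cut wherever consecutive frame values jump by more than threshold."""
    if not frames:
        return []
    cuts = [i for i in range(1, len(frames))
            if abs(frames[i] - frames[i - 1]) > threshold]
    bounds = [0] + cuts + [len(frames)]
    return list(zip(bounds[:-1], bounds[1:]))


def toy_classify_audio(segment):
    """Call a segment 'speech' when its mean value is positive."""
    return "speech" if sum(segment) / len(segment) > 0 else "other"


def toy_identify_speaker(segment):
    """Pick the hypothetical speaker whose reference value is nearest the mean."""
    references = {"speaker_a": 1.0, "speaker_b": 3.0}
    mean = sum(segment) / len(segment)
    return min(references, key=lambda name: abs(references[name] - mean))


frames = [0.0, 0.1, 0.0, 2.0, 2.1, 5.0, 5.1]   # one made-up value per frame
audio = [1.0, 1.2, 0.9, 0.0, 0.0, 3.1, 2.9]    # one made-up value per frame
result = label_shots(frames, audio, toy_detect_shots,
                     toy_classify_audio, toy_identify_speaker)
for shot in result:
    print(shot)
```

Passing the stages in as functions keeps the sketch neutral about the actual detection and recognition methods, which the abstract leaves unspecified.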
© (1997) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Nilesh V. Patel and Ishwar K. Sethi "Video classification using speaker identification", Proc. SPIE 3022, Storage and Retrieval for Image and Video Databases V, (15 January 1997); https://doi.org/10.1117/12.263411
KEYWORDS
Video
Speaker recognition
Databases
Autoregressive models
Classification systems
Feature extraction
Image analysis