An efficient video shot representation for fast video retrieval
24 June 2005
Cheng Cai, Kin-Man Lam, Zheng Tan
Proceedings Volume 5960, Visual Communications and Image Processing 2005; 59600P (2005) https://doi.org/10.1117/12.631564
Event: Visual Communications and Image Processing 2005, 2005, Beijing, China
Abstract
For video retrieval, a video is partitioned into a group of shots, each of which is then represented either by key frames or by a video shot representation. An optimal representation of a shot should include all the information about the frames concerned. In this paper, we propose an efficient representation scheme for a shot that considers both the spatial-frequency content and the temporal statistics of its frames. In our scheme, each frame in a video shot is transformed into the frequency domain using the discrete cosine transform (DCT), and a number of values at each frequency are selected based on their probability of occurrence. This representation scheme allows retrieval to be carried out hierarchically, i.e., from low-frequency to high-frequency components. Experimental results show that our proposed scheme outperforms the alpha-trimmed average histogram method in terms of retrieval accuracy.
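The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes grayscale frames of equal size, keeps a small block of low-frequency DCT coefficients per frame, and at each frequency retains the centers of the most populated histogram bins across the shot as the "most probable" values. The names `shot_representation` and `hierarchical_distance`, the parameters `n_freq`, `n_bins`, and `top_k`, the diagonal frequency ordering, and the L1 distance with early termination are all illustrative assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

def dct2(frame):
    """2-D DCT of a grayscale frame (separable: rows, then columns)."""
    rows, cols = frame.shape
    return dct_matrix(rows) @ frame @ dct_matrix(cols).T

def frequency_order(n_freq):
    """Frequency indices sorted from low to high (by u + v); an assumed ordering."""
    pairs = [(u, v) for u in range(n_freq) for v in range(n_freq)]
    return sorted(pairs, key=lambda p: (p[0] + p[1], p[0]))

def shot_representation(frames, n_freq=4, n_bins=8, top_k=2):
    """For each low frequency, keep the top_k most frequent coefficient values
    (histogram bin centers) observed over all frames of the shot."""
    coeffs = np.stack(
        [dct2(np.asarray(f, dtype=float))[:n_freq, :n_freq] for f in frames]
    )
    rep = []
    for u, v in frequency_order(n_freq):
        counts, edges = np.histogram(coeffs[:, u, v], bins=n_bins)
        centers = 0.5 * (edges[:-1] + edges[1:])
        best = np.argsort(counts)[::-1][:top_k]  # most populated bins first
        rep.append(np.sort(centers[best]))
    return np.array(rep)  # shape: (n_freq * n_freq, top_k)

def hierarchical_distance(rep_a, rep_b, threshold=np.inf):
    """Accumulate L1 distance from low to high frequency and stop early
    once the running total exceeds the threshold (assumed pruning rule)."""
    total = 0.0
    for a, b in zip(rep_a, rep_b):
        total += float(np.abs(a - b).sum())
        if total > threshold:
            break
    return total

# Usage with synthetic data: two shots of ten 8x8 frames each.
rng = np.random.default_rng(0)
shot_a = rng.uniform(0, 255, size=(10, 8, 8))
shot_b = shot_a + 50.0  # brighter copy, so the DC coefficient differs
rep_a = shot_representation(shot_a)
rep_b = shot_representation(shot_b)
```

Because the comparison starts at the lowest frequencies, a candidate shot whose coarse content already differs can be rejected after examining only a few coefficients, which is the kind of coarse-to-fine search the hierarchical description suggests.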
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Cheng Cai, Kin-Man Lam, and Zheng Tan "An efficient video shot representation for fast video retrieval", Proc. SPIE 5960, Visual Communications and Image Processing 2005, 59600P (24 June 2005); https://doi.org/10.1117/12.631564
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Video
Distance measurement
Distributed interactive simulations
Spatial frequencies
Information visualization
Visualization
Electronics engineering