Multimedia Event Detection (MED) is a multimedia retrieval task whose goal is to find videos of a particular event in a large-scale Internet video archive, given example videos and text descriptions. In this paper, we focus mainly on an 'ad-hoc' MED scenario in which no example videos are used. We aim to retrieve test videos based on their visual semantics, using a Visual Concept Signature (VCS) that is generated for each event solely from the event description provided as the query. Visual semantics are described using the Semantic INdexing (SIN) feature, which represents the likelihood of predefined visual concepts in a video. To generate a VCS for an event, we project the given event description onto a visual concept list using the proposed textual semantic similarity. By exploring the properties of the SIN feature, we harmonize the generated visual concept signature with the SIN feature to improve retrieval performance. We conduct experiments to assess the quality of the generated visual concept signatures with respect to human expectation, and to evaluate them in the context of the MED task, where test videos are retrieved by their SIN features when no or only very few training videos are available.
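The retrieval idea described above can be illustrated with a minimal sketch, under stated assumptions. The abstract does not specify the proposed textual semantic similarity, so `toy_similarity` below is only a word-overlap placeholder for it. The concept names, SIN scores, video identifiers, and the helpers `build_vcs` and `rank_videos` are all hypothetical, and the step that harmonizes the signature with the SIN feature is not modeled.

```python
import re

def tokens(text):
    """Lowercase word set; a crude stand-in for real text processing."""
    return set(re.findall(r"[a-z]+", text.lower()))

def toy_similarity(description, concept):
    """Placeholder similarity: fraction of the concept's words found in the description.
    This is NOT the paper's proposed textual semantic similarity."""
    concept_words = tokens(concept)
    if not concept_words:
        return 0.0
    return len(concept_words & tokens(description)) / len(concept_words)

def build_vcs(description, concepts, top_k=3):
    """Project an event description onto a concept list (hypothetical helper).
    Returns a sparse dict of concept weights that sum to 1."""
    scored = [(c, toy_similarity(description, c)) for c in concepts]
    scored = [(c, s) for c, s in scored if s > 0]
    scored.sort(key=lambda cs: cs[1], reverse=True)
    kept = scored[:top_k]
    total = sum(s for _, s in kept)
    return {c: s / total for c, s in kept} if total else {}

def rank_videos(vcs, sin_features):
    """Rank videos by the dot product of the signature with each SIN vector."""
    scores = {
        vid: sum(w * sin.get(c, 0.0) for c, w in vcs.items())
        for vid, sin in sin_features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical concept vocabulary, event query, and per-video concept likelihoods.
concepts = ["dog", "birthday cake", "car", "people singing"]
description = "A birthday party where people are singing around a cake"
sin_features = {
    "video_a": {"dog": 0.9, "car": 0.1},
    "video_b": {"birthday cake": 0.8, "people singing": 0.7},
    "video_c": {"birthday cake": 0.2, "car": 0.6},
}

vcs = build_vcs(description, concepts)
ranking = rank_videos(vcs, sin_features)
print(vcs)
print(ranking)
```

No example video is needed at query time in this sketch: the signature is built from the text alone, and each test video is scored only through its concept likelihoods, which mirrors the ad-hoc setting the abstract describes.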