JOURNAL ARTICLE

Enhanced ontologies for video annotation and retrieval

Abstract

A typical approach to video annotation is to classify video elements (e.g., events and objects) according to a predefined ontology of the video content domain. Ontologies are defined by establishing relationships between linguistic terms that specify domain concepts at different levels of abstraction. However, although linguistic terms are appropriate for distinguishing event and object categories, they are inadequate for describing specific or complex patterns of events or video entities. In these cases, pattern specifications are better expressed using visual prototypes, either images or video clips, that capture the essence of the event or entity. Enhanced ontologies that include both visual and linguistic concepts can therefore support video annotation up to the level of detail of pattern specification.

This paper presents algorithms and techniques that employ enriched ontologies for video annotation and retrieval, and discusses their implementation for the soccer video domain. An unsupervised clustering method is proposed to create pictorially enriched ontologies: it defines visual prototypes that represent specific patterns of highlights and adds them to the ontology as visual concepts.

Two algorithms that use pictorially enriched ontologies to perform automatic soccer video annotation are proposed, and results for typical highlights are presented. Annotation is performed by associating occurrences of events or entities with higher-level concepts: their similarity to visual concepts, which are hierarchically linked to higher-level semantics, is checked using a dynamic programming approach.

Reasoning on the ontology is also demonstrated, both to perform higher-level annotation of the clips using domain knowledge and to create complex queries that comprise visual prototypes of actions, their temporal evolution, and their relations.
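The abstract does not spell out the annotation algorithms, so the following is only a minimal sketch of the general idea it describes: a clip is compared with the visual prototypes attached to each higher-level concept using a dynamic-programming sequence distance, and it inherits the concept of the nearest prototype. Everything named here is an assumption made for illustration, not the paper's method: the one-dimensional frame descriptors, the L1 frame cost, the edit-distance-style alignment with a fixed gap penalty, and the concept labels `shot_on_goal` and `corner_kick`.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Tuple[float, ...]  # hypothetical per-frame feature vector


def frame_cost(a: Frame, b: Frame) -> float:
    """L1 distance between two frame descriptors (an illustrative choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sequence_distance(clip: Sequence[Frame], proto: Sequence[Frame],
                      gap: float = 1.0) -> float:
    """Edit-distance-style alignment cost, computed by dynamic programming."""
    n, m = len(clip), len(proto)
    # d[i][j] = cheapest alignment of the first i clip frames
    # with the first j prototype frames.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + gap,  # skip a clip frame
                d[i][j - 1] + gap,  # skip a prototype frame
                d[i - 1][j - 1] + frame_cost(clip[i - 1], proto[j - 1]),
            )
    return d[n][m]


def annotate(clip: Sequence[Frame],
             prototypes: Dict[str, List[Sequence[Frame]]]) -> Tuple[str, float]:
    """Return the concept whose closest visual prototype is nearest to the clip."""
    best_concept, best_dist = "", float("inf")
    for concept, protos in prototypes.items():
        for proto in protos:
            dist = sequence_distance(clip, proto)
            if dist < best_dist:
                best_concept, best_dist = concept, dist
    return best_concept, best_dist


# Hypothetical prototypes, each linked to a higher-level concept.
PROTOTYPES: Dict[str, List[Sequence[Frame]]] = {
    "shot_on_goal": [[(0.0,), (1.0,), (2.0,)]],
    "corner_kick": [[(5.0,), (5.0,)]],
}

concept, distance = annotate([(0.0,), (1.0,), (2.1,)], PROTOTYPES)
```

In this sketch the prototypes are given by hand; in the setting the abstract describes they would instead come from the unsupervised clustering step, with each cluster contributing a representative clip as a visual concept.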

Keywords:
Computer science; Ontology; Annotation; Information retrieval; Semantics (computer science); Event (particle physics); Domain (mathematical analysis); Image retrieval; Object (grammar); Abstraction; Artificial intelligence; Natural language processing; Image (mathematics); Programming language

Metrics

Cited by: 24
FWCI (Field Weighted Citation Impact): 3.68
Refs: 21
Citation Normalized Percentile: 0.94

Topics

Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimedia Communication and Technology
Social Sciences →  Social Sciences →  Sociology and Political Science
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing