Abstract

The paper presents a tool for video structure analysis, feature extraction, classification, and semantic querying that scales to very large video datasets. The tool analyses video structure to detect shot boundaries, identifying the shots in each video using image-duplication techniques. A single frame from each shot is passed to a deep learning model, implemented in TensorFlow, that is trained to extract features and classify the objects in each frame. A textual annotation is then generated automatically for each video, and finally, with the aid of an ontology, semantic searching is performed using NLP, which yields efficient results without the manual annotation of a large-scale dataset. Achieving an accuracy of around 74% for querying over automatically analysed and annotated video content, the system is a useful tool for video tagging and annotation.
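The first stage of the pipeline above can be illustrated with a minimal sketch. The abstract does not specify the exact "image duplication" technique, so this example assumes a simple mean-absolute-difference metric between consecutive grayscale frames and an arbitrary threshold; the function names and the keyframe rule (first frame of each shot) are illustrative, not the authors' implementation.

```python
# Sketch of shot-boundary detection by frame similarity, followed by
# picking one representative frame per shot (as the abstract describes).
# Frames are modelled as 2D lists of grayscale pixel values.

def frame_difference(a, b):
    """Mean absolute pixel difference between two same-sized grayscale frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def detect_shot_boundaries(frames, threshold=30.0):
    """Return indices where a new shot begins.

    A boundary is declared when a frame differs sharply from its
    predecessor; the threshold value here is an assumption.
    """
    boundaries = [0]  # the first frame always starts a shot
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            boundaries.append(i)
    return boundaries

def keyframes(frames, boundaries):
    """Pick one representative frame per shot (here: the shot's first frame)."""
    return [frames[i] for i in boundaries]

# Synthetic clip: three near-identical dark frames, then two bright ones.
clip = [[[10] * 4 for _ in range(4)] for _ in range(3)] + \
       [[[200] * 4 for _ in range(4)] for _ in range(2)]
cuts = detect_shot_boundaries(clip)
print(cuts)                    # → [0, 3]
print(len(keyframes(clip, cuts)))  # → 2
```

Each keyframe returned by `keyframes` would then be handed to the classification model, so the cost of deep feature extraction scales with the number of shots rather than the number of frames.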

Keywords:
Computer science, Annotation, Artificial intelligence, Frame, Feature extraction, Shot, Feature, Semantic feature, Semantics, Set, Video content analysis, Video tracking, Information retrieval, Video processing

Metrics

Cited By: 13
FWCI (Field Weighted Citation Impact): 1.00
Refs: 13
Citation Normalized Percentile: 0.85

Citation History

Topics

Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition