JOURNAL ARTICLE

Real Time Object Detection Tracking Using YOLO and Deep Sort

C Hemashree, B. Pallavi, H N Pruthvi, M. S. Shivaganga, Santhosh Kumar S

Year: 2025 Journal:   International Journal of Advanced Research in Science Communication and Technology Pages: 105-114   Publisher: Shivkrupa Publication's

Abstract

This research presents a blind-aid system for real-time object detection with speech alerts that combines the YOLO algorithm with OpenCV's DNN (Deep Neural Network) module. The main objective is to improve the safety and independence of people with visual impairments through quick object recognition and audio feedback. The system records live video from a webcam, uses YOLO, which is tuned for real-time inference, to recognize objects in each frame, and then produces spoken notifications that describe what it has detected. Speech output can be delivered in the user's preferred language, which broadens usability and accessibility, and the system handles a variety of object classes, making it a useful tool for people who are blind or visually impaired. Challenges in real-time object detection include occlusion, scale variation, and cluttered environments, so researchers must navigate the trade-off between accuracy and speed. Real-time object detection is pivotal in computer vision, enabling intelligent systems across diverse applications, and detection systems designed for assistive technologies are becoming increasingly important in empowering individuals with visual impairments. By integrating YOLO with OpenCV's DNN module, the research delivers a responsive blind-aid solution that captures live video through a webcam and processes it immediately. YOLO's architecture, optimized for speed and accuracy, enables the system to detect and classify multiple objects within each video frame, even in dynamic or unpredictable environments.
Once an object is identified, the system generates immediate speech output, providing descriptive audio alerts in the user's preferred language and thereby improving accessibility and user experience. This multilingual capability makes the solution usable across different regions and cultures. The system also supports a wide range of object categories, such as vehicles, household items, and obstacles, helping visually impaired users navigate their daily surroundings more safely and independently. The implementation still faces challenges, including occlusion, varying object scales, low-light conditions, and cluttered backgrounds. Handling these factors requires a careful balance between detection accuracy and processing speed so that real-time performance is maintained. Despite these challenges, the integration of deep learning models with user-friendly audio interfaces shows significant potential to improve mobility, safety, and overall quality of life for people with visual impairments.
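
The detect-then-announce flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes detections have already been decoded from the network into `(label, confidence, (x, y, w, h))` tuples, and it stops at building the alert text rather than calling a speech engine. The translation table, message templates, thresholds, and the `AlertThrottle` helper are all hypothetical names introduced for illustration.

```python
import time

# Hypothetical label translations and templates (assumptions, not from the paper).
TRANSLATIONS = {
    "en": {"person": "person", "car": "car", "chair": "chair"},
    "es": {"person": "persona", "car": "coche", "chair": "silla"},
}
TEMPLATES = {"en": "{label} ahead", "es": "{label} adelante"}


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thresh=0.5, iou_thresh=0.4):
    """Drop low-confidence detections, then apply greedy per-class NMS."""
    candidates = sorted(
        (d for d in dets if d[1] >= conf_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Keep a box unless it overlaps a stronger box of the same class.
        if all(det[0] != k[0] or iou(det[2], k[2]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept


class AlertThrottle:
    """Suppress repeat announcements of a label within `cooldown` seconds."""

    def __init__(self, cooldown=3.0, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock
        self.last = {}

    def should_announce(self, label):
        now = self.clock()
        prev = self.last.get(label)
        if prev is not None and now - prev < self.cooldown:
            return False
        self.last[label] = now
        return True


def build_alerts(dets, lang, throttle):
    """Turn kept detections into alert strings in the requested language.

    Unknown languages fall back to English; unknown labels are spoken as-is.
    """
    names = TRANSLATIONS.get(lang, TRANSLATIONS["en"])
    template = TEMPLATES.get(lang, TEMPLATES["en"])
    alerts = []
    for label, _conf, _box in dets:
        if throttle.should_announce(label):
            alerts.append(template.format(label=names.get(label, label)))
    return alerts
```

The confidence and overlap thresholds are one place where the accuracy-versus-speed trade-off mentioned above shows up in practice: stricter thresholds mean fewer boxes to process and fewer spurious alerts, at the cost of missed objects. In a full system the strings returned by `build_alerts` would be handed to a text-to-speech engine, and the cooldown keeps the same object from being announced on every frame.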

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 12

Related Documents

JOURNAL ARTICLE

Object Tracking by Detection using YOLO and SORT

Heet Thakkar, Noopur Tambe, Sanjana Thamke, Vaishali K. Gaidhane

Journal:   International Journal of Scientific Research in Computer Science Engineering and Information Technology Year: 2020 Pages: 224-229

JOURNAL ARTICLE

Real Time Object Detection Using Yolo

I Ankith

Journal:   International Journal for Research in Applied Science and Engineering Technology Year: 2021 Vol: 9 (11) Pages: 1504-1511

JOURNAL ARTICLE

Real Time Object Detection using YOLO Algorithm

I.V.S.L Haritha, M. Harshini, Shruti Patil, Jeethu Philip

Journal:   2022 6th International Conference on Electronics, Communication and Aerospace Technology Year: 2022 Pages: 1465-1468