JOURNAL ARTICLE

CONVERSION OF TEXT IMAGE TO AUDIO FOR VISUALLY IMPAIRED PEOPLE USING CNN ALGORITHM

Abstract

The integration of text-to-speech (TTS) and optical character recognition (OCR) is transforming a range of applications, particularly for individuals with visual impairments. This framework lets users interact with computers through speech. Using OCR, a text-to-speech program scans images and converts text in more than 38 languages, along with numerical characters, into spoken words. The project comprises two modules: an image processing module and a speech processing module. Previous approaches to text detection, such as the edge-based method, the connected-component method, the texture-based method, and the mathematical morphology method, have been employed, but they are limited in accuracy when assessed, rated, and evaluated. Text captured in photographs of magazines, newspapers, and banners has become increasingly valuable for conveying shared information, enhancing accessibility, and driving innovation in areas such as employment, earnings, efficiency, and problem-solving. Advances in technology, particularly in machine learning algorithms and artificial intelligence, play a vital role in developing smart systems worldwide. Within this context, an automated strategy for text detection and recognition in natural scenes using OCR is gaining prominence. A suitable speech synthesizer is then employed to pronounce the extracted text. A prototype of the proposed system has been developed and its functionality confirmed through an experimental setup. These findings illustrate the system's underlying principles and demonstrate its potential for creating advanced assistive technology for visually impaired individuals.
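
The two-module structure described in the abstract (an image processing module that extracts text, followed by a speech processing module that pronounces it) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the names `TextImageReader`, `normalize_text`, `tesseract_ocr`, and `pyttsx3_speak` are hypothetical, and the choice of the third-party `pytesseract` and `pyttsx3` packages as backends is an assumption, since the abstract does not name the OCR engine, the CNN architecture, or the synthesizer it uses.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces: an OCR backend maps an image path to raw text,
# and a TTS backend takes text and speaks it.
OcrFn = Callable[[str], str]
TtsFn = Callable[[str], None]


def normalize_text(raw: str) -> str:
    """Clean OCR output before it is sent to the speech synthesizer."""
    # Rejoin words that were hyphenated across a line break.
    joined = re.sub(r"-\s*\n\s*", "", raw)
    # Collapse every run of whitespace (including newlines) into one space.
    return " ".join(joined.split())


@dataclass
class TextImageReader:
    """Chains an image processing step (OCR) with a speech processing step (TTS)."""
    ocr: OcrFn
    tts: TtsFn

    def read_aloud(self, image_path: str) -> str:
        text = normalize_text(self.ocr(image_path))
        if text:  # stay silent when no text was recognized
            self.tts(text)
        return text


def tesseract_ocr(image_path: str, lang: str = "eng") -> str:
    # Assumed backend: needs pytesseract, Pillow, and the Tesseract binary.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def pyttsx3_speak(text: str) -> None:
    # Assumed backend: needs the pyttsx3 package and a system speech engine.
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

With those assumed packages installed, `TextImageReader(ocr=tesseract_ocr, tts=pyttsx3_speak).read_aloud("sign.jpg")` would read a photographed sign aloud (the file name is a placeholder). Passing the backends in as plain callables keeps the two modules independent, so either one can be swapped or replaced with a stub for testing.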

Keywords:
Computer science; Artificial intelligence; Computer vision; Image (mathematics); Visually impaired; Speech recognition; Pattern recognition (psychology); Algorithm; Computer graphics (images); Human–computer interaction

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 8
Citation Normalized Percentile: 0.06

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition