Automated Text-to-Audio Conversion for Visually Impaired People Using Optical Character Recognition

Shahin Kamali; V. Malathy; G Uma Devi; S S Deepa; M. Anand; S. Vaishnodevi; Ratchagaraja Dhairiyasamy; Subhav Singh

doi:10.47857/irjms.2025.v06i02.03672

JOURNAL ARTICLE

Year: 2025; Journal: International Research Journal of Multidisciplinary Scope; Vol: 06 (02); Pages: 992-1008

Abstract

This work extracts text from images and from documents such as Portable Document Format (PDF) and PowerPoint Presentation (PPT) files using Optical Character Recognition (OCR). The extracted text is converted to speech and saved as audio files, which are organized in a dedicated folder so that they are easy to find and play back. The goal is a tool that accepts documents, PDFs, or PPT files as input and extracts the letters and numbers they contain; such a tool is also useful for quick data entry from printed documents. The tool takes images as input and uses machine-based pattern recognition to extract characters from them. The system is implemented mainly in Python. OCR is first tested on still images using a Python wrapper for Tesseract to confirm that it works well. The solution is then applied to a live video feed from a smartphone, processed with OpenCV. The recognized text is converted to speech using Google Text-to-Speech (gTTS), so the system can read aloud any text it detects. By combining image processing, OCR, and text-to-speech, the system aims to make listening to text easy and enjoyable.
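The pipeline described in the abstract (OCR on an image, then speech synthesis, then audio files collected in one folder) might be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names `ocr_image`, `save_speech`, `normalize_text`, and `convert_to_audio`, the `.mp3` naming scheme, and the choice of English as the speech language are all assumptions. It assumes the `pytesseract`, `Pillow`, and `gTTS` packages and the Tesseract binary are installed; gTTS also needs network access.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def ocr_image(image_path: str) -> str:
    """Run Tesseract OCR on one image file (assumes pytesseract and Pillow)."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def save_speech(text: str, out_path: str) -> None:
    """Synthesize speech with gTTS and save it (English is an assumption)."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)


def normalize_text(raw: str) -> str:
    """Collapse the line breaks and repeated spaces that OCR output contains."""
    return " ".join(raw.split())


def convert_to_audio(
    image_paths: Iterable[str],
    out_dir: str,
    ocr: Callable[[str], str] = ocr_image,
    tts: Callable[[str, str], None] = save_speech,
) -> List[Path]:
    """OCR each image and write one audio file per image into out_dir.

    Images that yield no text are skipped. Returns the audio paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # one folder for all audio files
    written: List[Path] = []
    for image_path in image_paths:
        text = normalize_text(ocr(str(image_path)))
        if not text:
            continue  # nothing recognized, so there is nothing to read aloud
        target = out / (Path(image_path).stem + ".mp3")
        tts(text, str(target))
        written.append(target)
    return written
```

A call such as `convert_to_audio(["page1.png"], "audio_out")` (hypothetical file names) would write `audio_out/page1.mp3` if any text is recognized. The OCR and speech steps are passed in as callables so that a different text source, for example frames captured with OpenCV, could be substituted without changing the folder-handling logic.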

Keywords:

Visually impaired; Character (mathematics); Computer science; Optical character recognition; Speech recognition; Character recognition; Audio visual; Artificial intelligence; Natural language processing; Computer vision; Human–computer interaction; Multimedia; Image (mathematics); Mathematics

Topics

Vehicle License Plate Recognition

Physical Sciences → Engineering → Media Technology

Handwritten Text Recognition Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Subtitles and Audiovisual Media

Social Sciences → Arts and Humanities → Language and Linguistics
