CONVERSION OF TEXT IMAGE TO AUDIO FOR VISUALLY IMPAIRED PEOPLE USING CNN ALGORITHM

Anne Dheeraj Chowdary; Samudrala Venkata; Sai Sritwik Sreekar; C. Antony; K Mrinalini; J Olabe; A Santos; R Martinez; E Munoz; M Martinez; A Quilis; J Bernstein; R Kavaler; R Aggarwal; M Dave; M Ostendorf; V Digalakis; O Kimball

doi:10.56726/irjmets40138

JOURNAL ARTICLE

Year: 2023 Journal: International Research Journal of Modernization in Engineering Technology and Science

Abstract

The integration of text-to-speech (TTS) and optical character recognition (OCR) is transforming a range of applications, particularly for individuals with visual impairments. This framework lets users interact with computers through speech. Using OCR, a text-to-speech program scans images and converts text in more than 38 languages, as well as numerals, into spoken words. The project comprises two modules: a speech processing module and an image processing module. Previous approaches, including the pointy method, the connected-component method, the texture-based method, and the mathematical morphology method, have been employed, but their accuracy remains limited under assessment and evaluation. Text captured in photographs of magazines, newspapers, and banners has become increasingly valuable for conveying shared information, improving accessibility, and driving innovation in areas such as employment, earnings, efficiency, and problem-solving. Advancements in technology, particularly in machine learning algorithms and artificial intelligence, play a vital role in developing smart systems worldwide. Within this context, automated text detection and recognition in natural scenes using OCR is gaining prominence. A suitable speech synthesizer then pronounces the extracted text. A prototype of the proposed system has been developed and its functionality confirmed in an experimental setup. These findings support the system's underlying principles and demonstrate its potential for building advanced assistive technology for visually impaired individuals.
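
The two-module structure described in the abstract (an image processing module that extracts text, followed by a speech processing module that pronounces it) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `clean_ocr_text`, `image_to_speech`, `tesseract_recognize`, and `pyttsx3_speak` are hypothetical, and the choice of `pytesseract` and `pyttsx3` as backends is an assumption, since the abstract names no specific libraries. The CNN stage mentioned in the title is not modeled here.

```python
import re
from typing import Callable


def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output so it reads naturally when spoken."""
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse line breaks and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def image_to_speech(
    image_path: str,
    recognize: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run the image module, then the speech module; return the spoken text."""
    text = clean_ocr_text(recognize(image_path))
    if text:  # Stay silent when no text was found in the image.
        speak(text)
    return text


def tesseract_recognize(image_path: str) -> str:
    """Assumed OCR backend (pytesseract + Pillow); not named in the source."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def pyttsx3_speak(text: str) -> None:
    """Assumed offline speech synthesizer (pyttsx3); not named in the source."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

With both assumed libraries and the Tesseract binary installed, a call such as `image_to_speech("sign.png", tesseract_recognize, pyttsx3_speak)` would read the image aloud (the file name is a placeholder). Passing the two stages in as callables keeps the modules independent, so either backend can be swapped without touching the other.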

Keywords:

Computer science; Artificial intelligence; Computer vision; Image (mathematics); Visually impaired; Speech recognition; Pattern recognition (psychology); Algorithm; Computer graphics (images); Human–computer interaction

Topics

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Automated Text-to-Audio Conversion for Visually Impaired People Using Optical Character Recognition

Image to Audio Conversion to Aid Visually Impaired People by CNN

Adaptive Algorithm Based Text to Braille Conversion for Visually Impaired People

Image to audio frequencies modulation for visually impaired people

Scene to Text Conversion and Pronunciation for Visually Impaired People