A multi-font OCR system for printed Telugu text

C. Vasantha Lakshmi; C. Patvardhan

doi:10.1109/lec.2002.1182284

JOURNAL ARTICLE

Year: 2003 Pages: 7-17

Abstract

This work describes the design and development of TOSP, a Telugu optical character recognition (OCR) system for printed text. The pre-processing tasks considered are conversion of a grey-scale image to a binary image, image rectification, skew detection and removal, and segmentation of text into lines, words and basic symbols. Basic symbols are taken as the fundamental unit of segmentation, and these are what the recognizer identifies. The combinations of basic symbols that together form the characters and compound characters of Telugu are then determined to complete the recognition process. A distinguishing feature of TOSP is that it is designed to handle multiple sizes and multiple fonts. Further, the output produced by TOSP can be opened and edited directly in any Indian-language software that supports transliteration into Telugu script; several such software packages are widely available.
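The segmentation stage named in the abstract (text into lines, then words) can be illustrated with a projection-profile sketch. The abstract does not say which method TOSP uses, so this is an assumption chosen only because projection profiles are a common approach for printed text. The function names (`runs`, `segment_lines`, `segment_words`), the `min_gap` parameter and the toy `page` image are all hypothetical, and the input is assumed to be already binarized and deskewed, with 1 for ink and 0 for background.

```python
def runs(profile, min_gap=1):
    """Return (start, end) spans of consecutive non-zero entries in profile.

    Spans separated by fewer than min_gap zero entries are merged.
    End indices are exclusive.
    """
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ended just before i
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: join with previous span
        else:
            merged.append((s, e))
    return merged


def segment_lines(img):
    """Split a binary image (list of rows of 0/1) into text-line row spans."""
    row_profile = [sum(row) for row in img]  # ink count per row
    return runs(row_profile)


def segment_words(img, line, min_gap=2):
    """Split one text line into column spans, treating gaps >= min_gap as word breaks."""
    top, bottom = line
    width = len(img[0])
    col_profile = [sum(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(col_profile, min_gap)


# Hypothetical 5x7 toy image: two text lines, the first holding two words.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
]

lines = segment_lines(page)
words = [segment_words(page, line) for line in lines]
```

Here `min_gap` stands in for the inter-word spacing threshold. A real system would need to derive it from the font size, which matters for a multi-size, multi-font recognizer. Splitting Telugu words into basic symbols would need more than column gaps, since parts of a character can sit above or below the base symbol.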

Keywords:

Telugu; Optical character recognition; Computer science; Artificial intelligence; Font; Feature (linguistics); Skew; Software; Segmentation; Natural language processing; Transliteration; Speech recognition; Image segmentation; Process (computing); Pattern recognition (psychology); Image (mathematics)

Topics

Handwritten Text Recognition Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image Processing and 3D Reconstruction

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

An optical character recognition system for printed Telugu text

A high accuracy OCR system for printed Telugu text

Multilingual Translational Optical Character Recognition System for Printed Telugu Text

Multi-font Telugu Text Recognition Using Hidden Markov Models and Akshara Bi-grams

Multi-font printed Mongolian document recognition system