Optical character recognition (OCR) has benefited from the growth of high-powered desktop computing, which has enabled more powerful recognition software that can read a variety of common printed fonts and handwritten text. It nevertheless remains highly challenging to build an OCR system that works under all conditions and gives highly accurate results. This paper describes an OCR system for printed text documents in Malayalam, a language of the South Indian state of Kerala. The input to the system is a scanned image of a page of text, and the output is a machine-editable file. The image is first preprocessed to remove noise and correct skew. Lines, words, and characters are then segmented from the processed document image. The proposed method uses wavelet multi-resolution analysis to extract features and a feed-forward back-propagation neural network to perform recognition.
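The segmentation and feature-extraction stages described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a binarized page stored as a list of rows (1 = ink, 0 = background), uses a horizontal projection profile to find text lines, and uses the Haar wavelet for the multi-resolution step; the abstract does not name the wavelet or the segmentation method, so both are assumptions. The function names `segment_lines`, `haar_step`, and `haar_features` are hypothetical.

```python
def segment_lines(image):
    """Return (start, end) row ranges of text lines in a binary image.

    Assumption: a row containing no ink separates two text lines.
    """
    profile = [sum(row) for row in image]  # ink count per row
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y))       # the line ended on the previous row
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(image)))
    return lines


def haar_step(values):
    """One level of the 1-D Haar transform: pairwise averages and differences.

    A trailing unpaired element (odd length) is dropped.
    """
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)]
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_features(glyph, levels=2):
    """Flattened Haar approximation of a glyph after `levels` decompositions.

    Each level halves the width and the height, so the glyph sides should be
    divisible by 2 ** levels. Only the approximation band is kept here.
    """
    current = [[float(v) for v in row] for row in glyph]
    for _ in range(levels):
        rows = [haar_step(r)[0] for r in current]            # halve the width
        cols = [haar_step(list(c))[0] for c in zip(*rows)]   # halve the height
        current = [list(r) for r in zip(*cols)]              # back to row-major
    return [v for row in current for v in row]


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(segment_lines(page))             # [(1, 3), (4, 5)]
print(haar_features(glyph, levels=1))  # [1.0, 0.0, 0.0, 0.5]
```

In a full system, the same projection idea applied to columns would separate words and characters, and the feature vector from `haar_features` would serve as the input layer of the neural network classifier.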
Shyla Afroge, Boshir Ahmed, Firoz Mahmud
Shrinivas R. Zanwar, Abbhilasha S. Narote, S. P. Narote