Optical character recognition (OCR) tools can convert image-based documents into digital, processable form quite successfully. However, preserving the format of the original document remains a problem, and reading tabular data is an important instance of it. In this paper, a method is proposed in which the tabular content of hard-copy documents is extracted from the text and character positions obtained from an OCR tool and transferred to digital form. The performance of the method is measured by the number of detected rows and columns and is presented alongside the results of commercial products.
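The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of rebuilding a table from OCR word positions, not the authors' method. It assumes each OCR word arrives with a bounding box; the `Word` type, the function names, and the `y_tol` and `min_gap` thresholds are all hypothetical. Rows are formed by grouping words with similar vertical centers, and columns are formed by merging overlapping horizontal extents and treating wide gaps as column separators.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR word with its bounding box (hypothetical structure)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_rows(words, y_tol=5.0):
    """Group words into rows by the vertical center of their boxes."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y0 + w.y1) / 2):
        cy = (w.y0 + w.y1) / 2
        # Join the current row if the center is close to that row's first word.
        if rows and abs(cy - rows[-1]["cy"]) <= y_tol:
            rows[-1]["words"].append(w)
        else:
            rows.append({"cy": cy, "words": [w]})
    return [sorted(r["words"], key=lambda w: w.x0) for r in rows]


def column_spans(words, min_gap=10.0):
    """Merge horizontal extents; a gap of at least min_gap starts a new column."""
    spans = []
    for x0, x1 in sorted((w.x0, w.x1) for w in words):
        if spans and x0 - spans[-1][1] < min_gap:
            spans[-1][1] = max(spans[-1][1], x1)
        else:
            spans.append([x0, x1])
    return spans


def build_table(words, y_tol=5.0, min_gap=10.0):
    """Return a list of rows, each a list of cell strings."""
    spans = column_spans(words, min_gap)
    table = []
    for row in group_rows(words, y_tol):
        cells = [[] for _ in spans]
        for w in row:
            cx = (w.x0 + w.x1) / 2
            # Every word's extent was merged into exactly one span.
            idx = next(i for i, (a, b) in enumerate(spans) if a <= cx <= b)
            cells[idx].append(w.text)
        table.append([" ".join(c) for c in cells])
    return table


# Made-up sample positions, for illustration only.
sample = [
    Word("Item", 10, 10, 40, 20), Word("Price", 100, 10, 140, 20),
    Word("Apple", 10, 30, 50, 40), Word("1.20", 100, 31, 130, 41),
    Word("Green", 10, 50, 45, 60), Word("pear", 50, 50, 80, 60),
    Word("0.90", 100, 50, 130, 60),
]
table = build_table(sample)
```

With the sample above, the counts `len(table)` and `len(table[0])` give the detected number of rows and columns, which is the kind of measure the abstract uses for evaluation. Fixed thresholds are a simplification; a real document would likely need them scaled to the font size or estimated from the data.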
T. Kameswara Rao, K. Yashwanth Chowdary, I. Koushik Chowdary, K. Prasanna Kumar, Ch. Ramesh
Nikhil Kushwaha, Om Asati, Mainak Sadhya
AngelJean Jisha .M, Vijayalakshmi Shivkhumar