In this paper we present a multi-font OCR system for document processing that simultaneously performs character recognition and font-style detection for digits drawn from a subset of the existing fonts. Detecting the font style of the words in a document can guide a rough automatic classification of documents, and can also be used to improve character recognition. The system uses the tangent distance as the classification function in a nearest-neighbour approach. It must discriminate among different digits and, for the same digit, among different font styles. The nearest-neighbour approach is always able to recognize the digit, but its performance in font detection is not optimal. To improve the performance of the system, we use a discriminant model, the TD-Neuron, which is employed to discriminate between two similar classes. Some experimental results are presented, and the prospective use of the system in document processing applications is discussed.
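To make the classification step concrete, the following is a minimal sketch of nearest-neighbour classification with a tangent distance, not the system's actual implementation. It assumes the one-sided variant of the tangent distance (the distance from the input to the tangent plane of each prototype); the abstract does not say which variant is used. The function names, the (digit, font) label tuples, and the toy vectors are hypothetical, and the TD-Neuron itself is not sketched.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip (near-)dependent tangent vectors
            basis.append([wi / norm for wi in w])
    return basis


def tangent_distance(x, prototype, tangents):
    """One-sided tangent distance (an assumption, see lead-in):
    Euclidean distance from x to the affine subspace
    prototype + span(tangents)."""
    residual = [xi - pi for xi, pi in zip(x, prototype)]
    for b in orthonormalize(tangents):
        c = dot(residual, b)
        residual = [ri - c * bi for ri, bi in zip(residual, b)]
    return math.sqrt(dot(residual, residual))


def classify(x, prototypes):
    """Nearest neighbour: return the (digit, font) label of the
    prototype with the smallest tangent distance to x.
    `prototypes` is a list of (label, vector, tangents) triples."""
    best = min(prototypes, key=lambda p: tangent_distance(x, p[1], p[2]))
    return best[0]


# Toy data: 4-pixel "images"; the tangent approximates a horizontal shift.
shift = [-0.5, 0.0, 0.5, 0.0]
prototypes = [
    (("3", "serif"), [0.0, 1.0, 0.0, 0.0], [shift]),
    (("3", "sans"), [0.0, 0.0, 0.0, 1.0], []),
]
label = classify([-1.0, 1.0, 1.0, 0.0], prototypes)
```

Because each label carries both the digit and the font style, a single nearest-neighbour search yields both decisions at once; a pairwise discriminant such as the TD-Neuron would then only be needed for font classes that this search confuses.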