Text is rich in information. Scene text detection remains a challenging problem in machine vision because of variations in script, font, color, scale, lighting, and viewing angle, along with other distortions present in the scene. Scene text reading generally requires a high-performance computing platform, a large training dataset, and a long training process. We attempt to train an autoencoder-based text detector to localize text precisely with minimal training on a small dataset and limited computational resources. The method computes a principal component analysis of the image and a morphological gradient to enhance text in the scene image, then feeds the result to a gradient autoencoder neural network to locate possible text components. The proposed detector can detect scripts from multiple languages and is fairly robust to variations in color, lighting, scale, orientation, and font. It is trained with only 167 training images from the MRRC dataset. Experiments show that the method achieves F-measures of 0.76 and 0.77 on the MRRC and MSRA-TD500 datasets, respectively.
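The abstract names the preprocessing steps without giving their details, so the following is only a minimal sketch of how a PCA projection followed by a morphological gradient might enhance text-like edges. It assumes that PCA is applied to the RGB values of the pixels and that a flat 3x3 square structuring element is used; the function names are hypothetical, and the autoencoder stage is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def first_principal_component(image):
    """Project each pixel's color onto the first principal axis (assumed PCA step)."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the c x c color covariance matrix.
    cov = centered.T @ centered / max(len(centered) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    principal_axis = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order
    return (centered @ principal_axis).reshape(h, w)


def morphological_gradient(gray, size=3):
    """Dilation minus erosion with a flat size x size square (size must be odd)."""
    pad = size // 2
    padded = np.pad(gray, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return windows.max(axis=(-2, -1)) - windows.min(axis=(-2, -1))


def enhance_text(image):
    """Return a map scaled to [0, 1] that is large near strong intensity transitions."""
    gradient = morphological_gradient(first_principal_component(image))
    peak = gradient.max()
    return gradient / peak if peak > 0 else gradient
```

Calling `enhance_text` on an `(H, W, 3)` array returns an `(H, W)` map that is close to zero in flat regions and high along stroke boundaries. In a pipeline like the one described, such a map would be the input passed to the autoencoder.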