Veronica Naosekpam, Ailneni Sai Shishir, Nilkanta Sahu
Recognizing text in natural scene images is difficult because perspective distortion and low camera angles often produce non-horizontal text. This work presents a framework for natural scene text recognition that includes a text orientation correction module based on the Inverse Compositional Spatial Transformer Network (IC-STN) [1]. The text recognition module is an attentional encoder-decoder sequence network. The IC-STN is a differentiable module, so it can be trained with the error backpropagated from the text recognition network. At test time, the IC-STN module corrects the orientation of the input image by estimating the parameters of a geometric transformation, such as an affine or homographic transformation. This, in turn, yields a more easily recognizable form of the text for the sequence recognition network. To the best of our knowledge, this is the first time IC-STN has been incorporated into a scene text recognition pipeline. Experiments on the benchmark datasets ICDAR 2013 and CUTE80 show that the proposed work performs better than existing state-of-the-art text recognition systems.
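The rectification step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a 6-parameter affine warp, composes warp updates as matrix products (one common convention), and uses a hypothetical callable `predict_update` as a stand-in for the learned geometric predictor. All function names are illustrative, and training by backpropagation is omitted.

```python
import numpy as np

def params_to_matrix(p):
    """Map 6 affine parameters to a 3x3 warp matrix; p = 0 is the identity."""
    p = np.asarray(p, dtype=float)
    return np.array([[1.0 + p[0], p[1], p[2]],
                     [p[3], 1.0 + p[4], p[5]],
                     [0.0, 0.0, 1.0]])

def matrix_to_params(m):
    """Inverse of params_to_matrix for affine matrices."""
    return np.array([m[0, 0] - 1.0, m[0, 1], m[0, 2],
                     m[1, 0], m[1, 1] - 1.0, m[1, 2]])

def compose(p, dp):
    """Compose the current warp p with an update dp via a matrix product."""
    return matrix_to_params(params_to_matrix(p) @ params_to_matrix(dp))

def warp_image(image, p):
    """Resample a 2-D image under warp p with bilinear interpolation.

    Coordinates are normalised to [-1, 1]; samples outside the image are 0.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w),
                         indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = params_to_matrix(p) @ grid
    # Convert normalised source coordinates back to pixel coordinates.
    x = (src[0] + 1.0) * (w - 1) / 2.0
    y = (src[1] + 1.0) * (h - 1) / 2.0
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    wx, wy = x - x0, y - y0

    def sample(yy, xx):
        # Read pixels at integer locations, returning 0 outside the image.
        valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
        out = np.zeros(h * w)
        out[valid] = image[yy[valid], xx[valid]]
        return out

    out = ((1 - wy) * (1 - wx) * sample(y0, x0)
           + (1 - wy) * wx * sample(y0, x0 + 1)
           + wy * (1 - wx) * sample(y0 + 1, x0)
           + wy * wx * sample(y0 + 1, x0 + 1))
    return out.reshape(h, w)

def rectify(image, predict_update, num_steps=4):
    """Iteratively refine warp parameters, then warp the image once.

    predict_update is a hypothetical stand-in for the learned predictor:
    it takes the currently warped image and returns a 6-vector update.
    """
    p = np.zeros(6)
    for _ in range(num_steps):
        warped = warp_image(image, p)  # always resample the original image
        p = compose(p, predict_update(warped))
    return warp_image(image, p), p
```

In this sketch the warp parameters, rather than the warped pixels, are carried from step to step, so the image is interpolated from the original input each time instead of being repeatedly resampled. In a trainable version, `predict_update` would be a small network and the rectified output would feed the recognition network, whose loss supplies the gradient.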
Gang Wang, Hua ping Zhang, Jian yun Shang
Baoguang Shi, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai
Wenjun Ke, Jianguo Wei, Qingzhi Hou, Hui Feng
Zhaowei Cai, Enqi Zhan, Sui Lei, Yu Wang, Jian Zhou
Yi Zhang, Zhiwen Li, Lei Guo, Wenbi Rao