Zhen SongHuanshui ZhangPeng Cui
In this paper, we propose a new algorithm that conjointly address scene text detection and recognition by sharing convolutional feature map. Compared with most systems which consider text detection and recognition to be irrelevant tasks, we integrate text detection and recognition into an end-to-end trainable neural network based on convolutional and recurrent neural network It can precisely detect and recognize text during a simple forward propagation, avoiding redundant processes like image patch cropping, repeated calculation of feature map. We train this unified neural network just by images and corresponding ground truth bounding boxes and text labels. Our algorithm gains outstanding performance in terms of computation time and accuracy on standard benchmark datasets. The proposed model runs robustly on multi-ratios images without complicated post-processing steps.
Guangcun WeiWansheng RongYongquan LiangXinguang XiaoXiang Liu
Mengjie ZhongXihan WangLian-He ShaoQuanli Gao
Qiao LiangSanli TangZhanzhan ChengYunlu XuYi NiuShiliang PuFei Wu