Driven by the demand for information transmission, identification, and archiving, the digitization of document images is attracting increasing attention. Detecting text regions is the first and crucial step in an end-to-end text recognition system. For document images with complex backgrounds, text detection remains a challenging problem because of the variety of fonts, sizes, and colors of the text as well as the complexity of the background. This paper presents a system based on the Connectionist Text Proposal Network (CTPN) for extracting text regions from document images with complex backgrounds. The method consists of two fundamental stages: detecting fine-scale text components, and extracting text lines from the obtained components. We tried several backbone networks for feature extraction, such as VGG19 and ResNet50, and evaluated the system's performance on several datasets, including ICDAR2011, ICDAR2013, and a private dataset of real book covers. In addition, we built an online visual evaluation system to compare the results.
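The second stage, building text lines from fine-scale text components, can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the greedy left-to-right chaining, and the thresholds `max_gap=50` pixels and `min_overlap=0.7` are assumptions chosen for illustration. Each component is taken to be an axis-aligned box `(x1, y1, x2, y2)`.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def vertical_overlap(a: Box, b: Box) -> float:
    """Vertical intersection divided by the smaller of the two box heights."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return inter / min(a[3] - a[1], b[3] - b[1])


def group_into_lines(
    boxes: List[Box],
    max_gap: float = 50.0,      # assumed horizontal gap threshold, in pixels
    min_overlap: float = 0.7,   # assumed vertical overlap threshold
) -> List[Box]:
    """Greedily chain fine-scale boxes, left to right, into text-line boxes."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            close_enough = box[0] - last[2] <= max_gap
            if close_enough and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepts this box, so it starts a new line.
            lines.append([box])
    # Each text line is reported as the bounding box of its components.
    return [
        (
            min(b[0] for b in line),
            min(b[1] for b in line),
            max(b[2] for b in line),
            max(b[3] for b in line),
        )
        for line in lines
    ]


components = [
    (0, 10, 16, 30), (16, 11, 32, 31), (32, 10, 48, 30),  # first text line
    (200, 100, 216, 120), (216, 100, 232, 121),           # second text line
]
text_lines = group_into_lines(components)
```

Here the five components are merged into two text-line boxes, because the horizontal gap between the third and fourth components exceeds `max_gap`.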
Chao Liu, Fei Peng Da, Chenxing Wang
Hoai Nam Vu, Tuan Anh Tran, In Seop Na, Soo Hyung Kim
Qingqing Wang, Xiao Tu, Shujing Lu, Yue Lu
Ruiqing Wu, Hang Zeng, Jun Xie, Hang Hao, Qingshui Gu, Wei Chen