JOURNAL ARTICLE

A System for Text Extraction in Complex-Background Document Images

Abstract

Driven by the demand for transmitting, identifying, and archiving information, the digitization of document images is receiving increasing attention. Detecting text regions is the first and crucial step in an end-to-end text recognition system. For document images with complex backgrounds, text detection remains a challenging problem because of the variety of fonts, sizes, and colors of the text and the complexity of the background. This paper presents a system based on the Connectionist Text Proposal Network (CTPN) for extracting text regions from document images with complex backgrounds. The method consists of two fundamental stages: detecting fine-scale text components and extracting text lines from the detected components. We tried several backbone networks for feature extraction, such as VGG19 and ResNet50, and evaluated the system's performance on several datasets, including ICDAR2011, ICDAR2013, and a private dataset of real book covers. In addition, we built an online visual evaluation system to compare the results.
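
The second stage described in the abstract, building text lines from fine-scale text components, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, the greedy left-to-right chaining, and the thresholds `max_gap=50.0` and `min_overlap=0.7` are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Shared vertical extent of two boxes, divided by the smaller height."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return inter / min(a[3] - a[1], b[3] - b[1])


def group_into_lines(
    proposals: Sequence[Box],
    max_gap: float = 50.0,      # assumed maximum horizontal gap between neighbours
    min_overlap: float = 0.7,   # assumed minimum vertical overlap between neighbours
) -> List[Box]:
    """Greedily chain fine-scale proposals, left to right, into text-line boxes."""
    lines: List[List[Box]] = []
    for box in sorted(proposals, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            close_enough = box[0] - last[2] <= max_gap
            if close_enough and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepts this proposal, so it starts a new line.
            lines.append([box])
    # Each text line is reported as the bounding box of its member proposals.
    return [
        (
            min(b[0] for b in line),
            min(b[1] for b in line),
            max(b[2] for b in line),
            max(b[3] for b in line),
        )
        for line in lines
    ]
```

Calling `group_into_lines` on a list of narrow proposal boxes returns one bounding box per recovered text line. Proposals that are far apart horizontally or that barely overlap vertically end up in separate lines.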

Keywords:
Computer science; Digitization; Artificial intelligence; Document layout analysis; Feature extraction; Information retrieval; Optical character recognition; Variety (cybernetics); Identification (biology); Text detection; Feature (linguistics); Natural language processing; Image (mathematics); Computer vision

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.11
Refs: 25
Citation Normalized Percentile: 0.49

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing and 3D Reconstruction
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Text Extraction from Complex Background Images

Chao Liu, Fei Peng Da, Chenxing Wang

Journal: Advanced Materials Research Year: 2013 Vol: 765-767 Pages: 975-979

JOURNAL ARTICLE

Extraction of Text Regions from Complex Background in Document Images by Multilevel Clustering

Hoai Nam Vu, Tuan Anh Tran, In Seop Na, Soo Hyung Kim

Journal: The International Journal of Networked and Distributed Computing Year: 2016 Vol: 4 (1) Pages: 11-11

BOOK-CHAPTER

Text Extraction from Mail Images with Complex Background

Qingqing Wang, Xiao Tu, Shujing Lu, Yue Lu

Communications in Computer and Information Science Year: 2018 Pages: 3-11

JOURNAL ARTICLE

Text Extraction in Complex Color Document Images for Enhanced Readability

P. Nagabhushan, S. Nirmala

Journal: Intelligent Information Management Year: 2010 Vol: 02 (02) Pages: 120-133