Text extraction from gray scale document images using edge information

Quan Yuan; C.L. Tan

doi:10.1109/icdar.2001.953803

JOURNAL ARTICLE

Year: 2002 Pages: 302-306

Abstract

In this paper we present a method that uses edge information to extract textual blocks from gray scale document images. It aims to detect textual regions in heavily noise-infected newspaper images and separate them from graphical regions. The algorithm traces the feature points in different entities and then groups the edge points of textual regions. Using line approximation and layout categorization, it can successfully retrieve directionally placed text blocks. Finally, feature-based connected component merging is introduced to gather homogeneous textual regions together within the scope of their bounding rectangles. By handling line segments instead of individual pixels, we obtain correct page decomposition with efficient computation and reduced memory size. The proposed method has been tested on a large group of newspaper images with multiple page layouts, and promising results confirm its effectiveness.
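The pipeline outlined in the abstract (find edge points, group them into line segments, then merge nearby segments by their bounding rectangles) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain horizontal intensity-difference threshold stands in for the edge detector, segments are restricted to single rows, and line approximation and layout categorization are omitted. All function names, the `(left, top, right, bottom)` rectangle layout, and the `threshold`, `max_gap`, and `merge_gap` parameters are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive; hypothetical layout


def edge_points(image: Sequence[Sequence[int]], threshold: int = 50) -> List[Tuple[int, int]]:
    """Return (row, col) positions where the horizontal intensity jump exceeds threshold."""
    points = []
    for r, row in enumerate(image):
        for c in range(1, len(row)):
            if abs(row[c] - row[c - 1]) > threshold:
                points.append((r, c))
    return points


def row_segments(points: List[Tuple[int, int]], max_gap: int = 2) -> List[Tuple[int, int, int]]:
    """Group edge points on the same row into segments (row, col_start, col_end)."""
    segments: List[Tuple[int, int, int]] = []
    for r, c in sorted(points):
        if segments and segments[-1][0] == r and c - segments[-1][2] <= max_gap:
            # Extend the current segment on this row.
            segments[-1] = (r, segments[-1][1], c)
        else:
            segments.append((r, c, c))
    return segments


def rects_close(a: Rect, b: Rect, gap: int) -> bool:
    """True if the rectangles overlap or lie within `gap` pixels of each other."""
    return (a[0] <= b[2] + gap and b[0] <= a[2] + gap
            and a[1] <= b[3] + gap and b[1] <= a[3] + gap)


def merge_rects(rects: List[Rect], gap: int = 1) -> List[Rect]:
    """Repeatedly union rectangles that are close, until no more merges occur."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        out: List[Rect] = []
        for r in rects:
            for i, o in enumerate(out):
                if rects_close(r, o, gap):
                    out[i] = (min(r[0], o[0]), min(r[1], o[1]),
                              max(r[2], o[2]), max(r[3], o[3]))
                    merged = True
                    break
            else:
                out.append(r)
        rects = out
    return rects


def text_blocks(image: Sequence[Sequence[int]], threshold: int = 50,
                max_gap: int = 2, merge_gap: int = 1) -> List[Rect]:
    """Edge points -> row segments -> merged bounding rectangles."""
    segments = row_segments(edge_points(image, threshold), max_gap)
    rects = [(c0, r, c1, r) for r, c0, c1 in segments]
    return sorted(merge_rects(rects, merge_gap))
```

Calling `text_blocks` on a gray scale image given as a list of rows returns one rectangle per cluster of dense edge activity. Working on segments rather than individual pixels keeps the merge step small, which is the efficiency argument the abstract makes.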

Keywords:

Computer science; Artificial intelligence; Pattern recognition (psychology); Computation; Grayscale; Pixel; Feature extraction; Bounding overwatch; Categorization; Enhanced Data Rates for GSM Evolution; Newspaper; Computer vision; Information retrieval; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 0.91

Citation Normalized Percentile: 0.76

Topics

Handwritten Text Recognition Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image and Object Detection Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Text Extraction from Document Images Using Edge Information

Text extraction from gray scale historical document images using adaptive local connectivity map

Boundary feature extraction from gray-scale document images

Text line segmentation for gray scale historical document images

Text extraction from degraded document images