JOURNAL ARTICLE

Text extraction from web images

Changsong Liu, Cheng Yang, Xiaoqing Ding, Jian Fan

Year: 2011 Journal: Proceedings of SPIE, the International Society for Optical Engineering Vol: 7879 Pages: 78790P-78790P Publisher: SPIE

Abstract

Web images constitute an important part of web documents and have become a powerful medium of expression, especially images that contain text. The text embedded in web images often carries semantic information related to the layout and content of the pages. Statistics show that there is a significant need to detect and recognize text in web images. In this paper, we first give a short review of the methods proposed for text detection and recognition in web images; we then present a framework for extracting text from web images, comprising text localization and text recognition stages. In the text localization stage, a localization method is applied to generate text candidates, a two-stage strategy is used to select among them, and text regions are then localized using a coarse-to-fine text line extraction algorithm. For text recognition, two text region binarization methods are proposed to improve recognition performance on web images. Experimental results for text localization and recognition demonstrate the effectiveness of these methods. Additionally, a recognition evaluation for text regions in web images has been conducted as a benchmark.

Keywords:
Computer science; Text recognition; Web page; Artificial intelligence; Information retrieval; Benchmark (surveying); Image (mathematics); Feature extraction; Text mining; Noisy text analytics; Text graph; Pattern recognition (psychology); Natural language processing; World Wide Web

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.00
Refs: 25
Citation Normalized Percentile: 0.08

Topics

Handwritten Text Recognition Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Web Data Mining and Analysis
Physical Sciences → Computer Science → Information Systems
