JOURNAL ARTICLE

Reading Scene Text in Deep Convolutional Sequences

Pan He, Weilin Huang, Yu Qiao, Chen Change Loy, Xiaoou Tang

Year: 2016 Journal: Proceedings of the AAAI Conference on Artificial Intelligence Vol: 30 (1) Publisher: Association for the Advancement of Artificial Intelligence

Abstract

We develop a Deep-Text Recurrent Network (DTRN) that regards scene text reading as a sequence labelling problem. We leverage recent advances in deep convolutional neural networks to generate an ordered high-level sequence from a whole word image, avoiding the difficult character segmentation problem. Then a deep recurrent model, building on long short-term memory (LSTM), is developed to robustly recognise the generated CNN sequences, departing from most existing approaches, which recognise each character independently. Our model has a number of appealing properties in comparison to existing scene text recognition methods: (i) it can recognise highly ambiguous words by leveraging meaningful context information, allowing it to work reliably without either pre- or post-processing; (ii) the deep CNN feature is robust to various image distortions; (iii) it retains the explicit order information in the word image, which is essential to discriminate word strings; (iv) the model does not depend on a pre-defined dictionary, and it can process unknown words and arbitrary strings. It achieves impressive results on several benchmarks, advancing the state of the art substantially.
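The segmentation-free idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say how the ordered sequence is produced or how frame labels become a word, so the sliding window, the blank label `-`, and the CTC-style collapse used here are assumptions, and the function names `sliding_windows` and `collapse_labels` are hypothetical. The CNN and LSTM are left out entirely; hand-written frame labels stand in for their output.

```python
from itertools import groupby

BLANK = "-"  # hypothetical "no character" label; an assumption of this sketch


def sliding_windows(image, width, stride):
    """Cut a word image (a list of pixel rows) into left-to-right column windows."""
    n_cols = len(image[0])
    windows = []
    for start in range(0, n_cols - width + 1, stride):
        windows.append([row[start:start + width] for row in image])
    return windows


def collapse_labels(frame_labels):
    """Merge repeated per-frame labels, then drop blanks (CTC-style greedy collapse)."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)


# A toy 2x6 "image". A real system would pass each window through a CNN and
# the resulting ordered feature sequence through an LSTM to get frame labels.
image = [[0, 1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10, 11]]
windows = sliding_windows(image, width=2, stride=2)

# Hypothetical per-frame predictions standing in for the recurrent model's output.
frame_labels = ["c", "c", "-", "a", "-", "t", "t"]
word = collapse_labels(frame_labels)
```

Because the windows keep their left-to-right order and the word is read off the whole label sequence, no step has to decide where one character ends and the next begins, and the output is not restricted to a fixed dictionary.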

Keywords:
Computer science; Artificial intelligence; Convolutional neural network; Leverage (statistics); Natural language processing; Deep learning; Context (archaeology); Sequence (biology); Character (mathematics); Pattern recognition (psychology); Word (group theory); Recurrent neural network; Feature (linguistics); Artificial neural network; Linguistics

Metrics

Cited By: 317
FWCI (Field Weighted Citation Impact): 19.92
Refs: 45
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Reading Scene Text with Aggregated Temporal Convolutional Encoder

Tianlong Ma, Xiangcheng Du, Xingjiao Wu, Zhao Zhou, Yingbin Zheng, Cheng Jin

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing Year: 2023 Vol: 22 (11) Pages: 1-16

JOURNAL ARTICLE

Reading scene text with fully convolutional sequence modeling

Yunze Gao, Yingying Chen, Jinqiao Wang, Ming Tang, Hanqing Lu

Journal: Neurocomputing Year: 2019 Vol: 339 Pages: 161-170

JOURNAL ARTICLE

Enhanced Scene Text Extraction through “Texture Analysis and Deep Convolutional Networks”

Shilpi Rani

Journal: Journal of Information Systems Engineering & Management Year: 2025 Vol: 10 (54s) Pages: 60-69