JOURNAL ARTICLE

Memory-Augmented Attention Model for Scene Text Recognition

Abstract

Natural scene text recognition is a challenging task. The attention-based encoder-decoder framework has achieved state-of-the-art performance. However, for some complex or low-quality images, the alignments estimated by the content-based attention network are not accurate enough, so the generated glimpse vector is not powerful enough to represent the character being predicted at the current time step. To solve this problem, in this paper we propose a memory-augmented attention model for scene text recognition. The proposed memory-augmented attention network (MAAN) feeds the part of the character sequence already generated, together with the full history of attended alignments, to the attention model when predicting the character at the current time step. The whole network can be trained end-to-end. Experimental results on several challenging benchmark datasets demonstrate that the proposed model achieves performance comparable to or better than state-of-the-art methods.
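
The idea of conditioning attention on what has already been generated and attended can be illustrated with a minimal NumPy sketch. This is not the formulation from the paper, which the abstract does not spell out: it assumes a standard additive attention score extended with two extra terms, one for a summary of the characters generated so far and one for the accumulated alignment history. All names (`memory_augmented_attention_step`, `decode_demo`, the parameter matrices, the dimensions) are hypothetical, and random vectors stand in for a trained encoder, decoder and character embedding.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def memory_augmented_attention_step(features, state, char_memory, align_history, params):
    """One decoding step of additive attention with two memory inputs (assumed form).

    features:      (T, D) encoder feature sequence
    state:         (H,)   decoder hidden state
    char_memory:   (E,)   summary of the characters generated so far
    align_history: (T,)   sum of attention weights from all earlier steps
    """
    W_f, W_s, W_c, w_a, v = params
    # Combine content, decoder state, character memory and alignment memory.
    hidden = np.tanh(
        features @ W_f                   # (T, A) content term
        + state @ W_s                    # (A,)   broadcast over positions
        + char_memory @ W_c              # (A,)   broadcast over positions
        + np.outer(align_history, w_a)   # (T, A) per-position history term
    )
    alpha = softmax(hidden @ v)          # (T,) alignment over positions
    glimpse = alpha @ features           # (D,) weighted sum of features
    return alpha, glimpse


def decode_demo(steps=3, T=6, D=4, H=5, E=3, A=8, seed=0):
    """Run a few steps with random stand-ins for a trained model."""
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(T, D))
    params = (
        rng.normal(size=(D, A)),
        rng.normal(size=(H, A)),
        rng.normal(size=(E, A)),
        rng.normal(size=A),
        rng.normal(size=A),
    )
    state = rng.normal(size=H)     # stand-in for a recurrent decoder state
    char_memory = np.zeros(E)      # nothing generated yet
    align_history = np.zeros(T)    # nothing attended yet
    glimpses = []
    for _ in range(steps):
        alpha, glimpse = memory_augmented_attention_step(
            features, state, char_memory, align_history, params
        )
        align_history = align_history + alpha   # remember where we looked
        char_memory = rng.normal(size=E)        # stand-in for the updated character summary
        glimpses.append(glimpse)
    return align_history, np.stack(glimpses)
```

Calling `decode_demo(steps=3)` returns the accumulated alignment history and one glimpse vector per step. In a real recognizer the decoder state and character summary would come from trained networks rather than random draws.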

Keywords:
Computer science; Benchmark (surveying); Task (project management); Character (mathematics); Artificial intelligence; Encoder; Attention network; Sequence (biology); Pattern recognition (psychology); Speech recognition

Metrics

Cited By: 12
FWCI (Field Weighted Citation Impact): 0.87
Refs: 45
Citation Normalized Percentile: 0.75

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing and 3D Reconstruction
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Sequential alignment attention model for scene text recognition

Yan Wu, Jiaxin Fan, Renshuai Tao, Jiakai Wang, Haotong Qin, Aishan Liu, Xianglong Liu

Journal: Journal of Visual Communication and Image Representation Year: 2021 Vol: 80 Pages: 103289-103289

JOURNAL ARTICLE

GLaLT: Global-Local Attention-Augmented Light Transformer for Scene Text Recognition

Hui Zhang, Guiyang Luo, Jian Kang, Shan Huang, Xiao Wang, Fei‐Yue Wang

Journal: IEEE Transactions on Neural Networks and Learning Systems Year: 2023 Vol: 35 (7) Pages: 10145-10158

JOURNAL ARTICLE

Deep neural network with attention model for scene text recognition

Shuo Li, Min Tang, Qiang Guo, Jun Lei, Jun Zhang

Journal: IET Computer Vision Year: 2017 Vol: 11 (7) Pages: 605-612