JOURNAL ARTICLE

Scene Text Recognition with Multi-Encoders

Yao Wang, Jong-Eun Ha

Year: 2022 Journal: 2022 22nd International Conference on Control, Automation and Systems (ICCAS) Pages: 1615-1620

Abstract

Although text recognition has evolved significantly over the years, current models still face major challenges on irregular text images, such as complex backgrounds, curved text, diverse fonts, and distortions. CNN-based text recognition networks perform well but remain affected by these challenges. Recently, transformer-based feature extractors have shown clear advantages for global feature extraction on images. This is especially useful for irregular text images, where self-attention can relate the different parts of the image to one another and thereby reduce the influence of the irregular distribution of characters. Therefore, this paper proposes MESTR (Multi-Encoders Scene Text Recognition), which combines a CNN-based [1] [2] [6] feature extractor with a transformer-based feature extractor. MESTR extracts local and global features of text images at the same time and then integrates the global features into the local features. During training, we use CTC [6] to guide training in the decoder part, as a compensating training strategy for the attentional decoder. Experimental results demonstrate that the proposed MESTR achieves competitive results on all seven benchmarks. We also provide ablation experiments to show the effectiveness of the improved parts of the text recognition model.
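The abstract does not specify how global features are integrated into local features or how the CTC guide is weighted against the attentional decoder. As a rough illustration only, the sketch below assumes one plausible reading: each local (CNN) feature attends over the global (transformer) features with scaled dot-product attention and adds the result back as a residual, and the two training losses are combined as a weighted sum. The names `fuse_global_into_local`, `joint_loss`, and `ctc_weight` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fuse_global_into_local(local_feats, global_feats):
    """Hypothetical fusion (an assumption, not the paper's method):
    every local feature queries all global features and adds the
    attention-weighted summary to itself as a residual."""
    dim = len(local_feats[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in local_feats:
        weights = softmax([dot(query, key) * scale for key in global_feats])
        context = [
            sum(w * g[i] for w, g in zip(weights, global_feats))
            for i in range(dim)
        ]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


def joint_loss(attn_loss, ctc_loss, ctc_weight=0.5):
    """Hypothetical weighted sum of the attentional-decoder loss and
    the CTC guide loss; the weighting scheme is an assumption."""
    return (1.0 - ctc_weight) * attn_loss + ctc_weight * ctc_loss


# Toy example: three local positions, two global tokens, feature size 2.
local_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_feats = [[0.5, 0.5], [2.0, -1.0]]
fused = fuse_global_into_local(local_feats, global_feats)
```

The fused sequence keeps the length and width of the local features, so under this assumption it could be passed to a decoder in place of the CNN features alone.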

Keywords:
Computer science; Feature extraction; Artificial intelligence; Extractor; Encoder; Text recognition; Feature (linguistics); Pattern recognition (psychology); Transformer; Facial recognition system; Text detection; Speech recognition; Computer vision; Image (mathematics); Engineering

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.07
Refs: 31
Citation Normalized Percentile: 0.34

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing and 3D Reconstruction
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Vehicle License Plate Recognition
Physical Sciences →  Engineering →  Media Technology

Related Documents

JOURNAL ARTICLE

Scene Text Recognition With Dual Encoders

Yao Wang, Jong-Eun Ha

Journal: Journal of Institute of Control Robotics and Systems Year: 2023 Vol: 29 (12) Pages: 973-979

JOURNAL ARTICLE

Scene Text Recognition with Multi-decoders

Yao Wang, Jong-Eun Ha

Journal: 2021 21st International Conference on Control, Automation and Systems (ICCAS) Year: 2021 Pages: 1523-1528

BOOK-CHAPTER

Scene Text Detection with Gradient Auto Encoders

S. Raveeshwara, B. H. Shekar

Communications in computer and information science Year: 2023 Pages: 350-361

JOURNAL ARTICLE

Scene Text Recognition with Transformer using Multi-patches

Yao Wang, Jong-Eun Ha

Journal: Journal of Institute of Control Robotics and Systems Year: 2022 Vol: 28 (10) Pages: 862-867

JOURNAL ARTICLE

Multi-scene ancient chinese text recognition

Kaili Wang, Yaohua Yi, Junjie Liu, Liqiong Lu, Ying Song

Journal: Neurocomputing Year: 2019 Vol: 377 Pages: 64-72