DBCAN: Dual-Branch Cross-Attention Network for Scene Text Recognition

Xinjian Gao; Ye Pang; Yuyu Liu; Jun Yu; Maokun Han; Kai Hou; Wei Wang

doi:10.1109/icme52920.2022.9859826

Year: 2022
Journal: 2022 IEEE International Conference on Multimedia and Expo (ICME)
Pages: 1-6

Abstract

Scene text recognition, especially irregular text recognition, is a challenging task due to the large variance in text appearance. Although some existing methods have achieved state-of-the-art performance with the attention-based encoder-decoder framework, they perform poorly on challenging text such as severely curved, blurred, and semantically incomplete text. To address these issues, we propose a Dual-Branch Cross-Attention Network (DBCAN). Unlike previous methods, which rely heavily on semantic information, DBCAN enhances positional cues and learns semantic relations in two separate branches, then fuses them with a tailored Cross-Attention Module (CAM). Furthermore, a Convolution-Based 2D Positional Embedding (CBPE) is introduced to describe the 2D spatial dependencies of characters. Extensive experiments demonstrate that DBCAN is more accurate and robust than previous methods and achieves state-of-the-art performance on several benchmarks, particularly CUTE (93.4%). Our code is publicly available at https://github.com/GaoXinJian-USTC/DBCAN.
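
The abstract does not spell out how the Cross-Attention Module combines the two branches, so the following is only a minimal sketch of generic scaled dot-product cross-attention, not the authors' CAM. It assumes, purely for illustration, that features from a position branch act as queries and features from a semantic branch act as keys and values; the function names and toy feature vectors are hypothetical. The actual implementation is in the repository linked above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (not the paper's CAM).

    queries: n vectors of dimension d from one branch.
    keys, values: m vectors each from the other branch.
    Returns n fused vectors, each a weighted average of `values`.
    """
    d = len(queries[0])
    dim_v = len(values[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)])
    return fused


# Hypothetical toy features: 2 position-branch queries, 3 semantic-branch tokens.
position_feats = [[1.0, 0.0], [0.0, 1.0]]
semantic_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(position_feats, semantic_feats, semantic_feats)
print(fused)
```

Each output row is a convex combination of the semantic-branch vectors, weighted by how strongly the corresponding position-branch query matches each key. Which branch supplies the queries in DBCAN is an assumption here, not something the abstract states.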

Keywords:

Computer science; Encoder; Artificial intelligence; Embedding; Dual (grammatical number); Task (project management); Convolution (computer science); Fuse (electrical); Code (set theory); Pattern recognition (psychology); Natural language processing; Artificial neural network

Topics

Handwritten Text Recognition Techniques
Image Processing and 3D Reconstruction
Image Retrieval and Classification Techniques

All three topics are classified under Physical Sciences → Computer Science → Computer Vision and Pattern Recognition.
