JOURNAL ARTICLE

DSNet: A End‐to‐End Scene Text Spotting Network With Dual‐Stream Feature Fusion

Mengjie Zhong, Xihan Wang, Lian-He Shao, Quanli Gao

Year: 2025 Journal: Electronics Letters Vol: 61 (1) Publisher: Institution of Engineering and Technology

Abstract

End‐to‐end scene text spotting has attracted considerable academic interest in recent years. However, due to complex environmental factors, text recognition remains a formidable challenge. In this paper, we introduce an end‐to‐end scene text spotting framework, referred to as DSNet. This framework comprises two principal modules: the text feature enhancement module (TFEM) for enhancing text regions and the redundant feature suppression module (RFSM) for noise suppression. Within the TFEM, we have designed multiple transformer layers for feature encoding; these layers are utilized to extract and enhance the feature representation of the text region. In the RFSM, we have designed a spatial reconstruction unit (SRU) and a channel reconstruction unit (CRU); these units effectively suppress irrelevant information through the feature reconstruction process. The proposed framework jointly optimizes text features by operating the TFEM and RFSM in parallel. The fused features from both modules are subsequently input to the decoder, enabling precise text area localization and robust character recognition. Extensive experiments demonstrate that our model achieves competitive performance in end‐to‐end scene text spotting, attaining an F‐measure of 90.2% on ICDAR2015, closely approaching the state‐of‐the‐art (91.0%).
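
The parallel data flow described in the abstract (one input, two streams, fused output passed on to a decoder) can be sketched roughly as follows. This is only an illustration of the topology, not the authors' implementation: `enhance_stub` and `suppress_stub` are hypothetical placeholders standing in for the TFEM (transformer layers) and the RFSM (SRU and CRU), and element-wise addition in `fuse_add` is an assumption, since the abstract does not state which fusion operator is used.

```python
from typing import Callable, List

FeatureMap = List[List[float]]  # toy layout: [channel][position]
Branch = Callable[[FeatureMap], FeatureMap]


def enhance_stub(features: FeatureMap) -> FeatureMap:
    """Hypothetical stand-in for the TFEM stream.

    The real module uses transformer layers; this stub only rescales each
    channel by its peak magnitude to mark where enhancement would happen.
    """
    out = []
    for channel in features:
        peak = max(abs(v) for v in channel) or 1.0  # avoid division by zero
        out.append([v / peak for v in channel])
    return out


def suppress_stub(features: FeatureMap, threshold: float = 0.5) -> FeatureMap:
    """Hypothetical stand-in for the RFSM stream.

    The real module uses spatial and channel reconstruction units; this stub
    simply zeroes channels whose variance falls below an assumed threshold.
    """
    out = []
    for channel in features:
        mean = sum(channel) / len(channel)
        variance = sum((v - mean) ** 2 for v in channel) / len(channel)
        keep = 1.0 if variance >= threshold else 0.0
        out.append([v * keep for v in channel])
    return out


def fuse_add(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two feature maps (assumed fusion operator)."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def dual_stream(
    features: FeatureMap,
    enhance: Branch = enhance_stub,
    suppress: Branch = suppress_stub,
    fuse: Callable[[FeatureMap, FeatureMap], FeatureMap] = fuse_add,
) -> FeatureMap:
    """Run both streams on the same input and fuse their outputs."""
    enhanced = enhance(features)      # stream 1: enhancement
    suppressed = suppress(features)   # stream 2: redundancy suppression
    return fuse(enhanced, suppressed)  # fused features would feed a decoder


if __name__ == "__main__":
    toy = [[0.0, 2.0, 4.0], [1.0, 1.0, 1.0]]
    print(dual_stream(toy))
```

Both streams receive the same input rather than being chained, which is the property the abstract emphasizes; the branches and the fusion function are passed as arguments so that any of the three placeholders can be swapped out independently.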

Keywords:
Spotting, Computer science, Artificial intelligence, Feature (linguistics), Keyword spotting, Pattern recognition (psychology), End-to-end principle, Feature extraction

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 22
Citation Normalized Percentile: 0.19

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Vehicle License Plate Recognition
Physical Sciences →  Engineering →  Media Technology

Related Documents

JOURNAL ARTICLE

Feature Fusion Pyramid Network for End-to-End Scene Text Detection

Yirui Wu, Lilai Zhang, Hao Li, Yunfei Zhang, Shaohua Wan

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing Year: 2023 Vol: 23 (11) Pages: 1-16

JOURNAL ARTICLE

Scene text spotting based on end-to-end

Guangcun Wei, Wansheng Rong, Yongquan Liang, Xinguang Xiao, Xiang Liu

Journal: Journal of Intelligent & Fuzzy Systems Year: 2021 Vol: 40 (5) Pages: 8871-8881