JOURNAL ARTICLE

Better Skeleton Better Readability: Scene Text Image Super-Resolution via Skeleton-Aware Diffusion Model

Shrey Singh, Prateek Keserwani, Partha Pratim Roy, Rajkumar Saini

Year: 2024 Journal: IEEE Access Vol: 12 Pages: 187640-187651 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Scene text image super-resolution (STISR) aims to enhance the resolution of text images while simultaneously improving their readability by reducing noise, blur, and other degradations. Existing diffusion-based approaches for STISR primarily rely on text-prior information but often overlook the importance of explicitly modeling the visual structure of the text. In this paper, we propose a novel Skeleton-Aware Diffusion Method (SADM) for STISR, which introduces text skeletons as structural guidance to the diffusion process. The text skeleton serves as a critical visual cue, helping the model to better restore the fine details of text, even in severely degraded low-resolution images. Generating high-quality skeletons from low-resolution scene text is a challenging task due to the inherent blurring and noise present in such images. To tackle this, we introduce a diffusion-based Skeleton Correction Network (SCN), which refines the initial skeletons produced by a convolutional neural network-based skeletonization model. The SCN effectively improves the accuracy of the skeletons, allowing for more precise structural guidance during the diffusion process. Our extensive experiments demonstrate the significant benefits of incorporating skeleton information into the STISR pipeline. The proposed SADM achieves state-of-the-art performance on the TextZoom dataset, with accuracies of 81.4%, 64.9%, and 49.6% on the easy, medium, and hard subsets, respectively, when evaluated with the ASTER text recognizer, improving on the previous best results. Through detailed analysis, we also show that improving the quality of skeletons from low-resolution images leads to better super-resolution outcomes and enhances the performance of text recognizers.
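The abstract does not specify how a text skeleton is represented beyond saying that it comes from a CNN-based skeletonization model and is refined by the SCN. As a rough illustration of the concept only, the sketch below thins a binary glyph-like shape to a thin centerline using the classical Zhang-Suen thinning algorithm. This is a stand-in technique, not SADM's skeletonization or correction network; the function name `zhang_suen_thin` and the toy `bar` input are assumptions made for this example.

```python
def zhang_suen_thin(image):
    """Thin a binary image (list of rows of 0/1) with Zhang-Suen thinning.

    Illustrative stand-in only: the paper uses a learned CNN skeletonizer
    plus a diffusion-based correction network, not this classical method.
    """
    img = [list(row) for row in image]
    rows, cols = len(img), len(img[0])
    # Offsets for neighbours P2..P9, clockwise from the pixel directly above.
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1),
               (1, 0), (1, -1), (0, -1), (-1, -1)]

    def neighbours(r, c):
        # Pixels outside the image are treated as background (0).
        return [img[r + dr][c + dc]
                if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
                for dr, dc in offsets]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(rows):
                for c in range(cols):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of foreground neighbours
                    # Number of 0 -> 1 transitions around the neighbourhood.
                    a = sum(1 for i in range(8)
                            if p[i] == 0 and p[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Pixels are removed together at the end of each sub-step.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# Toy input: a 3-pixel-thick horizontal stroke inside a 5x9 canvas.
bar = [[1 if 1 <= r <= 3 and 1 <= c <= 7 else 0 for c in range(9)]
       for r in range(5)]
skeleton = zhang_suen_thin(bar)
for row in skeleton:
    print("".join("#" if v else "." for v in row))
```

The thick stroke collapses to a one-pixel-wide line along its middle row, which is the kind of structural cue the abstract describes. On blurred, noisy low-resolution text a fixed rule like this would be unreliable, which is consistent with the abstract's motivation for a learned skeletonizer followed by a correction stage.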

Keywords:
Skeleton (computer programming); Readability; Computer science; Computer vision; Artificial intelligence; Image (mathematics); Resolution (logic); Image resolution; Computer graphics (images); Pattern recognition (psychology)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 1.23
Refs: 42
Citation Normalized Percentile: 0.77

Topics

Image Processing Techniques and Applications
Physical Sciences → Engineering → Media Technology

Related Documents

JOURNAL ARTICLE

Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Jingye Chen, Haiyang Yu, Jianqi Ma, Bin Li, Xiangyang Xue

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2022 Vol: 36 (1) Pages: 285-293

JOURNAL ARTICLE

T‐Skeleton: Accurate scene text detection via instance‐aware skeleton embedding

Haiyan Li, Xingfei Hu, Hongtao Lu

Journal: IET Image Processing Year: 2024 Vol: 18 (6) Pages: 1491-1503