JOURNAL ARTICLE

Fast Data Augmentation for Scene Text Recognition Using CUDA

Abstract

Scene Text Recognition (STR) is a computer vision task that reads text in natural scene images. STR currently suffers from data distribution shift because large real datasets for training are lacking. Data augmentation has been used in multiple studies to address this issue, but performing augmentation introduces computational overhead during training. In this paper, we propose FastSTRAug, a CUDA-based library of 36 augmentation functions specifically designed for STR. Evaluated across varying image sizes, FastSTRAug is significantly faster than its serial counterpart in most functions, reaching up to 380x speedup on larger images.
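To illustrate why a per-pixel augmentation can benefit from a GPU, the following is a minimal sketch, not code from FastSTRAug. It assumes a simple color-inversion augmentation on an 8-bit interleaved image; the names invertKernel, invertSerial, and invertCuda are hypothetical. The serial version visits each byte in turn, while the CUDA version assigns one thread per byte.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (illustration only): each thread inverts one byte.
__global__ void invertKernel(const unsigned char* in, unsigned char* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = 255 - in[i];                // guard the last partial block
}

// Serial reference: the same operation as a single CPU loop.
void invertSerial(const unsigned char* in, unsigned char* out, int n) {
    for (int i = 0; i < n; ++i) out[i] = 255 - in[i];
}

// Copies the image to the device, runs the kernel, and copies the result back.
// Returns false if any CUDA call fails (for example, when no GPU is present).
bool invertCuda(const std::vector<unsigned char>& in, std::vector<unsigned char>& out) {
    int n = static_cast<int>(in.size());
    out.resize(n);
    if (n == 0) return true;
    unsigned char* dIn = nullptr;
    unsigned char* dOut = nullptr;
    if (cudaMalloc(&dIn, n) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, n) != cudaSuccess) { cudaFree(dIn); return false; }
    bool ok = cudaMemcpy(dIn, in.data(), n, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;   // round up to cover all bytes
        invertKernel<<<blocks, threads>>>(dIn, dOut, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(out.data(), dOut, n, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dIn);
    cudaFree(dOut);
    return ok;
}

int main() {
    // Assumed image shape for the example: 128x32 RGB, a typical word crop.
    const int width = 128, height = 32, channels = 3;
    std::vector<unsigned char> img(width * height * channels);
    for (size_t i = 0; i < img.size(); ++i) img[i] = static_cast<unsigned char>(i % 256);

    std::vector<unsigned char> ref(img.size()), gpu;
    invertSerial(img.data(), ref.data(), static_cast<int>(img.size()));
    if (!invertCuda(img, gpu)) {
        std::printf("CUDA call failed (no usable GPU?)\n");
        return 1;
    }
    std::printf("GPU result matches serial result: %s\n", ref == gpu ? "yes" : "no");
    return 0;
}
```

In a sketch like this, the host-to-device copies are part of the cost, so any gain from the parallel kernel would be expected to grow with image size, which is consistent with the abstract reporting the largest speedups on larger images.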

Keywords:
Computer science; CUDA; Speedup; Task (project management); Overhead (engineering); Artificial intelligence; Training set; Computer vision; Computer graphics (images); Pattern recognition (psychology); Parallel computing; Operating system

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 27
Citation Normalized Percentile: 0.16

Topics

Handwritten Text Recognition Techniques
Advanced Image and Video Retrieval Techniques
Image Processing and 3D Reconstruction
Classification (all topics): Physical Sciences → Computer Science → Computer Vision and Pattern Recognition