JOURNAL ARTICLE

Rectification of Camera-Captured Document Images with Mixed Contents and Varied Layouts

Abstract

This paper addresses the rectification of camera-captured document images with mixed contents and varied layouts. Document images acquired with cameras, including smartphones, typically suffer from perspective, geometric, and/or rotational distortion that hinders document analysis. We propose an approach to rectifying both text and non-text regions that handles the perspective, geometric, and rotational distortions present in planar and curled documents, extending a state-of-the-art content-based rectification method. We define surface projections via a three-tiered local transformation model, in which primary curved surface projections are formed from individual text regions, and secondary and tertiary surface projections are formed from non-text regions, resulting in a 'patchwork' combination of surfaces spanning the document image. This transformation model allows us to process document images with varied layouts of mixed contents, including large images and graphics, that also contain some justified text. Experiments and comparisons with a state-of-the-art content-based rectification approach on the public IUPR dataset demonstrate the value of the proposed approach on two levels: 1) significantly improved rectification performance, as measured by standard optical character recognition metrics, along with increased document readability, and 2) a broader range of applicability, i.e., the ability to correct document images with various layouts and content types.
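
The abstract does not name the optical character recognition metrics used in the evaluation. As an illustration only, the sketch below assumes a character-level accuracy derived from the Levenshtein edit distance between a ground-truth transcription and the OCR output, which is one common way to score rectification quality. The function names `edit_distance` and `character_accuracy` are hypothetical and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - (edit distance / reference length), floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


if __name__ == "__main__":
    # Hypothetical strings: OCR output before and after rectification.
    truth = "document image rectification"
    ocr_distorted = "docunent irnage rectifcation"
    ocr_rectified = "document image rectification"
    print(character_accuracy(truth, ocr_distorted))
    print(character_accuracy(truth, ocr_rectified))
```

Under this assumed metric, a rectification method is scored by running the same OCR engine on the distorted and the rectified image and comparing the two accuracies against the ground-truth text.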

Keywords:
Perspective distortion; Rectification; Computer science; Distortion (music); Artificial intelligence; Computer vision; Image rectification; Perspective (graphical); Transformation (genetics); Document layout analysis; Readability; Graphics; Computer graphics (images); Geometric transformation; Information retrieval; Pattern recognition (psychology); Image (mathematics); Physics

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.11
Refs: 30
Citation Normalized Percentile: 0.43

Topics

Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Computer Graphics and Visualization Techniques
Physical Sciences →  Computer Science →  Computer Graphics and Computer-Aided Design