Entropy-Based Selection of Cluster Representatives for Document Image Compression

Luis F. Muñoz-Pérez; José Antonio Esquivel Guerrero; Jorge E. Macías-Díaz

doi:10.1137/19m1243312

JOURNAL ARTICLE

Year: 2019
Journal: SIAM Journal on Imaging Sciences
Vol: 12 (4)
Pages: 1720-1738
Publisher: Society for Industrial and Applied Mathematics

Abstract

In this work, we introduce an efficient method for lossy compression of digitized documents. The method uses a dictionary of class representatives defined by a minimum entropy criterion. The algorithm first identifies the different symbols contained in a document image and then groups them into classes by means of a hierarchical clustering algorithm. For each class, a representative is selected using the principle of minimum entropy and suitable similarity distances. The technique creates a file in which every object belonging to a class is replaced by its class representative. Finally, the resulting file is compressed. The performance of the proposed algorithm is assessed on digitized files from a standard database for document compression at different resolutions, and it is compared against other state-of-the-art algorithms. The results show quantitatively that the proposed method is more efficient than the algorithms it is compared against.
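The pipeline summarized in the abstract (symbols, hierarchical clustering, minimum-entropy representative, substitution, final compression) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the entropy measure, the similarity distance, or the final coder. The sketch assumes binary glyphs of equal size stored as flat tuples, Hamming distance, average-linkage agglomerative clustering with a hypothetical `threshold` parameter, the binary entropy of the XOR residual as a stand-in "minimum entropy" score, and `zlib` as the final compressor. All function names are illustrative.

```python
import math
import zlib


def hamming(a, b):
    """Number of differing pixels between two equal-length binary glyphs."""
    return sum(x != y for x, y in zip(a, b))


def binary_entropy(p):
    """Shannon entropy (bits) of a Bernoulli(p) source."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def residual_entropy(candidate, members):
    """Assumed criterion: mean per-pixel entropy of the XOR residuals
    between a candidate representative and every member of its class."""
    total = 0.0
    for m in members:
        total += binary_entropy(hamming(candidate, m) / len(candidate))
    return total / len(members)


def cluster(glyphs, threshold):
    """Average-linkage agglomerative clustering on glyph indices.
    Merging stops once the closest pair of clusters is farther apart
    than `threshold` (a hypothetical tuning parameter)."""
    clusters = [[i] for i in range(len(glyphs))]

    def dist(c1, c2):
        pairs = len(c1) * len(c2)
        return sum(hamming(glyphs[i], glyphs[j]) for i in c1 for j in c2) / pairs

    while len(clusters) > 1:
        d, ia, ib = min(
            (dist(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


def select_representative(glyphs, indices):
    """Index of the class member with the lowest residual entropy."""
    members = [glyphs[j] for j in indices]
    return min(indices, key=lambda i: residual_entropy(glyphs[i], members))


def compress(glyphs, threshold):
    """Replace each glyph by its class representative, then compress.
    Returns (dictionary, labels, compressed_bytes). Labels are packed
    one byte each, so this sketch supports at most 256 classes."""
    dictionary = []
    labels = [0] * len(glyphs)
    for k, indices in enumerate(cluster(glyphs, threshold)):
        dictionary.append(glyphs[select_representative(glyphs, indices)])
        for i in indices:
            labels[i] = k
    payload = bytes(px for g in dictionary for px in g) + bytes(labels)
    return dictionary, labels, zlib.compress(payload)
```

Calling `compress(glyphs, threshold)` on a list of equal-size binary glyphs returns the representative dictionary, one class label per glyph, and the compressed payload. The scheme is lossy because each symbol is reconstructed from its class representative rather than from its own pixels.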

Keywords:

Lossy compression; Computer science; Entropy (arrow of time); Cluster analysis; Image compression; Lossless compression; Data compression; Pattern recognition (psychology); Data mining; Algorithm; Artificial intelligence; Image (mathematics); Image processing

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.12

Is in top 1%

Is in top 10%

Topics

Advanced Data Compression Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Entropy-based pattern matching for document image compression

Cluster-Based Sample Selection for Document Image Binarization

Entropy Based Cluster Selection

Image Analogy Based Document Image Compression

Tiny Entropy Based Image Compression