Pyramid Vision Transformer-based COVID-19 Detection Using Self-supervised Learning and Pretraining on Chest X-ray Images

Eshika Jain; Pratham Kaushik; Vinay Kukreja; Vandana Ahuja; Ayush Dogra; Ankit Bansal

doi:10.2174/0126667975389675250429173512

JOURNAL ARTICLE

Year: 2025 Journal: Coronaviruses Vol: 06

Abstract

Introduction/Background: COVID-19 remains a public health emergency, necessitating rapid and accurate diagnostic techniques. Chest X-ray imaging is a low-cost, widely used technique for detecting COVID-19, but its interpretation by humans is laborious and error-prone. In this study, we propose automated detection of COVID-19 on chest X-rays using a Pyramid Vision Transformer (PVT) model with self-supervised learning (SSL), pretraining, and attention map visualization. The proposed method has the potential to be more accurate, interpretable, and efficient, and thus clinically suitable.

Materials and Methods: The COVID-19 Chest X-Ray Database on Kaggle, which comprises 36,116 images classified as normal, viral pneumonia, and COVID-19, was employed. Extensive preprocessing, including resizing, normalization, and data augmentation, was carried out to improve model generalization. The PVT model was pretrained and fine-tuned on the dataset using SSL, dropout regularization, and attention mechanisms. The primary evaluation metrics were the Lung Severity Score (LSS), segmentation accuracy, Severity Detection Precision (SDP), Opacity Detection Sensitivity (ODS), Time-to-Severity Detection (TSD), and focal AUC-ROC score.

Results: Fine-tuning the PVT model significantly improved performance across multiple metrics. LSS increased from 15% (pretrained) to 17% (fine-tuned), while segmentation accuracy improved from 88% to 91%. Dropout regularization slightly reduced LSS to 16% but enhanced SDP (80% to 90%) and ODS (78% to 85%). TSD decreased from 4.5 s (pretrained) to 3.8 s (fine-tuned), improving detection speed. The focal AUC-ROC score improved from 0.92 to 0.95 with fine-tuning and dropout, while the Misclassification Visualization Score (MVS) increased from 0.85 to 0.91, reducing misclassification rates. Data augmentation further enhanced accuracy (88% to 94%), precision (85% to 91%), and recall (83% to 90%).

Discussion: This study demonstrates the effectiveness of SSL pretraining, dropout regularization, and data augmentation in improving COVID-19 detection performance. The significant improvements in precision, recall, and robustness highlight the model's potential for clinical deployment. Attention map visualizations further enhance trust and interpretability by illustrating the key lung regions that the model focuses on, making its decisions more transparent.

Conclusion: The PVT-based model, integrated with SSL, fine-tuning, and attention mechanisms, provides a robust, interpretable, and efficient solution for COVID-19 detection from chest X-ray images. The results validate its potential for real-world clinical use, offering improved diagnostic accuracy, reduced misclassification, and enhanced detection speed.
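For readers unfamiliar with the accuracy, precision, and recall figures quoted in the Results, the following is a minimal sketch of how such metrics are commonly derived for a three-class problem (normal, viral pneumonia, COVID-19). It is not the authors' evaluation code: the abstract does not state how the metrics were averaged across classes, and the label names, function names, and toy predictions below are all assumptions made for illustration.

```python
from collections import Counter

# Label names are assumptions; the abstract only names the three categories.
CLASSES = ("normal", "viral_pneumonia", "covid19")


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """One-vs-rest precision and recall for each class."""
    counts = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    metrics = {}
    for c in classes:
        tp = counts[(c, c)]
        fp = sum(counts[(t, c)] for t in classes if t != c)
        fn = sum(counts[(c, p)] for p in classes if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics[c] = {"precision": precision, "recall": recall}
    return metrics


# Toy labels invented for this example; they are NOT data from the study.
y_true = ["normal", "normal", "viral_pneumonia", "covid19", "covid19", "covid19"]
y_pred = ["normal", "covid19", "viral_pneumonia", "covid19", "covid19", "normal"]

if __name__ == "__main__":
    print("accuracy:", round(accuracy(y_true, y_pred), 3))
    for name, m in per_class_metrics(y_true, y_pred).items():
        print(name, {k: round(v, 3) for k, v in m.items()})
```

Per-class values such as these are often averaged (macro or weighted) into a single precision or recall figure; which averaging applies to the reported numbers is not specified in the abstract.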

Keywords:

Coronavirus disease 2019 (COVID-19); Artificial intelligence; Pyramid (geometry); Computer vision; Transformer; Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); Computer science; 2019-20 coronavirus outbreak; Medicine; Engineering; Virology; Optics; Physics; Infectious disease (medical specialty); Internal medicine; Electrical engineering
