JOURNAL ARTICLE

2D medical image synthesis using transformer-based denoising diffusion probabilistic model

Abstract

Objective. Artificial intelligence (AI) methods have gained popularity in medical imaging research. The training image datasets needed for successful AI model deployment are not always available at the desired size and scope. In this paper, we introduce a medical image synthesis framework aimed at addressing the challenge of limited training datasets for AI models.

Approach. The proposed 2D image synthesis framework is based on a diffusion model using a Swin-transformer-based network. The model consists of a forward Gaussian noise process and a reverse process in which the transformer-based diffusion model performs denoising. The training data comprise four image datasets: chest x-rays, heart MRI, pelvic CT, and abdomen CT. We evaluated the authenticity, quality, and diversity of the synthetic images using visual Turing assessments conducted by three medical physicists and four quantitative evaluations: the Inception score (IS), Fréchet Inception Distance score (FID), feature similarity (FDS), and diversity score (DS, indicating diversity similarity) between the synthetic and true images. To assess the framework's value for training AI models, we conducted COVID-19 classification tasks using real images, synthetic images, and mixtures of both.

Main results. Visual Turing assessments showed an average accuracy of 0.64 (accuracy converging to 50% indicates a more realistic visual appearance of the synthetic images), sensitivity of 0.79, and specificity of 0.50. Average quantitative scores across all datasets were IS = 2.28, FID = 37.27, FDS = 0.20, and DS = 0.86. For the COVID-19 classification task, the baseline network obtained an accuracy of 0.88 using a purely real dataset, 0.89 using a purely synthetic dataset, and 0.93 using a dataset mixing real and synthetic data.

Significance. An image synthesis framework was demonstrated that can generate high-quality medical images across different imaging modalities, with the purpose of supplementing existing training sets for AI model deployment. The method has potential applications in many areas of data-driven medical imaging research.
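The forward Gaussian noise process mentioned in the abstract can be illustrated with the standard DDPM closed form, in which a noisy sample at step t is drawn directly from the clean image: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The sketch below is a minimal illustration, not the authors' implementation. The linear variance schedule, the 1000 steps, the 64 x 64 image size, and all function names are assumptions chosen for the example. The Swin-transformer denoiser itself is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear variance schedule (assumed, not taken from the paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """Return alpha_bar[t], the running product of (1 - beta_s) for s <= t."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I) in one shot."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * pixel + sigma * noise for pixel, noise in zip(x0, eps)]
    # eps is returned because a DDPM denoiser is commonly trained to predict it.
    return x_t, eps


rng = random.Random(0)
alpha_bar = cumulative_alpha(linear_beta_schedule())

# Stand-in for a flattened 64 x 64 image normalized to [-1, 1] (synthetic values).
x0 = [rng.uniform(-1.0, 1.0) for _ in range(64 * 64)]

x_early, _ = forward_noise(x0, 10, alpha_bar, rng)          # still close to x0
x_late, eps_late = forward_noise(x0, 999, alpha_bar, rng)   # almost pure noise
```

Under this assumed schedule, an early step leaves the image nearly intact while the final step is almost indistinguishable from the sampled Gaussian noise. The reverse process then has to recover the image from that noise, which is the role the abstract assigns to the transformer-based network.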

Keywords:
Computer science; Artificial intelligence; Pattern recognition (psychology); Leverage (statistics); Noise reduction; Medical imaging; Machine learning; Data mining

Metrics

Cited By: 124
FWCI (Field Weighted Citation Impact): 38.32
Refs: 59
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

COVID-19 diagnosis using AI
Health Sciences →  Medicine →  Radiology, Nuclear Medicine and Imaging
Radiomics and Machine Learning in Medical Imaging
Health Sciences →  Medicine →  Radiology, Nuclear Medicine and Imaging
AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence