JOURNAL ARTICLE

Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution

Yimin Cai, Yuqing Long, Zhenggong Han, Mingkun Liu, Yuchen Zheng, Wei Yang, Liming Chen

Year: 2023 Journal: BMC Medical Informatics and Decision Making Vol: 23 (1) Pages: 33-33 Publisher: BioMed Central

Abstract

Background: Semantic segmentation of brain tumors plays a critical role in clinical treatment, especially for three-dimensional (3D) magnetic resonance imaging, which is often used in clinical practice. Automatic segmentation of the 3D structure of brain tumors can quickly help physicians understand tumor properties such as shape and size, improving the efficiency of preoperative planning and the odds of successful surgery. In past decades, 3D convolutional neural networks (CNNs) have dominated automatic segmentation methods for 3D medical images, and these network structures have achieved good results. However, to limit the number of network parameters, practitioners generally keep the convolutional kernels in 3D convolutions no larger than 7 × 7 × 7, which restricts the ability of CNNs to learn long-distance dependencies. The Vision Transformer (ViT) is very good at learning long-distance dependencies in images, but it has a large number of parameters. Worse, when training data are insufficient, ViT cannot learn local dependency information in its earlier layers. In image segmentation, being able to learn this local dependency information in the earlier layers has a large impact on model performance.

Methods: This paper proposes the Swin Unet3D model, which represents voxel segmentation on medical images as a sequence-to-sequence prediction. The feature extraction sub-module in the model is designed as a parallel structure of Convolution and ViT so that all layers of the model can adequately learn both global and local dependency information in the image.

Results: On the validation dataset of BraTS 2021, the proposed model achieves Dice coefficients of 0.840, 0.874, and 0.911 on the ET, TC, and WT channels, respectively. On the validation dataset of BraTS 2018, the model achieves Dice coefficients of 0.716, 0.761, and 0.874 on the corresponding channels.

Conclusion: We propose a new segmentation model that combines the advantages of Vision Transformer and Convolution and achieves a better balance between the number of model parameters and segmentation accuracy. The code can be found at https://github.com/1152545264/SwinUnet3D.
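For readers unfamiliar with the metric reported in the Results, the following is a minimal sketch of the Dice coefficient for one binary channel, written in plain Python. It is an illustration only and is not taken from the authors' repository; the function name `dice_coefficient`, the flattened toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made here.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P and T| / (|P| + |T|) for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 voxel labels
    for a single channel (hypothetical inputs for illustration).
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy 2 x 2 x 2 volume, flattened to 8 voxels.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
score = dice_coefficient(pred, target)
print(round(score, 3))
```

In a multi-channel setting such as the ET, TC, and WT channels named in the abstract, a function like this would be applied to each channel's mask separately, giving one score per channel.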

Keywords:
Segmentation, Computer science, Convolutional neural network, Artificial intelligence, Deep learning, Image segmentation, Scale-space segmentation, Pattern recognition (psychology), Segmentation-based object categorization, Feature extraction, Medical imaging, Computer vision

Metrics

Cited By: 83
FWCI (Field Weighted Citation Impact): 15.10
Refs: 32
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Brain Tumor Detection and Classification
Life Sciences →  Neuroscience →  Neurology
Medical Image Segmentation Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Medical image segmentation by combining feature enhancement Swin Transformer and UperNet

Zhang Li, Xiaochun Yin, Xuqi Liu, Zengguang Liu

Journal: Scientific Reports Year: 2025 Vol: 15 (1) Pages: 14565-14565

JOURNAL ARTICLE

SwinE-UNet3+: swin transformer encoder network for medical image segmentation

Ping Zou, Jian-Sheng Wu

Journal: Progress in Artificial Intelligence Year: 2023 Vol: 12 (1) Pages: 99-105

JOURNAL ARTICLE

Swin Transformer Assisted Prior Attention Network for Medical Image Segmentation

Zhihao Liao, Neng Fan, Kai Xu

Journal: Applied Sciences Year: 2022 Vol: 12 (9) Pages: 4735-4735

JOURNAL ARTICLE

SSTrans-Net: Smart Swin Transformer Network for medical image segmentation

Liyao Fu, Yunzhu Chen, Wei Ji, Yang Feng

Journal: Biomedical Signal Processing and Control Year: 2024 Vol: 91 Pages: 106071-106071