Abstract

Capturing rich multi-scale features is essential for handling the complex variations encountered in medical image segmentation. In this paper, we explore how to combine the complementary strengths of convolutional neural networks (CNNs) and Transformers, and propose a novel multi-stage aggregation architecture, MA-Transformer, for accurate segmentation of medical images with large variations and blur. Specifically, each stage contains an encoder module with a dual-branch structure that combines Transformers and convolutions in parallel. With this design, self-attention provides global context to the CNN branch, which extracts complementary multi-resolution features stage by stage, so the feature representations are gradually enriched with both local details and contextual information. The multi-scale semantic features are then combined through skip connections in the decoder to produce the final result. Extensive experiments on public medical imaging datasets demonstrate that our method outperforms state-of-the-art CNN-based, Transformer-based, and combined CNN-Transformer approaches. Code will be made publicly available.
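To make the dual-branch, multi-stage idea concrete, the following is a minimal toy sketch in pure Python, not the authors' implementation. It works on a 1D sequence of feature vectors rather than images. Everything in it is an assumption made for illustration: the function names, the identity query/key/value projections, the fixed smoothing kernel, fusion of the two branches by addition, and downsampling by pair averaging. The abstract does not specify any of these details.

```python
import math


def self_attention(tokens):
    """Global branch (hypothetical): single-head attention, identity Q/K/V."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product scores of this query against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[c] for w, v in zip(weights, tokens))
                    for c in range(dim)])
    return out


def local_conv(tokens, kernel=(0.25, 0.5, 0.25)):
    """Local branch (hypothetical): per-channel 1D convolution, zero padding."""
    n, dim = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j, w in enumerate(kernel):
            src = i + j - half
            if 0 <= src < n:
                for c in range(dim):
                    row[c] += w * tokens[src][c]
        out.append(row)
    return out


def dual_branch_stage(tokens):
    """Apply both branches to the same input and fuse by addition (assumed)."""
    global_feats = self_attention(tokens)
    local_feats = local_conv(tokens)
    return [[g + l for g, l in zip(g_row, l_row)]
            for g_row, l_row in zip(global_feats, local_feats)]


def downsample(tokens):
    """Halve the resolution by averaging adjacent token pairs (assumed)."""
    return [[(a + b) / 2 for a, b in zip(tokens[i], tokens[i + 1])]
            for i in range(0, len(tokens) - 1, 2)]


def encode(tokens, num_stages=3):
    """Collect one feature map per stage, as a decoder with skips would need."""
    stage_feats = []
    for _ in range(num_stages):
        tokens = dual_branch_stage(tokens)
        stage_feats.append(tokens)
        tokens = downsample(tokens)
    return stage_feats


if __name__ == "__main__":
    inputs = [[float(i), 1.0] for i in range(8)]
    for level, feats in enumerate(encode(inputs)):
        print(f"stage {level}: {len(feats)} tokens")
```

The per-stage outputs returned by `encode` stand in for the multi-resolution features that a decoder would merge through skip connections. A real model would use learned projections, 2D convolutions, and a learned fusion step in place of these fixed operations.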

Keywords:
Computer science; Segmentation; Encoder; Transformer; Convolutional neural network; Artificial intelligence; Pattern recognition (psychology); Image segmentation; Computer vision; Voltage; Engineering

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 1.02
Refs: 41
Citation Normalized Percentile: 0.75

Topics

AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Medical Image Segmentation Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

MINTFormer: Multi-Scale Information Aggregation with CSWin Vision Transformer for Medical Image Segmentation

Chao Deng, Xiao Qin

Journal: Applied Sciences, Year: 2025, Vol: 15 (15), Pages: 8626-8626

JOURNAL ARTICLE

Multi-axis vision transformer for medical image segmentation

Abdul Rehman Khan, Asifullah Khan

Journal: Engineering Applications of Artificial Intelligence, Year: 2025, Vol: 158, Pages: 111251-111251

JOURNAL ARTICLE

SEAformer: Selective Edge Aggregation transformer for 2D medical image segmentation

Jingwen Li, Ji‐Long Chen, Lei Jiang, Ruoyu Li, Pei‐Lun Han, Junlong Cheng

Journal: Biomedical Signal Processing and Control, Year: 2024, Vol: 102, Pages: 107203-107203