JOURNAL ARTICLE

A dual-branch and dual attention transformer and CNN hybrid network for ultrasound image segmentation

Chong Zhang, Lingtong Wang, Guohui Wei, Zhiyong Kong, Min Qiu

Year: 2024 | Journal: Frontiers in Physiology | Vol: 15 | Article: 1432987 | Publisher: Frontiers Media

Abstract

Introduction: Ultrasound imaging has become a crucial tool in medical diagnostics, offering real-time visualization of internal organs and tissues. However, challenges such as low contrast, high noise levels, and variability in image quality hinder accurate interpretation. To enhance diagnostic accuracy and support treatment decisions, precise segmentation of organs and lesions in ultrasound images is essential. Recently, several deep learning methods, including convolutional neural networks (CNNs) and Transformers, have reached significant milestones in medical image segmentation. Nonetheless, there remains a pressing need for methods capable of seamlessly integrating global context with local fine-grained information, particularly in addressing the unique challenges posed by ultrasound images.

Methods: To address these issues, we propose DDTransUNet, a hybrid network combining Transformer and CNN, with a dual-branch encoder and a dual attention mechanism for ultrasound image segmentation. DDTransUNet adopts a Swin Transformer branch and a CNN branch to extract global context and local fine-grained information, respectively. The dual attention mechanism, comprising Global Spatial Attention (GSA) and Global Channel Attention (GCA) modules, captures long-range visual dependencies. A novel Cross Attention Fusion (CAF) module effectively fuses feature maps from both branches using cross-attention.

Results: Experiments on three ultrasound image datasets demonstrate that DDTransUNet outperforms previous methods. On the TN3K dataset, DDTransUNet achieves IoU, Dice, HD95, and ACC of 73.82%, 82.31%, 16.98 mm, and 96.94%, respectively. On the BUS-BRA dataset, it achieves 80.75%, 88.23%, 8.12 mm, and 98.00%. On the CAMUS dataset, it achieves 82.51%, 90.33%, 2.82 mm, and 96.87%.

Discussion: These results indicate that our method can provide valuable diagnostic assistance to clinical practitioners.
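The Methods paragraph describes fusing the two branches with cross-attention: tokens from one branch act as queries over features from the other. Below is a minimal, framework-free sketch of single-head scaled dot-product cross-attention in plain Python; all names are illustrative and this is not the authors' CAF implementation, which the abstract does not detail.

```python
import math

def matmul(a, b):
    # Naive matrix multiply: (n x k) . (k x m) -> (n x m).
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def softmax(row):
    # Numerically stable softmax over one row of scores.
    mx = max(row)
    e = [math.exp(x - mx) for x in row]
    s = sum(e)
    return [x / s for x in e]

def cross_attention(queries, keys_values, d):
    """Queries from one branch (e.g. CNN tokens, shape nq x d) attend
    over the other branch's tokens (e.g. Swin tokens, shape nk x d)."""
    scores = matmul(queries, transpose(keys_values))       # nq x nk
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]             # rows sum to 1
    return matmul(weights, keys_values)                    # nq x d

# Tiny example: 2 query tokens attend over 3 tokens from the other branch.
Q = [[1.0, 0.0], [0.0, 1.0]]
KV = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused = cross_attention(Q, KV, 2)
```

Each fused token is a convex combination of the other branch's tokens, which is why the output stays in the same feature range as the inputs; a zero query attends uniformly and recovers the mean of the key/value tokens.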

Keywords:
Computer science, Artificial intelligence, Convolutional neural network, Segmentation, Encoder, Deep learning, Pattern recognition, Context, Computer vision

Metrics

Cited By: 13
FWCI (Field Weighted Citation Impact): 10.65
Refs: 84
Citation Normalized Percentile: 0.97 (in top 1% and top 10%)

Topics

Radiomics and Machine Learning in Medical Imaging
Health Sciences →  Medicine →  Radiology, Nuclear Medicine and Imaging
AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition