JOURNAL ARTICLE

UNETR++ with Voxel-Focused Attention: Efficient 3D Medical Image Segmentation with Linear-Complexity Transformers

Sithembiso Ntanzi, Serestina Viriri

Year: 2025   Journal: Applied Sciences   Vol: 15 (20)   Pages: 11034-11034   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Models for segmenting 3D medical images have advanced significantly, with many promising results attributed to the incorporation of Vision Transformers (ViTs). However, self-attention, the fundamental mechanism of transformers, has quadratic complexity in the input sequence length, which significantly increases computational requirements, especially for 3D medical images. In this paper, we investigate the UNETR++ model and propose a voxel-focused attention mechanism inspired by the pixel-focused attention of TransNeXt. The core component of UNETR++ is the Efficient Paired Attention (EPA) block, which learns from two interdependent branches: spatial attention and channel attention. The deficiency of UNETR++ lies in its reliance on dimensionality reduction for spatial attention: projecting the keys and values into lower dimensions improves efficiency but risks information loss. Our contribution is to replace this projection with a voxel-focused attention design that has linear complexity with respect to input sequence length, thereby reducing the parameter count while preserving representational power and maintaining competitive performance and inference speed. On the Synapse dataset, the enhanced UNETR++ model contains 21.42 M parameters, a reduction of about 50% from the original 42.96 M, while achieving a competitive Dice score of 86.72%.
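To make the complexity argument concrete, the following is a minimal sketch that counts query-key interactions for full self-attention versus a fixed-neighbourhood attention over a 3D volume, and checks the reported parameter reduction. It is not the authors' implementation: the function names and the cubic `window` neighbourhood are illustrative assumptions, and the abstract does not specify the exact attention pattern used by voxel-focused attention.

```python
# Illustrative sketch only: the cubic `window` neighbourhood is an assumption,
# not the published voxel-focused attention design.

def full_attention_pairs(depth: int, height: int, width: int) -> int:
    """Query-key pairs when every voxel attends to every voxel: N * N."""
    n = depth * height * width
    return n * n


def windowed_attention_pairs(depth: int, height: int, width: int,
                             window: int = 3) -> int:
    """Query-key pairs when every voxel attends to a fixed window**3
    neighbourhood (hypothetical), ignoring border effects: N * window**3."""
    n = depth * height * width
    return n * window ** 3


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


if __name__ == "__main__":
    # Doubling each side of the volume multiplies N by 8.
    for side in (8, 16, 32):
        full = full_attention_pairs(side, side, side)
        local = windowed_attention_pairs(side, side, side)
        print(f"side={side:3d}  full={full:>13,d}  windowed={local:>9,d}")

    # Parameter counts (in millions) as reported in the abstract.
    print(f"parameter reduction: {relative_reduction(42.96, 21.42):.1%}")
```

Under these assumptions, doubling each side of the volume multiplies the full-attention cost by 64 but the windowed cost by only 8, which is the practical difference between quadratic and linear growth in sequence length.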

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.43

Topics

Medical Image Segmentation Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition