Shuffle window transformer DeepLabV3+: a lightweight convolutional neural network and transformer based hybrid semantic segmentation network

Yane Li; Zhichao Chen; Hongxia Qi; Ming Fan; Lihua Li

doi:10.1088/2632-2153/add853


JOURNAL ARTICLE


Year: 2025; Journal: Machine Learning: Science and Technology; Vol: 6 (2); Pages: 025039-025039; Publisher: IOP Publishing



Abstract

Semantic segmentation is a critical task in computer vision. Constructing semantic segmentation models that combine high accuracy with low spatial occupancy and low computational complexity remains a challenge. To address this, this paper proposes a semantic segmentation network based on a hybrid architecture of convolutional neural network and Transformer, named shuffle window transformer DeepLabV3+ (SWT-DeepLabV3+). The network introduces a new module, called the SWT. With a fixed window size, the SWT integrates window attention (WA) and shuffle WA mechanisms to achieve cross-window global context modeling with linear computational complexity. Additionally, the atrous spatial pyramid pooling (ASPP) is enhanced with strip pooling to construct a strip ASPP, which extracts both regular and irregular multi-scale (MS) features. The network also adopts adaptive spatial feature fusion in the shallow layers, where dynamic adjustment of MS feature weights improves the backbone network’s ability to capture shallow discriminative features. Experimental results on three public datasets (PASCAL VOC 2012, Cityscapes, and CamVid) show that SWT-DeepLabV3+ achieves strong segmentation performance with a lower parameter count and computational cost, validating the model’s capability to process images efficiently while maintaining high accuracy.
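The abstract does not give the internals of the SWT module, so the following is only a minimal NumPy sketch of the general idea it describes: attention restricted to fixed-size windows, followed by a token shuffle so that a second round of window attention mixes information across windows. The function names, the transpose-based shuffle, and the identity Q/K/V projections are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def window_partition(x, m):
    """Split an (H, W, C) feature map into non-overlapping m x m windows.

    Returns an array of shape (num_windows, m*m, C).
    """
    h, w, c = x.shape
    assert h % m == 0 and w % m == 0, "H and W must be multiples of m"
    x = x.reshape(h // m, m, w // m, m, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/m, W/m, m, m, C)
    return x.reshape(-1, m * m, c)

def shuffle_windows(windows):
    """Hypothetical shuffle (assumption, in the style of channel shuffle).

    Interleaves tokens so that each new window gathers tokens that came
    from several different original windows.
    """
    n, t, c = windows.shape
    return windows.transpose(1, 0, 2).reshape(n, t, c)

def window_attention(windows):
    """Softmax self-attention inside each window.

    Q, K and V are the tokens themselves (identity projections), which is
    a simplification; a real module would learn these projections.
    """
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)  # (n, t, t)
    scores = scores - scores.max(axis=-1, keepdims=True)        # stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ windows

def attention_score_count(h, w, m):
    """Number of pairwise attention scores for m x m windows on H x W."""
    return (h // m) * (w // m) * (m * m) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 8, 16))
    wins = window_partition(feat, 4)            # 4 windows of 16 tokens
    local = window_attention(wins)              # within-window context
    mixed = window_attention(shuffle_windows(local))  # cross-window context
    print(wins.shape, mixed.shape)
    # With m fixed, the score count scales with H*W rather than (H*W)**2.
    print(attention_score_count(8, 8, 4), attention_score_count(16, 16, 4))
```

With the window size held fixed, the number of attention scores is (HW / m²) · m⁴ = HW · m², which grows linearly with the number of pixels; this is the sense in which windowed attention is usually described as having linear complexity.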

Keywords:

Transformer; Computer science; Convolutional neural network; Segmentation; Artificial intelligence; Engineering; Electrical engineering

Metrics

FWCI (Field Weighted Citation Impact): 4.77

Citation Normalized Percentile: 0.84

Topics

Currency Recognition and Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Neural Networks and Applications

Physical Sciences → Computer Science → Artificial Intelligence


Related Documents

LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer

Lightweight Semantic Segmentation Network Based on DeepLabV3+

Semantic Segmentation Network Based on Lightweight Feature Pyramid Transformer

Hybrid semantic segmentation for tunnel lining cracks based on Swin Transformer and convolutional neural network

Lightweight Semantic Segmentation Convolutional Neural Network Based on SKNet