JOURNAL ARTICLE

Improved deep learning image classification algorithm based on Swin Transformer V2

Jiangshu Wei, Jinrong Chen, Yuchao Wang, Hao Luo, Wujie Li

Year: 2023
Journal: PeerJ Computer Science
Vol: 9
Pages: e1665
Publisher: PeerJ, Inc.

Abstract

While convolutional operations effectively extract local features, their limited receptive fields make it difficult to capture global dependencies. Transformers, on the other hand, excel at global modeling and capture long-range dependencies effectively. However, the self-attention mechanism used in Transformers lacks a local mechanism for information exchange within specific regions. This article attempts to leverage the strengths of both Transformers and convolutional neural networks (CNNs) to enhance the Swin Transformer V2 model. By incorporating both convolutional operations and the self-attention mechanism, the enhanced model combines the local information-capturing capability of CNNs with the long-range dependency-capturing ability of Transformers. The improved model strengthens the extraction of local information by introducing a Swin Transformer Stem, an inverted residual feed-forward network, and a Dual-Branch Downsampling structure. It then models global dependencies using the improved self-attention mechanism. Additionally, downsampling is applied to the attention mechanism’s Q and K to reduce computational and memory overhead. Under identical training conditions, the proposed method significantly improves classification accuracy on multiple image classification datasets and shows more robust generalization.
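
To give a rough sense of why downsampling Q and K lowers the cost of attention, the following is a minimal sketch that counts the multiply-adds needed to form the Q·Kᵀ score matrix. The abstract does not say how the downsampling is performed, so the function name `score_matrix_cost`, the stride parameters, the 8×8 token grid, and the channel size of 32 are all illustrative assumptions, not values taken from the article.

```python
def score_matrix_cost(height, width, dim, q_stride=1, k_stride=1):
    """Return (rows, cols, multiply_adds) for the Q @ K^T score matrix.

    height, width: spatial size of the token grid (for example, one window)
    dim: channel dimension per attention head
    q_stride, k_stride: hypothetical spatial downsampling strides for Q and K
    """
    def reduced(n, stride):
        # Assumption: the grid is padded if needed, so ceil(n / stride)
        # positions remain along each axis after strided downsampling.
        return -(-n // stride)

    rows = reduced(height, q_stride) * reduced(width, q_stride)  # query tokens
    cols = reduced(height, k_stride) * reduced(width, k_stride)  # key tokens
    # Each entry of the score matrix is a dot product of length `dim`.
    return rows, cols, rows * cols * dim


full = score_matrix_cost(8, 8, 32)
halved = score_matrix_cost(8, 8, 32, q_stride=2, k_stride=2)
print(full)                    # (64, 64, 131072)
print(halved)                  # (16, 16, 8192)
print(full[2] // halved[2])    # 16
```

Under these assumptions, halving the resolution of both Q and K along each spatial axis shrinks the score matrix by a factor of 16. The count covers only the score matrix; it ignores the cost of the downsampling step itself and of the remaining attention computation.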

Keywords:
Computer science; Transformer; Artificial intelligence; Convolutional neural network; Upsampling; Residual; Leverage (statistics); Pattern recognition (psychology); Machine learning; Algorithm; Engineering; Image (mathematics); Voltage

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.43
Refs: 41
Citation Normalized Percentile: 0.81

Topics

Industrial Vision Systems and Defect Detection
Physical Sciences →  Engineering →  Industrial and Manufacturing Engineering
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Technologies in Various Fields
Physical Sciences →  Computer Science →  Artificial Intelligence