JOURNAL ARTICLE

Swin Transformer with Local Aggregation

Abstract

Despite the many advantages of Convolutional Neural Networks (CNNs), their receptive fields are usually small, which hinders capturing global features. In contrast, the Transformer can capture long-range dependencies and obtain global information about an image through self-attention. To combine the advantages of CNNs and Transformers, we propose integrating a Local Aggregation module into the structure of the Swin Transformer. The Local Aggregation module consists of lightweight Depthwise Convolution and Pointwise Convolution, and it locally captures feature-map information at each stage of the Swin Transformer. Our experiments demonstrate that this integrated model improves accuracy. On the CIFAR-10 dataset, Top-1 accuracy reaches 87.74%, which is 3.32% higher than Swin, and Top-5 accuracy reaches 99.54%; on the Mini-ImageNet dataset, Top-1 accuracy reaches 79.1%, which is 7.68% higher than Swin, and Top-5 accuracy reaches 94.02%, which is 3.25% higher than Swin.
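As a rough sketch of the idea described in the abstract (the function names, kernel size, and the residual connection are our illustrative assumptions, not details taken from the paper), a Local Aggregation block built from a depthwise convolution followed by a pointwise (1x1) convolution could look like:

```python
import numpy as np

def depthwise_conv(x, kernels, pad=1):
    """Depthwise convolution: one k*k filter per channel, no channel mixing.
    x: (C, H, W), kernels: (C, k, k). Zero padding keeps the spatial size."""
    C, H, W = x.shape
    k = kernels.shape[1]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W))
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * kernels[c])
    return out

def pointwise_conv(x, weights):
    """Pointwise (1x1) convolution: pure channel mixing at each pixel.
    weights: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', weights, x)

def local_aggregation(x, dw_kernels, pw_weights):
    """Hypothetical Local Aggregation block: depthwise then pointwise
    convolution, with a residual add (our assumption; requires C_out == C_in)."""
    return x + pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)
```

With an identity 3x3 depthwise kernel (center weight 1) and an identity pointwise matrix, the block reduces to `x + x`, which is a quick way to sanity-check the plumbing. The appeal of the depthwise/pointwise split is cost: a full C_in-to-C_out k*k convolution needs C_out*C_in*k*k weights per position, while the separable form needs only C_in*k*k + C_out*C_in, which is why the abstract calls the module lightweight.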

Keywords:
Pointwise Convolution, Convolutional Neural Network, Transformer, Computer Science, Artificial Intelligence, Pattern Recognition

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.37
Refs: 7
Citation Normalized Percentile: 0.54

Topics

Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Industrial Vision Systems and Defect Detection (Physical Sciences → Engineering → Industrial and Manufacturing Engineering)
Visual Attention and Saliency Detection (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)