BOOK-CHAPTER

Multi-stage Semantic Attention with Transformer for Multi-label Image Classification

Qizhen Du, Ying Ma, Jianmin Li

Year: 2023   Series: Atlantis Highlights in Computer Sciences   Pages: 1193-1199   Publisher: Atlantis Press

Abstract

Multi-label image classification is a fundamental classification task that seeks to assign multiple labels to an image. In recent years, many approaches based on deep convolutional neural networks (CNNs) have been proposed that discover label semantics and learn semantic representations of images by modeling label correlation. However, some small and similar objects cannot be predicted accurately because of the limited representation capability of convolutional kernels. To address this problem, this paper introduces the Twins transformer. Because the different stages of this model's image representation capture features at different levels or scales and have different discriminative capacities, we design a multi-stage semantic attention with transformer (MSAT) framework that learns the semantic representation of images using the model's own multi-stage mechanism, while employing a three-layer standard transformer decoder as an effective component for feature fusion. Experiments conducted on the VOC 2007 dataset show that MSAT achieves better results and improves multi-label image classification performance to some extent.

Keywords:
Discriminative model; Computer science; Pattern recognition (psychology); Artificial intelligence; Convolutional neural network; Transformer; Kernel (algebra); Contextual image classification; Machine learning; Semantic feature; Image (mathematics); Mathematics; Engineering

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 43
Citation Normalized Percentile: 0.04

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Graph Attention Transformer Network for Multi-label Image Classification

Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications   Year: 2022   Vol: 19 (4)   Pages: 1-16
JOURNAL ARTICLE

DATran: Dual Attention Transformer for Multi-Label Image Classification

Wei Zhou, Zhijie Zheng, Tao Su, Haifeng Hu

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2023   Vol: 34 (1)   Pages: 342-356
JOURNAL ARTICLE

A multi-label image classification method combining multi-stage image semantic information and label relevance

Liwen Wu, Lei Zhao, Peigeng Tang, Bin Pu, Xin Jin, Yudong Zhang, Shaowen Yao

Journal: International Journal of Machine Learning and Cybernetics   Year: 2024   Vol: 15 (9)   Pages: 3911-3925