JOURNAL ARTICLE

Graph Attention Transformer Network for Multi-label Image Classification

Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Year: 2022   Journal: ACM Transactions on Multimedia Computing, Communications, and Applications   Vol: 19 (4)   Pages: 1-16   Publisher: Association for Computing Machinery

Abstract

Multi-label classification aims to recognize multiple objects or attributes in images. The key to this task is effectively characterizing inter-label correlations or dependencies, which has motivated the prevailing use of graph neural networks. However, current methods often use the co-occurrence probability of labels in the training set as the adjacency matrix to model this correlation, which is severely limited by the dataset and hurts the model’s generalization ability. This article proposes the Graph Attention Transformer Network, a general framework for multi-label image classification that mines rich and effective label correlations. First, we use the cosine similarity of pre-trained label word embeddings as the initial correlation matrix, which can represent richer semantic information than the co-occurrence matrix. Subsequently, we propose a graph attention transformer layer that transfers this adjacency matrix to adapt it to the current domain. Extensive experiments demonstrate that the proposed method achieves highly competitive performance on three datasets.
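The first step of the abstract, building an initial label correlation matrix from the cosine similarity of label word embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cosine_similarity_matrix`, the three labels, and the toy 3-dimensional vectors are all assumptions made for illustration, and real pre-trained embeddings would have far more dimensions.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return the pairwise cosine-similarity matrix for a list of vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    n = len(embeddings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            denom = norms[i] * norms[j]
            # Guard against zero vectors, which have no defined direction.
            matrix[i][j] = dot / denom if denom > 0 else 0.0
    return matrix


# Hypothetical toy vectors standing in for pre-trained label word embeddings.
label_embeddings = {
    "dog": [1.0, 0.0, 0.0],
    "cat": [1.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}

labels = list(label_embeddings)
initial_correlation = cosine_similarity_matrix(
    [label_embeddings[name] for name in labels]
)

for name, row in zip(labels, initial_correlation):
    print(name, [round(value, 3) for value in row])
```

Unlike a co-occurrence matrix, this matrix depends only on the embeddings and not on label statistics of the training set. It is also symmetric, whereas conditional co-occurrence probabilities generally are not. How the proposed graph attention transformer layer then adapts this matrix to the target domain is not specified in the abstract and is not sketched here.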

Keywords:
Adjacency matrix, Computer science, Adjacency list, Pattern recognition (psychology), Artificial intelligence, Graph, Embedding, Transformer, Correlation, Attention network, Machine learning, Graph embedding, Data mining, Cosine similarity, Artificial neural network, Theoretical computer science, Algorithm, Mathematics

Metrics

Cited By: 39
FWCI (Field Weighted Citation Impact): 7.64
Refs: 42
Citation Normalized Percentile: 0.96
Is in top 1%
Is in top 10%

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Double Attention Based on Graph Attention Network for Image Multi-Label Classification

Wei Zhou, Zhiwu Xia, Peng Dou, Tao Su, Haifeng Hu

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications   Year: 2022   Vol: 19 (1)   Pages: 1-23

JOURNAL ARTICLE

Modular Graph Transformer Networks for Multi-Label Image Classification

Hoang D. Nguyen, Xuan-Son Vu, Duc-Trong Le

Journal: Proceedings of the AAAI Conference on Artificial Intelligence   Year: 2021   Vol: 35 (10)   Pages: 9092-9100

JOURNAL ARTICLE

DATran: Dual Attention Transformer for Multi-Label Image Classification

Wei Zhou, Zhijie Zheng, Tao Su, Haifeng Hu

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2023   Vol: 34 (1)   Pages: 342-356

JOURNAL ARTICLE

Multi-Label Image Classification by Feature Attention Network

Zheng Yan, Weiwei Liu, Shiping Wen, Yin Yang

Journal: IEEE Access   Year: 2019   Vol: 7   Pages: 98005-98013