JOURNAL ARTICLE

PERFORMANCE OF VISION TRANSFORMER ON GARBAGE IMAGE CLASSIFICATION

Nam Tran Quy

Year: 2026; Journal: Journal of Engineering Management and Information Technology; Vol: 4 (1); Pages: 25-34

Abstract

This study evaluates the performance of a Vision Transformer model with a 16x16 patch size (ViT 16x16) for classifying garbage images. Several convolutional neural networks (CNNs) with transfer learning, namely VGG16, ResNet50, InceptionV3, and EfficientNetB7, are employed for comparison. Every model is implemented with the same image augmentation techniques and the same hyper-parameter settings, including the optimizer, activation function, and learning rate. All models are applied to the same garbage dataset, which contains 12 image labels covering various kinds of garbage and is split in the same way into training, validation, and testing sets. The experimental results show that ViT 16x16 gave the best result at 92%, which is higher than the second-best model, VGG16, at 86%, and much higher than most of the other pre-trained models on garbage image classification.
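To illustrate what the "16x16" in ViT 16x16 refers to, the following is a minimal sketch of how an image is divided into non-overlapping 16x16 patches before being fed to a Vision Transformer. The helper name `patch_boxes` and the 224x224 RGB input size are assumptions made for illustration; the abstract does not state the input resolution used in the study.

```python
def patch_boxes(height, width, patch=16):
    """Return (top, left, bottom, right) boxes for non-overlapping square patches.

    Hypothetical helper for illustration only; not taken from the article.
    """
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [
        (top, left, top + patch, left + patch)
        for top in range(0, height, patch)
        for left in range(0, width, patch)
    ]


# Assumed input resolution of 224x224 with 3 colour channels (not stated in the abstract).
boxes = patch_boxes(224, 224, patch=16)
num_patches = len(boxes)      # 14 rows * 14 columns of patches
patch_dim = 16 * 16 * 3       # length of one flattened RGB patch

print(num_patches, patch_dim)  # prints "196 768"
```

Under these assumptions, each image becomes a sequence of 196 patch tokens, each a 768-value vector before the linear projection, which is the sense in which the model treats an image as a sequence of 16x16 "words".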

Keywords:
Garbage; Convolutional neural network; Transformer; Pattern recognition (psychology); Artificial neural network; Image (mathematics); Image processing; Contextual image classification

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.83

Topics

Brain Tumor Detection and Classification
Life Sciences →  Neuroscience →  Neurology
Scientific and Engineering Research Topics
Health Sciences →  Dentistry →  Periodontics
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition