Scale-aware token-matching for transformer-based object detector

Aecheon Jung; Sungeun Hong; Yoonsuk Hyun

doi:10.1016/j.patrec.2024.08.006

JOURNAL ARTICLE

Year: 2024 Journal: Pattern Recognition Letters Vol: 185 Pages: 197-202 Publisher: Elsevier BV

Abstract

Owing to advances in deep learning, object detection has made significant progress in estimating the positions and classes of multiple objects within an image. However, detecting objects of various scales within a single image remains challenging. In this study, we propose scale-aware token matching for predicting the positions and classes of objects in transformer-based object detection. Unlike previous methods, which match detection tokens with the ground truth without considering scale during training, we train the model by matching detection tokens with ground-truth objects according to their size. We divide one detection token set into multiple sets based on scale and match each token set differently with the ground truth, thereby training the model without additional computation cost. The experimental results demonstrate that scale information can be assigned to tokens. Using a novel loss function, scale-aware tokens can independently learn scale-specific information, which improves detection performance on small objects.
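
The idea of splitting one detection token set into scale-specific groups and matching each group separately can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the scale boundaries, the token split, or the matching cost, so the COCO-style area thresholds, the even contiguous split of tokens, the L1 box cost, and every function name below are assumptions made for illustration. A brute-force search stands in for Hungarian matching so that the example stays dependency-free.

```python
from itertools import permutations

# Assumed COCO-style area thresholds (pixels^2); the source does not state its boundaries.
SMALL_MAX = 32 ** 2
MEDIUM_MAX = 96 ** 2


def scale_of(box):
    """Return 0 (small), 1 (medium) or 2 (large) for an (x, y, w, h) box."""
    area = box[2] * box[3]
    if area < SMALL_MAX:
        return 0
    if area < MEDIUM_MAX:
        return 1
    return 2


def l1_cost(pred, gt):
    """Hypothetical matching cost: L1 distance between two boxes."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def match_group(token_ids, gt_ids, preds, gts):
    """Brute-force optimal one-to-one matching inside one scale group.

    Stands in for the Hungarian algorithm; only practical for tiny inputs.
    """
    if len(gt_ids) > len(token_ids):
        raise ValueError("more ground-truth boxes than tokens in this group")
    best, best_cost = (), float("inf")
    for perm in permutations(token_ids, len(gt_ids)):
        cost = sum(l1_cost(preds[t], gts[g]) for t, g in zip(perm, gt_ids))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(zip(best, gt_ids))


def scale_aware_match(preds, gts, num_scales=3):
    """Split tokens into contiguous per-scale groups (an assumed layout) and
    match each group only against ground-truth boxes of the same scale.

    Returns (token_index, gt_index) pairs sorted by gt_index.
    """
    per_group = len(preds) // num_scales
    pairs = []
    for s in range(num_scales):
        start = s * per_group
        end = len(preds) if s == num_scales - 1 else start + per_group
        token_ids = list(range(start, end))
        gt_ids = [i for i, g in enumerate(gts) if scale_of(g) == s]
        pairs.extend(match_group(token_ids, gt_ids, preds, gts))
    return sorted(pairs, key=lambda p: p[1])


if __name__ == "__main__":
    preds = [
        (10, 10, 20, 20), (50, 50, 12, 12),    # tokens 0-1: small group
        (100, 100, 60, 60), (30, 30, 50, 50),  # tokens 2-3: medium group
        (0, 0, 200, 200), (5, 5, 150, 150),    # tokens 4-5: large group
    ]
    gts = [(48, 52, 10, 10), (2, 2, 190, 210), (28, 33, 55, 48)]
    print(scale_aware_match(preds, gts))  # -> [(1, 0), (4, 1), (3, 2)]
```

Because each group is matched only against ground truth of its own scale, the total number of tokens is unchanged, which is consistent with the abstract's claim of no additional computation cost. The loss function the abstract mentions is not sketched here, since the source gives no detail about it.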

Keywords:

Computer science; Security token; Transformer; Detector; Matching (statistics); Artificial intelligence; Scale (ratio); Pattern recognition (psychology); Computer vision; Mathematics; Computer network; Electrical engineering; Engineering; Telecommunications; Statistics; Voltage; Cartography

Metrics

FWCI (Field Weighted Citation Impact): 2.54

Citation Normalized Percentile: 0.83

Topics

Robotics and Automated Systems

Physical Sciences → Engineering → Control and Systems Engineering

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Currency Recognition and Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Focal DETR: Target-Aware Token Design for Transformer-Based Object Detection

Object Detection via Multi-Scale Token Based on Vision Transformer

Token-word mixer meets object-aware transformer for referring image segmentation

T-SSD: A Transformer-based Single-Stage Multi-Scale Sampling Object Detector

Efficient Visual Object Tracking with Temporal Context-Aware Token Learning and Scale Adaptive Token Pruning