JOURNAL ARTICLE

Dynamic Anchor Box-based Instance Decoding and Position-aware Instance Association for Online Video Instance Segmentation

Hyun-Jin Chun, Incheol Kim

Year: 2023 Journal: Journal of Institute of Control Robotics and Systems Vol: 29 (9) Pages: 755-766

Abstract

Video instance segmentation (VIS) is a vision task that involves simultaneously detecting, classifying, segmenting, and tracking object instances in videos. In this study, we introduce dynamic anchor box and deformable attention for VIS (DAB-D-VIS), a novel transformer-based model for online VIS. To enhance the multilayer transformer-based instance decoding for each video frame, the proposed model uses deformable attention mechanisms that focus on a small set of key sampling points. Additionally, dynamic anchor boxes are employed to explicitly represent the regions of candidate instances. Both techniques have already proven effective for transformer-based object detection in images. Furthermore, to address the constraints of online VIS, the model incorporates a robust inter-frame instance association method. This method leverages both the similarity between two instances in the contrastive embedding space and the difference between their positions in the image. Extensive experiments conducted on the YouTube-VIS benchmark dataset validate the effectiveness of the proposed DAB-D-VIS model.
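The position-aware association idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring formula or matching procedure, so everything below is an assumption made for illustration: the score is cosine similarity minus a weighted box-center distance, boxes are `(cx, cy, w, h)` tuples normalized to [0, 1], matching is greedy and one-to-one, and the names `association_score`, `associate`, `pos_weight`, and `min_score` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def center_distance(box_a, box_b):
    """Euclidean distance between box centers; boxes are (cx, cy, w, h)."""
    return math.hypot(box_a[0] - box_b[0], box_a[1] - box_b[1])


def association_score(emb_a, box_a, emb_b, box_b, pos_weight=0.5):
    """Assumed score: embedding similarity penalized by positional difference."""
    similarity = cosine_similarity(emb_a, emb_b)
    return similarity - pos_weight * center_distance(box_a, box_b)


def associate(tracks, detections, pos_weight=0.5, min_score=0.3):
    """Greedily match current-frame detections to existing tracks.

    tracks:     dict mapping track id -> (embedding, box) from earlier frames
    detections: list of (embedding, box) for the current frame
    Returns a dict mapping detection index -> track id. Detections that do
    not appear in the result would start new tracks in a full tracker.
    """
    candidates = []
    for track_id, (t_emb, t_box) in tracks.items():
        for det_idx, (d_emb, d_box) in enumerate(detections):
            score = association_score(t_emb, t_box, d_emb, d_box, pos_weight)
            if score >= min_score:
                candidates.append((score, track_id, det_idx))

    # Highest-scoring pairs are committed first; each side is used once.
    candidates.sort(key=lambda c: c[0], reverse=True)
    matches, used_tracks, used_dets = {}, set(), set()
    for _, track_id, det_idx in candidates:
        if track_id in used_tracks or det_idx in used_dets:
            continue
        matches[det_idx] = track_id
        used_tracks.add(track_id)
        used_dets.add(det_idx)
    return matches
```

In this sketch, two instances with near-identical embeddings (for example, two objects of the same class and appearance) are still separated by the positional term, which is the motivation the abstract gives for combining the two cues. An optimal assignment solver could replace the greedy loop; the greedy version is used here only to keep the example dependency-free.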

Keywords:
Computer science, Artificial intelligence, Segmentation, Computer vision, Embedding, Benchmark (surveying), Video tracking, Decoding methods, Transformer, Pattern recognition (psychology), Object (grammar), Algorithm

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.12

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition