Fianet: Video Object Detection Via Joint Feature-Level And Instance-Level Aggregation

Zhengshuai Wang; Yali Li; Shengjin Wang

doi:10.1109/icme46284.2020.9102955

JOURNAL ARTICLE

Year: 2020 Pages: 1-6

Abstract

Video object detection is challenging because of the non-rigid and rigid appearance deformations that occur in videos. Most competitive methods enhance per-frame features by aggregating many previous and future frames. However, feature-level aggregation is not robust to rigid deformations such as occlusion and rare postures. In this paper, we propose an online video object detection method, the joint feature-level and instance-level aggregation network (FIANet). In addition to feature-level aggregation, we design a spatial-temporal instance calibration (STIC) module that aggregates each instance as a whole, which reduces interference from locally distorted and missing pixels. Feature-level and instance-level aggregation work collaboratively to overcome different deformations. Using only a few previous frames, our method achieves 81.6% mAP at relatively high speed on ImageNet VID, which is state-of-the-art among both causal and non-causal methods.
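The abstract does not say how the aggregation weights are computed, so the following is only an illustrative sketch of causal (previous-frames-only) feature aggregation. It assumes softmax weights over cosine similarity to the current frame, a scheme common in the video detection literature but not confirmed as FIANet's. The function names `cosine` and `aggregate_causal` are hypothetical, and each frame is reduced to a single feature vector for brevity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def aggregate_causal(current, previous):
    """Fuse the current frame's features with those of earlier frames.

    ASSUMPTION: weights are a softmax over cosine similarity to the
    current frame. This is a generic scheme, not the paper's method.
    Only past frames are used, matching the online setting.
    """
    frames = previous + [current]
    sims = [cosine(current, f) for f in frames]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * f[i] for w, f in zip(weights, frames))
        for i in range(len(current))
    ]
    return fused, weights


if __name__ == "__main__":
    current = [1.0, 0.0]
    previous = [[1.0, 0.0], [0.0, 1.0]]  # one similar, one dissimilar frame
    fused, weights = aggregate_causal(current, previous)
    print(fused, weights)
```

The same weighting could in principle be applied to per-instance (region) vectors rather than per-frame vectors, which is the rough intuition behind aggregating an instance as a whole. The STIC module itself is not specified in the abstract and is not reproduced here.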

Keywords:

Computer science; Artificial intelligence; Aggregate (composite); Feature (linguistics); Joint (building); Frame (networking); Computer vision; Object (grammar); Object detection; Pixel; Pattern recognition (psychology); Feature extraction; Interference (communication); Engineering

Metrics

FWCI (Field Weighted Citation Impact): 0.10

Citation Normalized Percentile: 0.38

Topics

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Object-Level Feature Memory and Aggregation for Live-Stream Video Object Detection

Video Object Detection via Object-Level Temporal Aggregation

Instance-level feature representation calibration for visual object detection

Object detection based on few-shot learning via instance-level feature correlation and aggregation

Temporal Based Instance-Level Fusion for Video Object Detection