Video object detection is a challenging task due to the deteriorated quality of video sequences captured in complex environments. Currently, this area is dominated by feature enhancement based methods, which distill beneficial semantic information from multiple frames and generate enhanced features by fusing the distilled information. However, the distillation and fusion operations are usually performed at either the frame level or the instance level, with external guidance from additional information such as optical flow or feature memory. In this work, we propose a dual semantic fusion network (abbreviated as DSFNet) to fully exploit both frame-level and instance-level semantics in a unified fusion framework without external guidance. Moreover, we introduce a geometric similarity measure into the fusion process to alleviate the information distortion caused by noise. As a result, the proposed DSFNet can generate more robust features through multi-granularity fusion and avoid being affected by the instability of external guidance. To evaluate the proposed DSFNet, we conduct extensive experiments on the ImageNet VID dataset. Notably, the proposed dual semantic fusion network achieves, to the best of our knowledge, the best performance among current state-of-the-art video object detectors: 84.1\% mAP with ResNet-101 and 85.4\% mAP with ResNeXt-101, without using any post-processing steps.
Tianxiang Hou, Qiang Qi, Yang Lu, Kaiwen Du, Hanzi Wang
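The abstract does not specify the fusion operator, but the general idea of similarity-weighted feature fusion across frames can be illustrated with a minimal sketch. The function below is a hypothetical example, not the authors' actual DSFNet method: it aggregates support-frame features into a reference-frame feature using softmax-normalized cosine similarities as fusion weights.

```python
import numpy as np

def fuse_features(ref, supports, eps=1e-8):
    """Similarity-weighted feature fusion (illustrative sketch only).

    ref:      (d,)  feature vector of the reference frame
    supports: (n,d) feature vectors from n support frames
    Returns a fused (d,) feature: a weighted average of the support
    features, where each weight reflects cosine similarity to ref.
    """
    # L2-normalize to compute cosine similarities
    ref_n = ref / (np.linalg.norm(ref) + eps)
    sup_n = supports / (np.linalg.norm(supports, axis=1, keepdims=True) + eps)
    sims = sup_n @ ref_n                          # (n,) cosine similarities
    weights = np.exp(sims) / np.exp(sims).sum()   # softmax over supports
    return weights @ supports                     # similarity-weighted average
```

If every support frame carries the same feature as the reference, the fused feature equals that feature, since the softmax weights sum to one; dissimilar (e.g., noisy) supports receive smaller weights, which loosely mirrors the motivation for a similarity measure in the fusion process.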