JOURNAL ARTICLE

Adaptive Sparse Memory Networks for Efficient and Robust Video Object Segmentation

Jisheng Dang, Huicheng Zheng, Xiaohao Xu, Longguang Wang, Qingyong Hu, Yulan Guo

Year: 2024 Journal: IEEE Transactions on Neural Networks and Learning Systems Vol: 36 (2) Pages: 3820-3833 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Recently, memory-based networks have achieved promising performance for video object segmentation (VOS). However, existing methods still suffer from unsatisfactory segmentation accuracy and inferior efficiency. The reasons are mainly twofold: 1) during memory construction, the inflexible memory storage mechanism results in a weak discriminative ability for similar appearances in complex scenarios, leading to video-level temporal redundancy, and 2) during memory reading, matching robustness and memory retrieval accuracy decrease as the number of video frames increases. To address these challenges, we propose an adaptive sparse memory network (ASM) that efficiently and effectively performs VOS by sparsely leveraging previous guidance while attending to key information. Specifically, we design an adaptive sparse memory constructor (ASMC) to adaptively memorize informative past frames according to dynamic temporal changes in video frames. Furthermore, we introduce an attentive local memory reader (ALMR) to quickly retrieve relevant information using a subset of memory, thereby reducing frame-level redundant computation and noise in a simpler and more convenient manner. To prevent key features from being discarded by the subset of memory, we further propose a novel attentive local feature aggregation (ALFA) module, which preserves useful cues by selectively aggregating discriminative spatial dependence from adjacent frames, thereby effectively increasing the receptive field of each memory frame. Extensive experiments demonstrate that our model achieves state-of-the-art performance with real-time speed on six popular VOS benchmarks. Furthermore, our ASM can be applied to existing memory-based methods as generic plugins to achieve significant performance improvements. More importantly, our method exhibits robustness in handling sparse videos with low frame rates.
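The abstract gives the two memory ideas only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes each frame is summarized by a single feature vector, that a frame is memorized only when it has drifted from the last memorized frame by more than a threshold (a stand-in for the ASMC's "dynamic temporal changes" criterion), and that memory reading attends only to the k most similar entries (a stand-in for the ALMR's "subset of memory"). The names select_memory_frames, read_memory_subset, threshold, and k are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_memory_frames(features: List[Vector], threshold: float) -> List[int]:
    """Return the indices of frames kept in memory.

    Assumption (not from the paper): a frame is memorized only when its
    feature vector differs from the most recently memorized frame by more
    than `threshold`. The first frame is always kept.
    """
    if not features:
        return []
    kept = [0]
    for idx in range(1, len(features)):
        if l2_distance(features[idx], features[kept[-1]]) > threshold:
            kept.append(idx)
    return kept


def read_memory_subset(query: Vector, memory: List[Vector], k: int) -> List[float]:
    """Softmax-weighted average of the k memory entries most similar to `query`.

    Assumption (not from the paper): similarity is a dot product, and each
    memory entry acts as both key and value.
    """
    if not memory or k < 1:
        raise ValueError("memory must be non-empty and k must be >= 1")
    scores = [sum(q * m for q, m in zip(query, entry)) for entry in memory]
    # Indices of the k highest-scoring entries, best first.
    top = sorted(range(len(memory)), key=lambda i: scores[i], reverse=True)[:k]
    peak = scores[top[0]]  # subtract the maximum for numerical stability
    weights = [math.exp(scores[i] - peak) for i in top]
    total = sum(weights)
    dim = len(memory[0])
    return [
        sum(w * memory[i][d] for w, i in zip(weights, top)) / total
        for d in range(dim)
    ]


# Toy per-frame features: frames 1 and 3 barely differ from their predecessors.
frames = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [3.0, 0.0]]
kept = select_memory_frames(frames, threshold=0.5)  # kept == [0, 2, 4]
memory = [frames[i] for i in kept]
readout = read_memory_subset([1.0, 0.0], memory, k=2)
```

In this toy setting the near-duplicate frames are skipped at construction time, and the reader touches only two of the three stored entries, which is the kind of redundancy reduction the abstract describes. The real model operates on spatial feature maps and learned embeddings rather than single vectors.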

Keywords:
Computer science; Robustness (evolution); Discriminative model; Artificial intelligence; Segmentation; Memory map; Pattern recognition (psychology); Computer vision; Shared memory; Computer hardware

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 5.30
Refs: 77
Citation Normalized Percentile: 0.92

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Temporo-Spatial Parallel Sparse Memory Networks for Efficient Video Object Segmentation

Jisheng Dang, Huicheng Zheng, Bimei Wang, Longguang Wang, Yulan Guo

Journal: IEEE Transactions on Intelligent Transportation Systems Year: 2024 Vol: 25 (11) Pages: 17291-17304

JOURNAL ARTICLE

Video Object Segmentation with Dynamic Memory Networks and Adaptive Object Alignment

Shuxian Liang, Xu Shen, Jianqiang Huang, Xian‐Sheng Hua

Journal: 2021 IEEE/CVF International Conference on Computer Vision (ICCV) Year: 2021

JOURNAL ARTICLE

Boosting Video Object Segmentation via Robust and Efficient Memory Network

Yadang Chen, Dingwei Zhang, Yuhui Zheng, Zhi-Xin Yang, Enhua Wu, Haixing Zhao

Journal: IEEE Transactions on Circuits and Systems for Video Technology Year: 2023 Vol: 34 (5) Pages: 3340-3352