In visual surveillance, deep learning-based foreground object detection algorithms outperform classical background subtraction (BGS) algorithms. However, deep learning-based methods are limited in that detection performance deteriorates in a new environment that differs from the training environment. This limitation can be addressed by retraining the model with additional ground-truth labels from the new environment, but generating ground-truth labels for visual surveillance is time-consuming and expensive. This paper proposes a method that does not require foreground labels when adapting to a new environment. To this end, we propose an integrated network that produces two kinds of output: a background model image and a foreground object map. The network can adapt to the new environment by retraining on the background model image. The proposed method consists of one encoder and two decoders, which detect foreground objects and estimate a background model image, and it is designed for real-time processing on desktop GPUs. The proposed method achieves a 14.46% higher F-measure (FM) in an environment different from the training environment and an 11.49% higher FM than the latest BGS algorithm.
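The one-encoder/two-decoder layout described above can be sketched as follows. This is a minimal, purely illustrative PyTorch sketch: the class name, layer counts, and channel sizes are assumptions for illustration, not the paper's actual architecture. The shared encoder compresses the input frame, and the two decoder heads reconstruct a background model image and a foreground object map, respectively.

```python
import torch
import torch.nn as nn


class SingleEncoderDualDecoder(nn.Module):
    """Hypothetical sketch of a shared encoder with two decoder heads
    (background model image + foreground object map); not the paper's model."""

    def __init__(self):
        super().__init__()
        # Shared encoder: two stride-2 convolutions (4x spatial downsampling).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )

        def make_decoder(out_channels: int) -> nn.Sequential:
            # Mirror the encoder with transposed convolutions (4x upsampling).
            return nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, out_channels, 4, stride=2, padding=1),
            )

        self.bg_decoder = make_decoder(3)  # background model image (RGB)
        self.fg_decoder = make_decoder(1)  # foreground object map (1 channel)

    def forward(self, x):
        z = self.encoder(x)
        background = self.bg_decoder(z)
        # Sigmoid turns the foreground logits into a per-pixel probability map.
        foreground = torch.sigmoid(self.fg_decoder(z))
        return background, foreground
```

Because the background decoder needs no foreground labels (a background image can be supervised from the video itself), a head of this kind is what allows retraining in a new environment without ground-truth foreground annotation.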