We present a robust algorithm for temporally coherent video segmentation. Our approach applies a multi-label graph cut to successive frames, fusing information from the current frame with an appearance model and labeling priors propagated forward from past frames. Propagation uses a novel motion diffusion model that produces a per-pixel motion distribution, mitigating the cumulative estimation errors inherent in systems that make “hard” decisions on pixel motion at each frame. Further, we encourage spatial coherence by imposing label-consistency constraints within image regions (super-pixels) obtained via a bank of unsupervised per-frame segmentations, such as mean-shift. We demonstrate quantitative improvements in accuracy over state-of-the-art methods on a variety of sequences exhibiting clutter and agile motion, adopting the Berkeley methodology for our comparative evaluation.
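To illustrate the idea of propagating labels with a per-pixel motion distribution rather than a single hard displacement, the following is a minimal sketch in NumPy. It is not the authors' implementation: the function name, the Gaussian form of the motion distribution, and the `sigma`/`radius` parameters are all assumptions made for illustration.

```python
import numpy as np

def propagate_labels_soft(label_prob, flow, sigma=1.0, radius=2):
    """Propagate per-pixel label posteriors to the next frame using a
    soft (Gaussian) distribution of displacements centred on an estimated
    flow vector, instead of committing to one hard displacement per pixel.

    label_prob : (H, W, K) label posteriors for the current frame
    flow       : (H, W, 2) estimated (dy, dx) motion per pixel

    NOTE: illustrative sketch only; the paper's actual motion diffusion
    model is not specified in the abstract.
    """
    H, W, K = label_prob.shape
    out = np.zeros_like(label_prob)
    weight = np.zeros((H, W, 1))
    offsets = range(-radius, radius + 1)
    for y in range(H):
        for x in range(W):
            dy, dx = flow[y, x]
            for oy in offsets:
                for ox in offsets:
                    ty = int(round(y + dy)) + oy
                    tx = int(round(x + dx)) + ox
                    if 0 <= ty < H and 0 <= tx < W:
                        # Gaussian weight on deviation from the flow estimate
                        w = np.exp(-(oy * oy + ox * ox) / (2.0 * sigma * sigma))
                        out[ty, tx] += w * label_prob[y, x]
                        weight[ty, tx] += w
    # Normalise: each target pixel gets a convex combination of source
    # posteriors, so the result remains a valid distribution over labels.
    out /= np.maximum(weight, 1e-12)
    return out
```

Because each target pixel averages the posteriors of several plausible source pixels, a single erroneous flow vector cannot fully corrupt the propagated prior, which is the motivation for the soft propagation described above.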