As a step toward object detection, localization, and tracking in video streams, this paper describes video scene segmentation using visual and audio cues. For visual segmentation, a frame distance measure and two speed-up techniques are introduced. For audio segmentation, the Cepstrum Flux and Block Cepstrum Flux parameters are introduced. Experimental results are reported for both the visual and the audio segmentation.