Yueshen Xu, Guang-can XIAO, Xiaofen Tang
An efficient multi-level feature fusion descriptor for human action recognition is introduced in this paper. The descriptor combines low-level features, namely three trajectory features, HOF, and SIFT, with a mid-level class correlation feature. Inspired by the recent popularity of dense trajectories in image recognition, we use them to represent actions. Extracting scene information also benefits action recognition, since human actions have a close affinity with specific natural scenes. In addition, noting that different action classes often share similar motion patterns, we introduce the mid-level class correlation feature to describe the relationships among different video classes. Finally, to achieve better recognition results, a bag-of-words model is employed to describe each video as a set of visual words. The proposed method reaches an average accuracy of 92.6% on the UCF Sports dataset.
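The bag-of-words step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a visual-word codebook has already been learned (for example by clustering training descriptors) and only shows how one video's local descriptors could be quantised into a normalised histogram. The function names, the toy 2-D descriptors, and the Euclidean nearest-word assignment are illustrative assumptions, not details taken from the paper.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Encode local descriptors as an L1-normalised visual-word histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # An empty descriptor set yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: a 3-word codebook and 4 descriptors from one video.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
video_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.3)]

hist = bow_histogram(video_descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

In a multi-feature setting such as the one described, one plausible design is to build a separate histogram per feature channel (each trajectory feature, HOF, SIFT) and concatenate them before classification. Whether the paper fuses features this way is not stated in the abstract.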
Wei Song, Pei Yang, Ningning Liu, Guosheng Yang, Fuhong Lin