In this paper we introduce an approach for action recognition in motion capture data. The data are represented by the joint positions of the skeleton in each frame (posture vectors) and by the differences of these positions over time, computed at several temporal scales. The Vector of Locally Aggregated Descriptors (VLAD) framework is used to encode the extracted features, and a Support Vector Machine (SVM) is used for classification. A voting scheme is used within the VLAD framework to achieve soft encoding. The effectiveness and robustness of the proposed approach are shown in experiments performed on three datasets (MSRAction3D, MSRActionPairs and HDM05).
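To make the pipeline concrete, the following is a minimal sketch of the two steps described above: building per-frame features from posture vectors and their temporal differences, and encoding them with a soft (voting-based) VLAD. It is an illustration under stated assumptions, not the implementation used in the paper. The function names `frame_features` and `soft_vlad`, the temporal scales `(1, 5, 10)`, the number of votes `k_votes`, the rank-based vote weights `1 / (rank + 1)`, and the power and L2 normalisation are all hypothetical choices; the abstract does not specify them.

```python
import numpy as np


def frame_features(joints, scales=(1, 5, 10)):
    """Build per-frame features from a motion capture sequence.

    joints: array of shape (T, J, 3) with the 3D position of J joints in T frames.
    Returns an array of shape (T - max(scales), J * 3 * (1 + len(scales))):
    the posture vector followed by its differences at each temporal scale.
    The scales are an assumption made for this sketch.
    """
    T = joints.shape[0]
    posture = joints.reshape(T, -1)          # one posture vector per frame
    s_max = max(scales)
    parts = [posture[s_max:]]                # frames that have a full history
    for s in scales:
        # difference between frame t and frame t - s
        parts.append(posture[s_max:] - posture[s_max - s:T - s])
    return np.hstack(parts)


def soft_vlad(features, centers, k_votes=3):
    """Encode a set of features as one VLAD vector with soft assignment.

    Each feature votes for its k_votes nearest codebook centers instead of
    only the nearest one. The vote weight 1 / (rank + 1) is an assumed
    weighting, chosen only for illustration.
    """
    K, D = centers.shape
    # squared Euclidean distances between every feature and every center: (N, K)
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k_votes]

    v = np.zeros((K, D))
    for rank in range(k_votes):
        weight = 1.0 / (rank + 1)
        idx = nearest[:, rank]
        # accumulate weighted residuals; add.at handles repeated center indices
        np.add.at(v, idx, weight * (features - centers[idx]))

    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))      # power normalisation (assumed)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sequence = rng.normal(size=(40, 20, 3))  # synthetic data: 40 frames, 20 joints
    feats = frame_features(sequence)
    codebook = feats[:8]                     # stand-in for a learned codebook
    encoding = soft_vlad(feats, codebook)
    print(encoding.shape)
```

In a full pipeline the codebook would typically be learned by clustering training features (for example with k-means), and the resulting fixed-length encodings, one per sequence, would be passed to an SVM classifier such as scikit-learn's `sklearn.svm.SVC`.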