Classical methods for multi-view human action recognition focus on constructing similar features across views so that videos of the same action captured from different views are classified identically. However, the features of a given action vary across views, especially when the viewpoint changes sharply, so relying solely on cross-view similarity does not yield satisfactory results. In this paper, we propose a joint learning model that learns features from different views together. Our model explores both the information that an action shares across views and the information specific to an action under a particular view. Combining the shared and view-specific information yields a discriminative feature. The features from different views are reconstructed by a linear projection matrix so that they exhibit the same structure. To obtain an optimal solution under a given convergence criterion, the model is solved by a three-step iterative optimization process. The effectiveness of our method is verified on the WVU dataset.
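To make the idea of a three-step iterative scheme concrete, the following is a minimal sketch and not the formulation used in this paper. It assumes a hypothetical objective in which each view's feature matrix `X_v` is reconstructed as `P_v (S + E_v)`, where `P_v` is a linear projection matrix, `S` is a code shared by all views, and `E_v` is a view-specific code. The function names, the regularization weights `lam`, `mu`, `gamma`, and the ridge penalties are all illustrative assumptions.

```python
import numpy as np

def objective(Xs, Ps, S, Es, lam, mu, gamma):
    # Assumed objective: reconstruction error plus ridge penalties.
    fit = sum(np.linalg.norm(X - P @ (S + E)) ** 2
              for X, P, E in zip(Xs, Ps, Es))
    reg = (lam * sum(np.linalg.norm(E) ** 2 for E in Es)
           + mu * sum(np.linalg.norm(P) ** 2 for P in Ps)
           + gamma * np.linalg.norm(S) ** 2)
    return fit + reg

def fit_shared_specific(Xs, k=5, lam=1.0, mu=0.1, gamma=0.01,
                        n_iter=30, seed=0):
    # Xs: list of (d_v, n) arrays, one per view, with columns aligned
    # so that column i in every view is the same action sample.
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[1]
    S = rng.standard_normal((k, n))            # shared code
    Es = [np.zeros((k, n)) for _ in Xs]        # view-specific codes
    Ps = [rng.standard_normal((X.shape[0], k)) for X in Xs]
    I = np.eye(k)
    history = []
    for _ in range(n_iter):
        # Step 1: update each projection P_v by ridge regression.
        for v, X in enumerate(Xs):
            Z = S + Es[v]
            Ps[v] = np.linalg.solve(Z @ Z.T + mu * I, Z @ X.T).T
        # Step 2: update the shared code S using all views jointly.
        A = sum(P.T @ P for P in Ps) + gamma * I
        B = sum(P.T @ (X - P @ E) for X, P, E in zip(Xs, Ps, Es))
        S = np.linalg.solve(A, B)
        # Step 3: update each view-specific code E_v.
        for v, (X, P) in enumerate(zip(Xs, Ps)):
            Es[v] = np.linalg.solve(P.T @ P + lam * I, P.T @ (X - P @ S))
        history.append(objective(Xs, Ps, S, Es, lam, mu, gamma))
    return Ps, S, Es, history
```

Under these assumptions each step solves its subproblem in closed form, so the recorded objective does not increase from one iteration to the next, and a simple stopping rule would be to halt once the decrease falls below a threshold. A combined feature for view `v` could then be formed from `S` and `Es[v]`, for example by concatenation.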
Tianzhu Zhang, Si Liu, Changsheng Xu, Hanqing Lu
Chengcheng Jia, Sujing Wang, Xiangli Xu, Chunguang Zhou, Libiao Zhang
Tong Hao, Dan Wu, Qian Wang, Jinsheng Sun
An-An Liu, Ning Xu, Yuting Su, Hong Lin, Tong Hao, Zhaoxuan Yang