Person re-identification is a challenging problem in multi-camera surveillance systems. Most current methods aim to learn a single global distance metric to overcome the visual appearance changes between images from different cameras. However, the feature variations between images are not constant over the entire feature space, so one global metric cannot fit all variation conditions. Moreover, few current methods take the modality discrepancy between different cameras into consideration during metric learning. To address these drawbacks, we propose Parametric Local MultiModal (PLMM) metric learning in this paper. We regard the feature difference of an image pair taken by different cameras as characterizing one kind of feature variation, and accordingly learn a unique local metric for each image pair. Moreover, images from different cameras are regarded as lying in different modalities, so within each local metric two distinct projection matrices are learned for the cross-modality similarity measure. To balance locality against computational efficiency, the local metrics are parameterized as weighted linear combinations of basis metrics, which correspond to a small set of anchor image pairs. The resulting local metrics model the cross-modality feature variations among cameras, minimizing the distances of intra-class image pairs while simultaneously maximizing those of inter-class pairs. Experimental results demonstrate that the proposed approach obtains competitive performance compared with state-of-the-art methods on three publicly available benchmarks.
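To make the parameterization concrete, the sketch below illustrates one plausible reading of the abstract: each anchor image pair contributes a basis metric consisting of two projection matrices (one per camera modality), and the distance for a test pair is a weighted combination of the basis distances, with weights driven by how close the pair's feature difference is to each anchor pair's difference. All names, the Gaussian weighting kernel, and the random matrices are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_feat, d_proj, n_anchors = 8, 4, 3

# Hypothetical basis metrics: one pair of projection matrices (P_k, Q_k)
# per anchor image pair. P_k projects camera-A features and Q_k projects
# camera-B features, so the two modalities get different projections.
P = rng.standard_normal((n_anchors, d_proj, d_feat))
Q = rng.standard_normal((n_anchors, d_proj, d_feat))

# Hypothetical anchor pairs, stored as cross-camera feature differences.
anchor_diffs = rng.standard_normal((n_anchors, d_feat))

def local_weights(x, y, bandwidth=1.0):
    """Weight each basis metric by the similarity of the test pair's
    feature difference (x - y) to each anchor pair's difference.
    A Gaussian kernel is one plausible choice (an assumption here)."""
    diff = x - y
    sims = np.exp(-np.linalg.norm(anchor_diffs - diff, axis=1) ** 2
                  / (2.0 * bandwidth ** 2))
    return sims / sims.sum()  # normalized to a convex combination

def plmm_distance(x, y):
    """Local multimodal distance: a weighted combination of the basis
    cross-modality squared distances ||P_k x - Q_k y||^2."""
    w = local_weights(x, y)
    basis_d = np.array([np.sum((Pk @ x - Qk @ y) ** 2)
                        for Pk, Qk in zip(P, Q)])
    return float(w @ basis_d)

x = rng.standard_normal(d_feat)  # probe feature (camera A)
y = rng.standard_normal(d_feat)  # gallery feature (camera B)
print(plmm_distance(x, y))
```

Because the weights vary with the pair's feature difference, each pair effectively receives its own metric, while only a small number of basis metrics need to be learned.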
Yongxin Ge, Xinqian Gu, Min Chen, Hongxing Wang, Dan Yang