Convolutional Neural Networks (CNNs) have achieved breakthroughs on several image retrieval benchmarks. Most previous works reformulate CNNs as global feature extractors whose outputs are compared by linear scan. This paper proposes a Multi-layer Orderless Fusion (MOF) approach that integrates CNN activations into the Bag-of-Words (BoW) framework. Specifically, with only one forward pass through the network, we extract multi-layer CNN activations of local patches. The activations from each layer are aggregated in one BoW model per layer, and the resulting BoW models are combined by late fusion. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed method.
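As a rough illustration of the pipeline described above, the following sketch quantizes per-layer patch activations against a per-layer codebook, builds one orderless BoW histogram per layer, and fuses the per-layer similarities at score level. It is a minimal sketch under stated assumptions, not the method as implemented in the paper: the layer names (`layer_a`, `layer_b`), the toy codebooks, the equal fusion weights, and the use of cosine similarity with a weighted sum are all hypothetical choices made for illustration, and the patch activations are supplied directly rather than extracted from a network.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Index of the codebook centroid closest to the descriptor (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bow_histogram(descriptors, codebook):
    """Orderless aggregation: count visual words, then L2-normalize."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    hist = [float(counts.get(k, 0)) for k in range(len(codebook))]
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist


def late_fusion_score(query_layers, db_layers, codebooks, weights):
    """Weighted sum of per-layer cosine similarities (assumed fusion rule)."""
    score = 0.0
    for layer, w in weights.items():
        q = bow_histogram(query_layers[layer], codebooks[layer])
        d = bow_histogram(db_layers[layer], codebooks[layer])
        # Histograms are L2-normalized, so the dot product is the cosine.
        score += w * sum(a * b for a, b in zip(q, d))
    return score


# Toy data (hypothetical): two layers with different activation dimensions.
codebooks = {
    "layer_a": [[0.0, 0.0], [1.0, 1.0]],
    "layer_b": [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
}
weights = {"layer_a": 0.5, "layer_b": 0.5}  # equal weights, an assumption

query = {"layer_a": [[0.1, 0.0], [0.9, 1.0]], "layer_b": [[0.9, 1.0, 1.1]]}
other = {"layer_a": [[0.0, 0.1], [0.1, 0.1]], "layer_b": [[0.0, 0.1, 0.0]]}

score_same = late_fusion_score(query, query, codebooks, weights)
score_other = late_fusion_score(query, other, codebooks, weights)
```

In this toy setup an image compared with itself scores higher than a comparison with the dissimilar image, because every layer's histogram matches exactly. In practice the codebooks would be learned (for example by k-means over training activations) and the fusion weights tuned, neither of which is specified here.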
Ying Li, Xiangwei Kong, Haiyan Fu, Qi Tian
Nuha M. Khassaf, Shaimaa H. Shaker
Jun Xiang, Ning Zhang, Ruru Pan, Weidong Gao
Xiaoqiang Lu, Yaxiong Chen, Xuelong Li