To address the problems of data sparseness and limited system scalability in traditional collaborative filtering algorithms, this paper proposes a new collaborative filtering algorithm. The algorithm first builds the item rating matrix from users’ ratings of different items. Second, it uses a naive Bayesian classifier to classify the items, and then searches for each item’s nearest-neighbor set within the same class using the modified cosine similarity computation method. In parallel, it uses the MapReduce parallel computation framework on Hadoop to implement distributed data processing. Finally, it produces a rating prediction list and makes recommendations. Experimental results show that the algorithm not only effectively addresses the low prediction accuracy caused by data sparseness, but also improves recommendation accuracy compared with collaborative filtering algorithms based on user classification and on item classification. Parallel computation also greatly improves the efficiency and scalability of the system.
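The neighbor search and rating prediction steps described in the abstract can be illustrated with a minimal single-process sketch. It assumes that "modified cosine similarity" refers to the standard adjusted cosine similarity (each rating is centered on the rating user's mean), that the class labels in `item_class` stand in for the output of the naive Bayesian classifier, and that predictions are a similarity-weighted average over same-class neighbors. The function names, the toy data, and the parameter `k` are hypothetical, and the MapReduce/Hadoop distribution is not modeled here.

```python
from math import sqrt


def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    ratings maps user -> {item: rating}. Each rating is centered on the
    mean of all ratings given by that user (assumed definition).
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[i] - mean
            dj = user_ratings[j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0  # no co-rating users, or no variation to compare
    return num / (sqrt(den_i) * sqrt(den_j))


def predict(ratings, item_class, user, target, k=2):
    """Predict a rating from the k most similar items in the target's class."""
    user_ratings = ratings[user]
    candidates = [
        (adjusted_cosine(ratings, target, j), j)
        for j in user_ratings
        if j != target and item_class[j] == item_class[target]
    ]
    # Keep only positively correlated neighbors, most similar first.
    neighbours = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbours:
        # Fallback (an assumption of this sketch): the user's mean rating.
        return sum(user_ratings.values()) / len(user_ratings)
    weighted = sum(sim * user_ratings[j] for sim, j in neighbours)
    return weighted / sum(sim for sim, _ in neighbours)


# Toy data: four users, three items; classes are assumed classifier output.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"a": 1, "b": 2, "c": 5},
    "u4": {"b": 4, "c": 1},
}
item_class = {"a": "x", "b": "x", "c": "y"}

predicted = predict(ratings, item_class, "u4", "a")
```

Restricting the candidate neighbors to one class shrinks the number of similarity computations per item, which is the part of the pipeline that a MapReduce job would presumably partition across nodes.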
Weiming Hu, Fangqun Yang, Zishuo Feng
Zhilin Zhao, Chang-Dong Wang, Yuanyu Wan, Zi-Wei Huang, Jianhuang Lai
Zilei Sun, Nianlong Luo, Wei Kuang