A high ranking in search engine results can bring great benefits to a website. However, some websites use various techniques to deceive search engines and boost their ranking, which degrades the quality of the results returned to the user. TrustRank is a recent algorithm for combating web spam, based on the idea that good sites seldom point to spam sites. However, we find that many spam sites can obtain a large number of inlinks from good sites by using underhanded tricks. We propose to take the variance of the link structure into consideration and to judge the ranking scores of websites in combination with it. Experiments show that this method can filter out web spam effectively.
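To make the TrustRank idea concrete, the following is a minimal sketch of the baseline trust propagation only, not of the variance-based method proposed here, whose details the abstract does not give. It assumes the usual biased-PageRank formulation, in which trust starts at a hand-picked seed set of good sites and flows along outlinks. The function name `trustrank`, the damping value 0.85, the iteration count, and the toy graphs are all illustrative assumptions.

```python
def trustrank(out_links, seeds, alpha=0.85, iterations=20):
    """Propagate trust from a seed set of good sites along outlinks.

    out_links: dict mapping each site to the list of sites it links to.
    seeds: set of sites assumed (by hand inspection) to be good.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    # Static trust distribution: uniform over the seeds, zero elsewhere.
    d = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    trust = dict(d)
    for _ in range(iterations):
        # Each round keeps a (1 - alpha) share of the static seed trust...
        nxt = {n: (1 - alpha) * d[n] for n in nodes}
        # ...and splits the rest of each site's trust evenly over its outlinks.
        for src, targets in out_links.items():
            if not targets:
                continue
            share = alpha * trust[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        trust = nxt
    return trust


# Hypothetical toy graphs: in `clean`, no good site links to spam;
# in `leaky`, good2 has been tricked into linking to spam1.
clean = {"good1": ["good2"], "good2": ["good1"],
         "spam1": ["spam2"], "spam2": ["spam1", "good1"]}
leaky = dict(clean, good2=["good1", "spam1"])

clean_scores = trustrank(clean, seeds={"good1"})
leaky_scores = trustrank(leaky, seeds={"good1"})
```

In the `clean` graph the spam sites receive no trust at all, because no good site points to them. In the `leaky` graph a single link from a good site is enough to pass trust to `spam1` and, through it, to `spam2`, which is the weakness the abstract describes.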
Zoltán Gyöngyi, Héctor García-Molina, Jan Pedersen
Joyce Jiyoung Whang, Yeonsung Jung, Seonggoo Kang, Dongho Yoo, Inderjit S. Dhillon
Guanggang Geng, Chunheng Wang, Qiu-Dan Li, Zhu Yuan-ping