Haifeng LiYuejin ZhangNing Zhang
Probabilistic frequent itemset mining is to find the itemsets with support larger than the threshold with a given probabilistic confidence within an uncertain database. Nevertheless, when the threshold is smaller, the mining results will be massive, which are not easy to understand by the users. In this paper, we focus on this problem and propose a method to achieve the top-k probabilistic frequent itemsets, which, to our best knowledge, has never been addressed before. A scoring function is defined to evaluate the level of itemsets. We introduce a compacted data structure, named TopKPFITree, to maintain the mining results and some other information. Furthermore, an efficient algorithm TopKPFIM is proposed to build the TopKPFITree and get the results. Our experimental results over uncertain datasets show that our algorithm significantly outperform the Naive algorithm.
Haifeng LiYue WangNing ZhangYuejin Zhang
Tao YouTingfeng LiChenglie DuXiang ZhaiNan Jiang
Ming-Yen LinCheng-Tai FuSue-Chen Hsueh