Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases

Haifeng Li; Yuejin Zhang; Ning Zhang

doi:10.1016/j.procs.2017.11.482

JOURNAL ARTICLE

Year: 2017; Journal: Procedia Computer Science; Vol: 122; Pages: 1124-1132; Publisher: Elsevier BV

Abstract

Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support is larger than a threshold with a given probabilistic confidence. When the threshold is small, however, the mining results become massive and hard for users to understand. In this paper, we focus on this problem and propose a method to obtain the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has never been addressed before. A scoring function is defined to evaluate the level of itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and some other information. Furthermore, an efficient algorithm, TopKPFIM, is proposed to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the Naive algorithm.
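To make the problem statement concrete, the following is a minimal sketch of the general idea, not the paper's TopKPFIM algorithm, its TopKPFITree structure, or its Naive baseline, none of which are specified in the abstract. It assumes an attribute-level uncertainty model in which each transaction maps items to independent existence probabilities, treats "frequent" as support of at least `min_sup`, and ranks itemsets by their frequent probability as a stand-in for the paper's scoring function. The names `frequent_probability`, `brute_force_top_k`, and `max_size` are hypothetical.

```python
from itertools import combinations
from math import prod


def frequent_probability(db, itemset, min_sup):
    """Return P(support(itemset) >= min_sup), assuming independent items
    and independent transactions (an assumption of this sketch)."""
    # dist[j] = probability that exactly j transactions seen so far contain
    # the itemset; all counts of min_sup or more are merged into dist[min_sup].
    dist = [1.0] + [0.0] * min_sup
    for transaction in db:
        # Probability that every item of the itemset exists in this transaction.
        p = prod(transaction.get(item, 0.0) for item in itemset)
        new = [0.0] * (min_sup + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1.0 - p)
            new[min(j + 1, min_sup)] += mass * p
        dist = new
    return dist[min_sup]


def brute_force_top_k(db, k, min_sup, max_size=2):
    """Enumerate all itemsets up to max_size and keep the k with the highest
    frequent probability. Exhaustive enumeration, for illustration only."""
    items = sorted({item for transaction in db for item in transaction})
    scored = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            scored.append((frequent_probability(db, itemset, min_sup), itemset))
    # Highest probability first; ties broken by itemset for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical toy database: item -> existence probability per transaction.
db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.8, "c": 0.4},
    {"a": 0.5, "b": 1.0},
]
for probability, itemset in brute_force_top_k(db, k=2, min_sup=2):
    print(itemset, round(probability, 3))
```

Asking for the k best itemsets instead of everything above a confidence cutoff bounds the size of the result regardless of how small the support threshold is, which is the motivation the abstract gives. The exhaustive enumeration here grows exponentially with `max_size`; avoiding that cost is presumably what a dedicated structure such as TopKPFITree is for.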

Keywords:

Probabilistic logic; Computer science; Data mining; Probabilistic database; Uncertain data; Focus (optics); Association rule learning; Function (biology); Database; Artificial intelligence; Relational database; Database theory

Metrics

FWCI (Field Weighted Citation Impact): 1.12

Citation Normalized Percentile: 0.84

Topics

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Rough Sets and Fuzzy Logic

Physical Sciences → Computer Science → Computational Theory and Mathematics

Related Documents

Finding Top-k Fuzzy Frequent Itemsets from Databases

Discovering probabilistic weighted frequent itemsets over uncertain data

Mining probabilistic generalized frequent itemsets in uncertain databases

Mining probabilistic frequent closed itemsets in uncertain databases

Incremental update on probabilistic frequent itemsets in uncertain databases