JOURNAL ARTICLE

Probabilistic maximal frequent itemset mining methods over uncertain databases

Haifeng Li, Mo Hai, Ning Zhang, Jianming Zhu, Yue Wang, Huaihu Cao

Year: 2019   Journal: Intelligent Data Analysis   Vol: 23 (6)   Pages: 1219-1241   Publisher: IOS Press

Abstract

Uncertain data are data accompanied by probabilities, which makes frequent itemset mining more challenging. This paper focuses on the problem of mining probabilistic maximal frequent itemsets. We redefine the concept of a probabilistic maximal frequent itemset to be consistent with the traditional definition and to provide a better view of how to devise pruning strategies. A tree-based index called the probabilistic maximal frequent itemset tree is constructed to maintain the probabilistic frequent itemsets. We propose a depth-first probabilistic maximal frequent itemset mining algorithm that generates the exact results bottom-up, in which support and expected support are used to estimate the range of the probabilistic support, enabling the frequency of an itemset to be inferred with less runtime and memory usage. Superset pruning is also employed to further reduce mining cost. Nevertheless, certain probabilistic supports have to be computed when the minimum support is low, which may greatly increase the mining cost. This problem is addressed in our approximate probabilistic maximal frequent itemset mining method, which uses the expected support to directly compute the probabilistic support. Theoretical analysis and experimental studies demonstrate that our proposed algorithms achieve high accuracy, require less computation time and memory, and significantly outperform the state-of-the-art TODIS-MAX algorithm [20].
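
The abstract does not state how expected support and probabilistic support are defined, so the following is only a minimal sketch under assumptions common in the uncertain-data mining literature: each transaction independently contains the itemset with a known probability, expected support is the sum of those probabilities, and probabilistic support at threshold tau is the largest s with Pr(support >= s) >= tau. The function names and the `tau` parameter are hypothetical and are not taken from the paper.

```python
from typing import List


def expected_support(probs: List[float]) -> float:
    """Sum of per-transaction probabilities that the itemset is present."""
    return sum(probs)


def support_distribution(probs: List[float]) -> List[float]:
    """Return dist where dist[k] = Pr(support == k).

    Assumes transactions are independent, so the support follows a
    Poisson-binomial distribution, built here by dynamic programming.
    """
    dist = [1.0]  # with zero transactions, support is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # transaction lacks the itemset
            nxt[k + 1] += mass * p      # transaction contains the itemset
        dist = nxt
    return dist


def probabilistic_support(probs: List[float], tau: float) -> int:
    """Largest s with Pr(support >= s) >= tau (assumed definition)."""
    dist = support_distribution(probs)
    tail = 0.0
    # Accumulate the tail probability from the largest support downwards;
    # the first s whose tail reaches tau is the largest such s.
    for s in range(len(dist) - 1, -1, -1):
        tail += dist[s]
        if tail >= tau:
            return s
    return 0  # fallback if rounding keeps the full sum just below tau


if __name__ == "__main__":
    probs = [0.5, 0.5]  # hypothetical existence probabilities
    print(expected_support(probs))
    print(probabilistic_support(probs, tau=0.7))
```

The exact computation is quadratic in the number of transactions per itemset, which illustrates why bounding the probabilistic support with cheaper quantities, as the abstract describes, can save runtime.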

Keywords:
Probabilistic logic, Pruning, Probabilistic database, Data mining, Computer science, Uncertain data, Tree (set theory), Probabilistic analysis of algorithms, Mathematics, Artificial intelligence, Database model

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.36
Refs: 9
Citation Normalized Percentile: 0.70

Topics

Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
Rough Sets and Fuzzy Logic
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Data Management and Algorithms
Physical Sciences →  Computer Science →  Signal Processing