JOURNAL ARTICLE

Mining Approximate Frequent Itemsets from Noisy Data

Abstract

Frequent itemset mining is a popular and important first step in analyzing data sets across a broad range of applications. The traditional, "exact" approach to finding frequent itemsets requires that every item in the itemset occur in each supporting transaction. However, real data is typically subject to noise, and in the presence of such noise, traditional itemset mining may fail to detect relevant itemsets, particularly large itemsets, which are more vulnerable to noise. In this paper we propose approximate frequent itemsets (AFI) as a noise-tolerant itemset model. In addition to the usual requirement for sufficiently many supporting transactions, the AFI model places constraints on the fraction of errors permitted in each item column and on the fraction of errors permitted in each supporting transaction. Taken together, these constraints winnow out the approximate itemsets that exhibit systematic errors. In the context of a simple noise model, we demonstrate that AFI is better at recovering underlying data patterns, and identifies fewer spurious patterns, than either the exact frequent itemset approach or the existing error-tolerant itemset approach of Yang et al.
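
The two error constraints described in the abstract can be illustrated with a small check. This is only a minimal sketch, not the paper's mining algorithm: it tests one candidate pair of transactions and items against a binary matrix. The function names (`is_afi`, `is_exact`) and the parameter names (`min_support`, `eps_row`, `eps_col`) are assumptions introduced here for illustration, and the two toy matrices are made up.

```python
from typing import Sequence


def is_afi(
    data: Sequence[Sequence[int]],
    rows: Sequence[int],
    items: Sequence[int],
    min_support: int,
    eps_row: float,
    eps_col: float,
) -> bool:
    """Check one candidate (rows, items) pair against AFI-style constraints.

    data        -- binary transaction matrix, data[r][i] == 1 if row r has item i
    min_support -- minimum number of supporting transactions (hypothetical name)
    eps_row     -- max fraction of missing items per supporting transaction
    eps_col     -- max fraction of missing transactions per item column
    """
    if not items or len(rows) < min_support:
        return False
    # Row constraint: each supporting transaction may miss only a bounded
    # fraction of the items in the itemset.
    for r in rows:
        missing = sum(1 for i in items if data[r][i] == 0)
        if missing > eps_row * len(items):
            return False
    # Column constraint: each item may be absent from only a bounded
    # fraction of the supporting transactions.
    for i in items:
        missing = sum(1 for r in rows if data[r][i] == 0)
        if missing > eps_col * len(rows):
            return False
    return True


def is_exact(data, rows, items, min_support) -> bool:
    """Exact support is the special case with no errors allowed."""
    return is_afi(data, rows, items, min_support, 0.0, 0.0)


# Toy data: errors scattered evenly, one per row and one per column.
scattered = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
# Toy data: errors concentrated in one column (a systematic error).
systematic = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
all_rows = [0, 1, 2, 3]
all_items = [0, 1, 2, 3]

print(is_exact(scattered, all_rows, all_items, 3))
print(is_afi(scattered, all_rows, all_items, 3, 0.25, 0.25))
print(is_afi(systematic, all_rows, all_items, 3, 0.25, 0.25))
```

With a tolerance of 0.25 on both rows and columns, the scattered pattern is accepted although exact matching rejects it, while the pattern whose errors pile up in a single column is rejected by the column constraint. Relaxing `eps_col` to 1.0 would accept it, which is the kind of systematic error the combined constraints are meant to filter out.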

Keywords:
Spurious relationship; Data mining; Computer science; Noise (video); Context (archaeology); Database transaction; Fraction (chemistry); Transaction data; Algorithm; Database; Artificial intelligence; Machine learning

Metrics

Cited By: 26
FWCI (Field Weighted Citation Impact): 9.02
Refs: 11
Citation Normalized Percentile: 0.97

Topics

Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
Rough Sets and Fuzzy Logic
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Data Management and Algorithms
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Mining Approximate Frequent Itemsets over Data Streams

Na Su, Zhe Hui Wu, Ji Min Liu, Tai An Liu, Xin An, Chang Qing Yan

Journal: Applied Mechanics and Materials Year: 2014 Vol: 685 Pages: 536-539

JOURNAL ARTICLE

A new approximate method for mining frequent itemsets from big data

Timur Valiullin, Zhexue Huang, Chenghao Wei, Jianfei Yin, Dingming Wu, Iuliia Egorova

Journal: Computer Science and Information Systems Year: 2020 Vol: 18 (3) Pages: 641-656

BOOK-CHAPTER

An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams

Jia-Ling Koh, Shu-Ning Shin

Lecture Notes in Computer Science Year: 2006 Pages: 352-362

BOOK-CHAPTER

Mining Frequent Itemsets from Uncertain Data

Chun-kit Chui, Ben Kao, Edward Hung

Lecture Notes in Computer Science Year: 2007 Pages: 47-58