JOURNAL ARTICLE

Knowledge discovery interestingness measures based on unexpectedness

Kleanthis‐Nikolaos Kontonasios, Eirini Spyropoulou, Tijl De Bie

Year: 2012   Journal: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery   Vol: 2 (5)   Pages: 386-399   Publisher: Wiley

Abstract

Knowledge discovery methods often discover a large number of patterns. Although this can be considered of interest, it certainly presents considerable challenges too. Indeed, this set of patterns often contains lots of uninteresting patterns that risk overwhelming the data miner. In addition, a single interesting pattern can be discovered in a multitude of tiny variations that for all practical purposes are redundant. These issues are referred to as the pattern explosion problem. They lie at the basis of much recent research attempting to quantify interestingness and redundancy between patterns, with the purpose of filtering down a large pattern set to an interesting and compact subset. Many diverse approaches to interestingness and corresponding interestingness measures (IMs) have been proposed in the literature. Some of them, named objective IMs, define interestingness only based on objective criteria of the pattern and data at hand. Subjective IMs additionally depend on the user's prior knowledge about the dataset. Formalizing unexpectedness is probably the most common approach for defining subjective IMs, where a pattern is deemed unexpected if it contradicts the user's expectations about the dataset. Such subjective IMs based on unexpectedness form the focus of this paper. We categorize measures based on unexpectedness into two major subgroups, namely, syntactical and probabilistic approaches. Based on this distinction, we survey different methods for assessing the unexpectedness of patterns with a special focus on frequent itemsets, tiles, association rules, and classification rules.

© 2012 Wiley Periodicals, Inc.

This article is categorized under: Algorithmic Development > Association Rules; Algorithmic Development > Statistics
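As a rough illustration of the probabilistic view of unexpectedness mentioned in the abstract, the sketch below scores an itemset by comparing its observed support with the support expected if its items occurred independently. This is not a measure taken from the article: the independence assumption stands in for the user's prior beliefs, and the function names (`support`, `expected_support`, `unexpectedness`) and the toy transactions are hypothetical, chosen only for this example.

```python
from math import prod


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def expected_support(transactions, itemset):
    """Support expected if the items occurred independently.

    Independence is an assumed stand-in for the user's prior beliefs.
    """
    return prod(support(transactions, [item]) for item in itemset)


def unexpectedness(transactions, itemset):
    """Ratio of observed to expected support.

    Values far from 1 (in either direction) deviate from the assumed prior.
    """
    expected = expected_support(transactions, itemset)
    if expected == 0:
        # Some item never occurs, so the itemset never occurs either.
        return 0.0
    return support(transactions, itemset) / expected


# Toy transactions, invented purely for illustration.
transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread"},
    {"milk"},
]

score = unexpectedness(transactions, ["bread", "butter"])
```

Under these assumptions, a ratio above 1 means the items co-occur more often than the independence prior predicts, and a ratio below 1 means they co-occur less often. A richer background model could replace `expected_support` without changing the rest of the sketch.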

Keywords:
Computer science; Redundancy (engineering); Categorization; Association rule learning; Probabilistic logic; Set (abstract data type); Data mining; Knowledge extraction; Focus (optics); Information retrieval; Data science; Artificial intelligence; Machine learning

Metrics

Cited By: 28
FWCI (Field Weighted Citation Impact): 12.17
Refs: 61
Citation Normalized Percentile: 0.98

Topics

Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
Rough Sets and Fuzzy Logic
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Data Management and Algorithms
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Unexpectedness as a measure of interestingness in knowledge discovery

Balaji Padmanabhan, Alexander Tuzhilin

Journal: Decision Support Systems   Year: 1999   Vol: 27 (3)   Pages: 303-318
JOURNAL ARTICLE

A survey of interestingness measures for knowledge discovery

Kenneth McGarry

Journal: The Knowledge Engineering Review   Year: 2005   Vol: 20 (1)   Pages: 39-61
JOURNAL ARTICLE

Development Of Subjective Measures Of Interestingness: From Unexpectedness To Shocking

Eiad Yafi, M. A. Alam, Ranjit Biswas

Journal: Zenodo (CERN European Organization for Nuclear Research)   Year: 2007