Finding a set of high-frequency queries for high-frequency-query-based filter for similarity join

Kamolwan Kunanusont; Jaruloj Chongstitvatana

doi:10.1109/ecticon.2015.7206993

JOURNAL ARTICLE

Year: 2015 Pages: 1-6

Abstract

Similarity search and similarity join are two important operations in text databases. The filter-and-verify framework aims to reduce comparison time by filtering out some pairs of texts before actually comparing the remaining pairs. Many filter methods do not take into account the repetition of query words over time. A query that is frequently repeated over a time period is called a high-frequency query. The high-frequency-queries-based filter is a filter method that deals with this type of query, and its performance depends on the choice of high-frequency queries. This paper proposes two methods for finding the set of high-frequency queries from a given query set: one uses DBSCAN, and the other uses DBSCAN with a merging strategy, called DBSM. The experimental results show that both DBSCAN and DBSM can find high-frequency queries, but the set of high-frequency queries obtained from DBSM gives the high-frequency-queries-based filter higher pruning power.
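As a rough illustration of the DBSCAN-based idea in the abstract, the sketch below clusters the entries of a query log and treats each dense cluster as one high-frequency query. It is not the authors' implementation: the abstract does not state the distance function, the parameter values, or how DBSM merges clusters, so the use of edit distance, the defaults `eps=1` and `min_pts=3`, and the names `edit_distance`, `dbscan`, and `high_frequency_queries` are all assumptions made for this example. Only plain DBSCAN is shown; the merging step of DBSM is omitted.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def dbscan(points, eps, min_pts, dist):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Brute-force region query; a point is its own neighbor.
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
    return labels


def high_frequency_queries(query_log, eps=1, min_pts=3):
    """Assumed reading: each dense cluster of log entries is one
    high-frequency query, represented by its most common string."""
    labels = dbscan(query_log, eps, min_pts, edit_distance)
    clusters = {}
    for query, label in zip(query_log, labels):
        if label != -1:
            clusters.setdefault(label, Counter())[query] += 1
    return [counts.most_common(1)[0][0] for _, counts in sorted(clusters.items())]
```

Each occurrence in the log counts as a separate point, so a query repeated at least `min_pts` times (allowing near-duplicates within `eps` edits) forms a cluster, while rarely issued queries are labeled noise. The brute-force neighbor search is quadratic in the log size and is only meant for small illustrative inputs.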

Keywords:

Computer science; Filter (signal processing); Set (abstract data type); Similarity (geometry); Query optimization; Information retrieval; Pruning; Result set; Data mining; Artificial intelligence

Metrics

FWCI (Field Weighted Citation Impact): 0.27

Citation Normalized Percentile: 0.54

Topics

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Data Quality and Management

Social Sciences → Decision Sciences → Management Science and Operations Research

Related Documents

FINDING SETS OF HIGH-FREQUENCY QUERIES FOR HIGH-FREQUENCY-QUERY-BASED FILTER FOR SIMILARITY JOIN

Refining high-frequency-queries-based filter for similarity join

An index structure for similarity join based on high-frequency queries

Cluster Analysis to Find Sets of High-frequency Queries for Filtering in Similarity Join

Recommending Join Queries Based on Path Frequency