An Improved MapReduce Design of Kmeans with Iteration Reducing for Clustering Stock Exchange Very Large Datasets

Oussama Lachiheb; Mohamed Salah Gouider; Lamjed Ben Saïd

doi:10.1109/skg.2015.24

JOURNAL ARTICLE

Year: 2015

Abstract

This paper addresses the clustering of very large datasets, one of the most challenging tasks in data mining and processing. We propose an improved MapReduce design of the Kmeans algorithm with an iteration-reducing method. Experiments show that this method reduces the number of iterations and the execution time of the Kmeans algorithm while keeping 80% of the clustering accuracy. Combining the MapReduce programming paradigm with iteration-reducing techniques makes it possible to process the huge volume of data generated by daily stock exchange transactions, which supports better decision making by analysts.
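The abstract does not say how the iteration-reducing method works, so the following is only a minimal sketch of the general idea under stated assumptions. It simulates one map step (assign each point to its nearest centroid) and one reduce step (average the points per centroid) in memory, and it stops early once the largest centroid shift falls below a tolerance. The early-stopping rule, the `tol` parameter, and all function names are hypothetical stand-ins and are not taken from the paper; a real deployment would run the two phases as Hadoop jobs rather than Python generators.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        yield idx, p


def reduce_phase(pairs, centroids):
    """Reduce: replace each centroid with the mean of its assigned points."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # a centroid with no points stays put
    for idx, members in groups.items():
        dim = len(members[0])
        new_centroids[idx] = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
    return new_centroids


def kmeans_mapreduce(points, init_centroids, tol=1e-3, max_iter=100):
    """Iterate map/reduce rounds; stop when no centroid moves more than tol.

    ASSUMPTION: the tolerance-based stop is an illustrative way to cut
    iterations, not the method proposed in the paper.
    Returns (centroids, number of rounds executed).
    """
    centroids = list(init_centroids)
    rounds = 0
    for rounds in range(1, max_iter + 1):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(sq_dist(c, u) ** 0.5 for c, u in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, rounds


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans_mapreduce(data, [(0, 0), (10, 10)], tol=1e-3))
    print(kmeans_mapreduce(data, [(0, 0), (10, 10)], tol=1.0))
```

A looser tolerance ends the loop after fewer rounds at the cost of a less refined result, which is the kind of trade-off between iteration count and accuracy that the abstract reports.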

Keywords:

Computer science; Cluster analysis; k-means clustering; Data mining; Volume (thermodynamics); Machine learning

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.20

Topics

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Advanced Clustering Algorithms Research

Physical Sciences → Computer Science → Artificial Intelligence

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Related Documents

An improved mapReduce design of kmeans for clustering very large datasets

Clustering very large multi-dimensional datasets with MapReduce

Clustering very large high-dimentional datasets based entropy with MapReduce

An Elastic Approximate Similarity Search in Very Large Datasets with MapReduce

Fast K-Means Clustering for Very Large Datasets Based on MapReduce Combined with a New Cutting Method