JOURNAL ARTICLE

Parallel algorithms for mining large-scale rich-media data

Abstract

The number of online photos and videos is now in the tens of billions. To organize, index, and retrieve rich-media data at this scale, a system must employ scalable data management and mining algorithms. The research community needs to address large-scale problems rather than problems on small datasets that do not reflect real-life scenarios. This tutorial introduces key challenges in large-scale rich-media data mining and presents parallel algorithms for tackling them. We present our parallel implementations of Spectral Clustering (PSC), FP-Growth (PFP), Latent Dirichlet Allocation (PLDA), and Support Vector Machines (PSVM).

Keywords:
Computer science, Latent Dirichlet allocation, Implementation, Scalability, Scale (ratio), Cluster analysis, Data mining, Key (lock), Spectral clustering, Big data, Support vector machine, Data science, Algorithm, Machine learning, Topic model, Artificial intelligence, Database

Metrics

Cited By: 41
FWCI (Field Weighted Citation Impact): 0.93
Refs: 11
Citation Normalized Percentile: 0.80

Topics

Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
