Text Mining using Non-Negative Matrix Factorizations

V. Paúl Pauca; Farial Shahnaz; Michael W. Berry; Robert J. Plemmons

doi:10.1137/1.9781611972740.45

Year: 2004

Abstract

Proceedings of the 2004 SIAM International Conference on Data Mining (SDM), pp. 452 - 456
Chapter DOI: https://doi.org/10.1137/1.9781611972740.45

This study involves a methodology for the automatic identification of semantic features and document clusters in a heterogeneous text collection. The methodology is based upon encoding the data using low rank non-negative matrix factorization algorithms to preserve natural data non-negativity and thus avoid subtractive basis vector and encoding interactions present in techniques such as principal component analysis. Some existing non-negative matrix factorization techniques are reviewed and some new ones are proposed. Numerical experiments are reported on the use of a hybrid NMF algorithm to produce a parts-based approximation of a sparse term-by-document matrix. The resulting basis vectors and matrix projection can be used to identify underlying semantic features (topics) and document clusters of the corresponding text collection.

Published: 2004
ISBN: 978-0-89871-568-2
eISBN: 978-1-61197-274-0
https://doi.org/10.1137/1.9781611972740
Book Series Name: Proceedings
Book Code: PR117
Book Pages: xiv + 537
Key words: text mining, non-negative matrix factorization, clustering, dimension reduction, semantic feature identification
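
As a rough illustration of the idea in the abstract, the sketch below factors a small term-by-document matrix A into non-negative factors W (basis vectors over terms) and H (document encodings), then reads topics from W and document clusters from H. It uses the plain multiplicative update rules of Lee and Seung, not the hybrid NMF algorithm the chapter reports on. The toy matrix, the term list, and the function name nmf_multiplicative are all hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf_multiplicative(A, k, iters=500, seed=0, eps=1e-9):
    """Approximate A (terms x documents) by W @ H with W, H >= 0.

    Plain multiplicative updates for the Frobenius-norm objective;
    an illustrative sketch, not the chapter's hybrid algorithm.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps  # term-by-topic basis vectors
    H = rng.random((k, n)) + eps  # topic-by-document encodings
    for _ in range(iters):
        # Each update multiplies by a non-negative ratio, so
        # W and H stay non-negative throughout.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy collection: rows are terms, columns are documents.
terms = ["matrix", "factor", "rank", "goal", "team", "match"]
A = np.array([
    [2, 1, 3, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 3, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)

W, H = nmf_multiplicative(A, k=2)

# Topics: the highest-weighted terms in each basis vector (column of W).
topics = [[terms[i] for i in np.argsort(-W[:, j])[:3]] for j in range(2)]

# Clusters: assign each document to its largest encoding coefficient.
clusters = H.argmax(axis=0)

print(topics)
print(clusters)
```

Because both factors are non-negative, each document is approximated as an additive combination of basis vectors, which is what makes the columns of W readable as topics. Which column ends up holding which topic depends on the random initialization.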

Keywords:

Computer science, Matrix decomposition, Non-negative matrix factorization, Matrix (chemical analysis), Basis (linear algebra), Cluster analysis, Artificial intelligence, Data mining, Mathematics

Metrics

Cited By: 308

FWCI (Field Weighted Citation Impact): 7.39

Citation Normalized Percentile: 0.97

Topics

Rough Sets and Fuzzy Logic

Physical Sciences → Computer Science → Computational Theory and Mathematics

Neural Networks and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Face and Expression Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Volume regularized non-negative matrix factorizations

Credit Risk Analysis Using Sparse Non-negative Matrix Factorizations

GPU-Accelerated Non-negative Matrix Factorization for Text Mining

Improving non-negative matrix factorizations through structured initialization

Robust Image Hashing Via Non-Negative Matrix Factorizations