Dimensionality reduction is an important technique in machine learning and data mining that speeds up the processing of high-dimensional data. An efficient dimensionality reduction method finds a low-dimensional feature subset that retains the most relevant information. Neural-network-based dimensionality reduction methods have been applied to many kinds of data, especially computer vision data. In this paper, we focus on text data, which is highly sparse and high-dimensional, and reduce its dimensionality using a variational auto-encoder. The performance of the variational auto-encoder in dimensionality reduction is evaluated through comparison experiments. First, unstructured text data is converted into computer-processable vectors using term frequency-inverse document frequency (TF-IDF). Then a variational auto-encoder is used to reduce the dimensionality of these vectors. Finally, experiments verify the efficiency of the variational auto-encoder by comparison with seven commonly used dimensionality reduction methods.
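The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed weighting tf × log(N/df), the toy documents, and the function names `tfidf_vectors`, `reparameterize`, and `kl_to_standard_normal` are all assumptions made for illustration. The encoder and decoder networks are omitted; only the two ingredients specific to a variational auto-encoder are shown, namely sampling the low-dimensional code and the KL regularization term, both for a diagonal Gaussian posterior.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, rows), where rows[i][j] is the TF-IDF weight of vocab[j] in docs[i].

    Assumed variant: tf = count / document length, idf = log(N / df), no smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        rows.append([(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


def reparameterize(mu, logvar, rng):
    """Sample the latent code z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence between N(mu, diag(exp(logvar))) and the standard normal prior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


# Toy corpus (hypothetical); each row of X is one sparse, high-dimensional document vector.
docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, X = tfidf_vectors(docs)

# In a full model, an encoder network would map each row of X to (mu, logvar).
# Here they are fixed placeholder values for a 2-dimensional latent space.
mu, logvar = [0.0, 0.0], [0.0, 0.0]
z = reparameterize(mu, logvar, random.Random(0))
```

In this sketch, a term that occurs in every document (here "the") receives weight zero, which is one reason TF-IDF vectors are sparse. After training, the mean vector `mu` produced by the encoder would typically serve as the reduced representation of a document. Library implementations of TF-IDF often use smoothed or normalized variants, so their weights may differ from those computed here.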
Genggeng Liu, Lin Xie, Chi‐Hua Chen
Haojin Hu, Mengfan Liao, Weiming Mao, Wei Liu, Chao Zhang, Yanmei Jing