Multi-label text classification (MTC) is a natural extension of traditional text classification (TC) in which each document can be assigned several labels drawn from a possibly large label set. The high dimensionality of the label space makes MTC challenging. Several approaches have been proposed to ease the classification process; one of them is the problem transformation (PT) method, which transforms multi-label data into single-label data suitable for conventional classifiers. This paper presents a detailed study of using the supervised approach to address the MTC problem for Arabic text. The scalability of this approach is also considered in our experiments. The MEKA system is used to convert the multi-label data into single-label data using different PT methods: LC, BR, and RT. Then, different classifiers commonly used for TC, such as SVM, NB, KNN, and Decision Tree, are applied to each resulting dataset. The results show that SVM applied to the LC dataset performs best, with an ML-accuracy of 71%.
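To make the problem transformation idea concrete, the following minimal Python sketch shows two of the transformations named above as they are commonly understood: BR (binary relevance, one binary target per label) and LC (label combination, where each distinct label set becomes a single class). It is an illustration under stated assumptions, not MEKA's implementation; the function names, the toy label sets, and the `+` separator are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def binary_relevance(label_sets: Sequence[Set[str]],
                     labels: Sequence[str]) -> Dict[str, List[int]]:
    """BR: build one binary (0/1) target vector per label."""
    return {
        label: [int(label in doc_labels) for doc_labels in label_sets]
        for label in labels
    }


def label_combination(label_sets: Sequence[Set[str]]) -> List[str]:
    """LC: map each document's whole label set to one composite class name."""
    # Sorting makes the class name independent of set iteration order.
    return ["+".join(sorted(doc_labels)) for doc_labels in label_sets]


# Hypothetical label sets for four documents (not taken from the paper's data).
docs_labels = [
    {"sports"},
    {"politics", "economy"},
    {"economy"},
    {"politics", "economy"},
]
labels = ["economy", "politics", "sports"]

br_targets = binary_relevance(docs_labels, labels)
lc_targets = label_combination(docs_labels)

print(br_targets)
print(lc_targets)
```

Under BR, a separate single-label (binary) classifier would be trained for each entry of `br_targets`; under LC, one multi-class classifier would be trained on `lc_targets`, at the cost of a class count that can grow with the number of distinct label sets.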