Large Language Model (LLM)-based transformers, such as Bidirectional Encoder Representations from Transformers (BERT), are currently receiving significant attention for various Natural Language Processing (NLP) tasks, such as machine translation, classification, and auto-completion, and they deliver substantial performance improvements for text classification. Multi-label classification typically requires more computation than binary or multi-class classification, and the computational demands grow further when large datasets are involved. Federated Learning (FL) offers a way to train models in a distributed manner while preserving data privacy. This paper proposes a novel approach for building a machine learning model that handles a sizeable textual dataset for multi-label classification by leveraging FL. FL is used to train a compound model constructed by extending BERT with a One-dimensional Convolutional Neural Network (1D CNN). First, the experiment was conducted on a single machine (Central) with the entire dataset. Then, the dataset was split into two groups and the same experiment was performed in a federated fashion (BERT-FL Fusion). The FL setup considerably reduced the computing power required to derive an equivalent global model while increasing accuracy, precision, and F1 score and reducing Hamming loss.
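To make the described compound architecture concrete, the following is a minimal sketch (not the authors' code) of a BERT encoder extended with a 1D CNN head for multi-label classification, using PyTorch and Hugging Face Transformers; the class name, layer sizes, and checkpoint name are illustrative assumptions.

```python
# Illustrative sketch, assuming PyTorch and Hugging Face Transformers.
# Names such as BertCNNClassifier, num_labels, and "bert-base-uncased" are
# assumptions, not taken from the paper.
import torch
import torch.nn as nn
from transformers import BertModel

class BertCNNClassifier(nn.Module):
    def __init__(self, num_labels, conv_channels=128, kernel_size=3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.bert.config.hidden_size
        # 1D convolution applied over the sequence of BERT token embeddings
        self.conv = nn.Conv1d(hidden, conv_channels, kernel_size, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.classifier = nn.Linear(conv_channels, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        seq = out.last_hidden_state.transpose(1, 2)  # (batch, hidden, seq_len)
        feat = self.pool(torch.relu(self.conv(seq))).squeeze(-1)
        return self.classifier(feat)                 # one logit per label

# For multi-label targets, training would typically use a per-label sigmoid:
# loss = nn.BCEWithLogitsLoss()(logits, multi_hot_targets)
```

In an FL setting such as the one described, each client would train a local copy of this model on its data split and a server would aggregate the resulting weights (e.g., by federated averaging) into the global model.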