JOURNAL ARTICLE

Multilingual Hate Speech Detection: A Semi-Supervised Generative Adversarial Approach

Khouloud Mnassri, Reza Farahbakhsh, Noël Crespi

Year: 2024 Journal: Entropy Vol: 26 (4) Pages: 344-344 Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Social media platforms have transcended cultural and linguistic boundaries, enabling online communication worldwide. However, the growing variety of languages in use has made online hate speech harder to detect. Despite the release of multiple Natural Language Processing (NLP) solutions implementing cutting-edge machine learning techniques, the scarcity of data, especially labeled data, remains a considerable obstacle, which motivates the use of semisupervised approaches together with Generative Artificial Intelligence (Generative AI) techniques. This paper introduces a multilingual semisupervised model that combines Generative Adversarial Networks (GANs) with Pretrained Language Models (PLMs), specifically mBERT and XLM-RoBERTa. Our approach proves effective at detecting hate speech and offensive language in three Indo-European languages (English, German, and Hindi) when using only 20% annotated data from the HASOC2019 dataset, and it performs strongly in each of the multilingual, zero-shot crosslingual, and monolingual training scenarios. Our mBERT-based semisupervised GAN model (SS-GAN-mBERT) outperformed the XLM-RoBERTa-based model (SS-GAN-XLM) and reached an average F1 score boost of 9.23% and an accuracy increase of 5.75% over the baseline semisupervised mBERT model.
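
The abstract does not spell out the training objective, so the following is only a minimal sketch of how a semisupervised GAN classifier of this kind is commonly set up, assuming the standard (k+1)-class discriminator formulation in which the extra class means "generated". The function names `split_labeled_unlabeled` and `discriminator_loss`, the seed, and the toy logits are hypothetical and are not taken from the paper; in the paper's setting the logits would come from a classifier on top of mBERT or XLM-RoBERTa representations, which is omitted here.

```python
import math
import random


def split_labeled_unlabeled(examples, labeled_fraction=0.2, seed=0):
    """Keep labels for a random fraction of (text, label) pairs; hide the rest.

    The 0.2 default mirrors the 20% annotated data mentioned in the abstract.
    The sampling scheme itself is an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    n_labeled = round(labeled_fraction * len(examples))
    labeled = [examples[i] for i in indices[:n_labeled]]
    unlabeled = [examples[i][0] for i in indices[n_labeled:]]  # text only
    return labeled, unlabeled


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(logits, label=None, is_fake=False):
    """Per-example loss for a (k+1)-class discriminator.

    `logits` has k real-class scores followed by one 'fake' score.
    - generated example: push probability mass onto the fake class
    - real example: push mass away from the fake class (unsupervised term)
    - real and labeled: add cross-entropy over the k real classes
    """
    eps = 1e-12
    p_fake = softmax(logits)[-1]
    if is_fake:
        return -math.log(p_fake + eps)
    loss = -math.log(1.0 - p_fake + eps)
    if label is not None:
        real_probs = softmax(logits[:-1])  # p(class | example is real)
        loss += -math.log(real_probs[label] + eps)
    return loss


# Toy usage: 10 examples, 2 real classes plus the fake class.
data = [("text %d" % i, i % 2) for i in range(10)]
labeled, unlabeled = split_labeled_unlabeled(data)
labeled_loss = discriminator_loss([2.0, -1.0, -3.0], label=0)
unlabeled_loss = discriminator_loss([2.0, -1.0, -3.0])
fake_loss = discriminator_loss([2.0, -1.0, -3.0], is_fake=True)
```

In this sketch, unlabeled examples still contribute a training signal through the real-versus-fake term, which is the usual motivation for pairing a GAN with a small labeled set.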

Keywords:
Computer science; Generative grammar; Artificial intelligence; Natural language processing; Obstacle; Baseline (sea); Generative adversarial network; Scarcity; German; Adversarial system; Language model; Offensive; Social media; Machine learning; Linguistics; Deep learning; World Wide Web

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 6.39
Refs: 43
Citation Normalized Percentile: 0.94

Topics

Hate Speech and Cyberbullying Detection
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Multilingual Hate Speech Detection Using Semi-supervised Generative Adversarial Network

Khouloud Mnassri, Reza Farahbakhsh, Noël Crespi

Journal: Studies in computational intelligence Year: 2024 Pages: 192-204

JOURNAL ARTICLE

Semi-meta-supervised hate speech detection

Cendra Devayana Putra, Hei‐Chia Wang

Journal: Knowledge-Based Systems Year: 2024 Vol: 287 Pages: 111386-111386

JOURNAL ARTICLE

Semi-supervised generative adversarial networks for anomaly detection

Juan Manuel Fernández Montenegro, Yeojin Chung

Journal: SHS Web of Conferences Year: 2022 Vol: 132 Pages: 01016-01016

JOURNAL ARTICLE

Multilingual Hate Speech Detection

Λαυρεντιάδου, Βασιλική Γεωργίου

Journal: Aristotle University of Thessaloniki Year: 2022

DISSERTATION

Multilingual hate speech detection

Aymé Arango Monnar

University: Repositorio Institucional Year: 2025