JOURNAL ARTICLE

Effective multimodal hate speech detection on Facebook hate memes dataset using incremental PCA, SMOTE, and adversarial learning

Emmanuel Ludivin Tchuindjang TchokoteElie Fute Tagne

Year: 2025 Journal:   Machine Learning with Applications Vol: 20 Pages: 100647-100647   Publisher: Elsevier BV

Abstract

The proliferation of harmful information, such as hate speech and online harassment, has increased in recent years due to social media's explosive expansion. Using the Facebook Hate Meme Dataset (FBHM), we create a reliable model in this work for identifying multimodal hate speech on online platforms. To effectively address class imbalance and improve classification accuracy, our hybrid model combines ResNet for image processing with RoBERTa for text analysis, leveraging Synthetic Minority Over-sampling Technique (SMOTE) and Incremental Principal Component Analysis (PCA) combined with adversarial machine learning techniques. The combination of Incremental PCA's dimensionality reduction and SMOTE's synthetic sample creation produces a potent combination that enhances the training dataset and maximizes feature representation, resulting in improved online content moderation techniques. We achieved an accuracy of 81.80 %, and a Macro-F1 score of 81.53 % on the FBHM dataset which represents an 18 % improvement in accuracy over the base model. These results provide significant novel insights into this important field of study by demonstrating the potential of adversarial approaches in creating reliable models for automated hate speech identification that can help create a safer online environment and can significantly reduce the emotional burden on human content moderators by handling the contents quickly and accurately. This study highlights the mutually beneficial effect of combining SMOTE and incremental PCA, demonstrating how they improve the model's ability to correct class imbalance and boost performance. The source code and dataset are publicly available on GitHub to facilitate reproducibility and further research. Link to the code and dataset below:https://github.com/ludivintchokote/HatePostDetection

Keywords:
Adversarial system Voice activity detection Computer science Artificial intelligence Speech recognition Speech processing

Metrics

2
Cited By
9.64
FWCI (Field Weighted Citation Impact)
15
Refs
0.96
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Hate Speech and Cyberbullying Detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Linguistics and Language Analysis
Social Sciences →  Arts and Humanities →  Language and Linguistics
Bullying, Victimization, and Aggression
Social Sciences →  Psychology →  Social Psychology
© 2026 ScienceGate Book Chapters — All rights reserved.