Abstract

Internet and social media usage has skyrocketed over the past two decades, fundamentally changing how people communicate with one another. This growth has brought many benefits, but also risks and harms. The volume of damaging content available online, such as hate speech, is too large for humans to moderate manually, so automated methods for hate speech identification have drawn increasing attention from researchers. In this work, we examine various publicly accessible datasets and combine them into a single homogeneous dataset. After classifying the examples into two categories, hate or non-hate, we establish a baseline model and improve its performance scores using various optimisation strategies. Having achieved a competitive performance score, we develop a tool that quickly locates and evaluates a page with an effective measure, then uses the resulting feedback to retrain our model on the new data. We demonstrate the performance of our multilingual approach in three languages: English, German, and Spanish. It performs as well as or better than most monolingual models.
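As a rough illustration of what building a single homogeneous dataset with hate and non-hate labels might involve, the following minimal Python sketch merges rows from several sources into one schema. The label vocabularies, the `to_binary` and `merge_datasets` helpers, and the row layout are all hypothetical; the abstract does not specify how the datasets were actually harmonised.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical label vocabularies; real datasets define their own schemes.
HATE_LABELS = {"hate", "hateful", "hs", "1"}
NON_HATE_LABELS = {"none", "normal", "non-hate", "0"}


def to_binary(label: str) -> int:
    """Map a dataset-specific label to 1 (hate) or 0 (non-hate)."""
    key = label.strip().lower()
    if key in HATE_LABELS:
        return 1
    if key in NON_HATE_LABELS:
        return 0
    raise ValueError(f"unmapped label: {label!r}")


def merge_datasets(
    sources: Dict[str, Iterable[Tuple[str, str, str]]]
) -> List[dict]:
    """Combine (text, label, language) rows from several sources into one schema."""
    merged: List[dict] = []
    seen = set()
    for name, rows in sources.items():
        for text, label, lang in rows:
            clean = " ".join(text.split())  # normalise whitespace
            if (clean, lang) in seen:
                continue  # drop exact duplicates across sources
            seen.add((clean, lang))
            merged.append(
                {"text": clean, "label": to_binary(label), "lang": lang, "source": name}
            )
    return merged
```

Raising an error on unmapped labels, rather than silently defaulting to non-hate, is one way to keep label noise out of the merged data; other policies are equally plausible.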

Keywords:
Offensive; Computer science; Linguistics; Speech recognition; Natural language processing; Artificial intelligence; Engineering; Philosophy

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.51
Refs: 19
Citation Normalized Percentile: 0.69

Topics

Hate Speech and Cyberbullying Detection
Physical Sciences →  Computer Science →  Artificial Intelligence
Swearing, Euphemism, Multilingualism
Social Sciences →  Social Sciences →  Communication
Freedom of Expression and Defamation
Social Sciences →  Social Sciences →  Law