JOURNAL ARTICLE

Fake news detection in Urdu language using machine learning

Muhammad Shoaib Farooq, Ansar Naseem, Furqan Rustam, Imran Ashraf

Year: 2023 Journal: PeerJ Computer Science Vol: 9 Pages: e1353-e1353 Publisher: PeerJ, Inc.

Abstract

With the rise of social media, the dissemination of forged content and news has increased. Consequently, fake news detection has emerged as an important research problem. Several approaches have been presented to discriminate fake news from real news; however, such approaches lack robustness for multi-domain datasets, especially within the context of Urdu news. In addition, some studies use datasets machine-translated from English to Urdu with Google Translate, without manual verification. This limits the wide use of such approaches in real-world applications. This study investigates these issues and proposes a fake news classifier for Urdu news. The dataset covers nine different domains and comprises 4097 news items. Experiments are performed using term frequency-inverse document frequency (TF-IDF) and bag of words (BoW) features in combination with n-grams. The major contribution of this study is the use of feature stacking, where feature vectors of the preprocessed text and of verbs extracted from the preprocessed text are combined. Support vector machine, k-nearest neighbor, and ensemble models like random forest (RF) and extra tree (ET) were used for bagging, while stacking was applied with ET and RF as base learners and logistic regression as the meta learner. To check the robustness of the models, fivefold cross-validation and independent set testing were employed. Experimental results indicate that stacking achieves 93.39%, 88.96%, 96.33%, 86.2%, and 93.17% scores for accuracy, specificity, sensitivity, MCC, ROC, and F1 score, respectively.
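
The feature stacking idea in the abstract can be illustrated with a minimal sketch. It assumes that "combined" means concatenating the two feature vectors, uses a smoothed IDF variant chosen here for convenience, and takes the verbs as given, since the abstract does not say how they are extracted. The function names (`ngrams`, `fit_tfidf`, `transform`, `stack_features`) and the English placeholder tokens are hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n_max):
    """All contiguous n-grams of length 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def fit_tfidf(docs, n_max):
    """Learn a sorted vocabulary and smoothed IDF weights from tokenized docs."""
    term_counts = [Counter(ngrams(doc, n_max)) for doc in docs]
    vocab = sorted({term for counts in term_counts for term in counts})
    n_docs = len(docs)
    idf = {}
    for term in vocab:
        doc_freq = sum(1 for counts in term_counts if term in counts)
        # Smoothed IDF (an assumption; the abstract does not give the formula).
        idf[term] = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    return vocab, idf, n_max


def transform(doc, model):
    """Map one tokenized doc to a TF-IDF vector over the model's vocabulary."""
    vocab, idf, n_max = model
    counts = Counter(ngrams(doc, n_max))
    return [counts[term] * idf[term] for term in vocab]


def stack_features(text_tokens, verb_tokens, text_model, verb_model):
    """Feature stacking, read here as concatenating text and verb vectors."""
    return transform(text_tokens, text_model) + transform(verb_tokens, verb_model)


# Toy placeholder documents (not from the paper's dataset).
docs = [["minister", "announces", "new", "budget"],
        ["aliens", "land", "in", "city"]]
# Verbs listed by hand; a real pipeline would need an Urdu POS tagger.
verbs = [["announces"], ["land"]]

text_model = fit_tfidf(docs, n_max=2)   # unigrams and bigrams of the full text
verb_model = fit_tfidf(verbs, n_max=1)  # unigrams of the extracted verbs
stacked = [stack_features(d, v, text_model, verb_model)
           for d, v in zip(docs, verbs)]
```

Each row of `stacked` would then be passed to a classifier. The abstract reports ET and RF base learners with a logistic regression meta learner, which this sketch does not implement.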

Keywords:
Computer science; Artificial intelligence; Random forest; Support vector machine; Robustness (evolution); Machine learning; tf–idf; Urdu; Stacking; k-nearest neighbors algorithm; Feature vector; Natural language processing; Term (time)

Metrics

Cited By: 21
FWCI (Field Weighted Citation Impact): 20.14
Refs: 27
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Misinformation and Its Impacts
Social Sciences →  Social Sciences →  Sociology and Political Science
Spam and Phishing Detection
Physical Sciences →  Computer Science →  Information Systems
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Fake News Detection in Urdu using Deep Learning

Farah Rauf, Roha Irfan, Lyba Mushtaq, Mohsin Ashraf

Journal: VFAST Transactions on Software Engineering Year: 2022 Vol: 10 (4) Pages: 151-167

JOURNAL ARTICLE

Fake News Detection Using Machine Learning

Preeti Barla, Sarat Chandra Swain

Journal: INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT Year: 2024 Vol: 08 (07) Pages: 1-11

JOURNAL ARTICLE

Fake News Detection using Machine Learning

Shagun Kingaonkar, Ajinkya Bawane, Ruchi Rana, Janhavi Thool, Nikita Kale

Journal: INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT Year: 2023 Vol: 07 (03)