JOURNAL ARTICLE

Twitter-user recommender system using tweets: A content-based approach

Abstract

With the internet now part of everyday life, online social networks such as Facebook and Twitter have taken on a major role in networking, information dissemination and entertainment. As of 2017, Twitter reaches over 317M monthly active users who generate more than 320M tweets every day, making it one of the fastest information-distribution media of this era. To aid data distribution without overloading users with information, we develop a recommender system that focuses on a vital aspect of social media, the relationships among users, and addresses a common question: whom to follow or befriend? By choosing the right accounts and users to follow, users can control their sources of information as desired. The information collected from a user's most recent tweets is used to find other users whose recent tweets contain similar information, with the constraint that the users share at least one mutual friend. By exploiting the continuous, real-time updating of data on social networks, we develop a method to ensure that our training sets consist of information relevant for classification, thus preserving accuracy while reducing training set sizes for probabilistic learning models. We use two algorithms to detect tweets on common topics, a Noun Phrase detector and a Naive Bayes Text (Topic) Classifier, and compare their complexity and accuracy. The Naive Bayes Classifier, despite being probabilistic, performed well with a relatively small training set. This holds specifically for Twitter, because it is a framework that updates in real time. Exact matches were hard to obtain with the Noun Phrase detector, as we went only one level deep due to limited compute. However, when matches were found, the detector was up to 90% accurate. Experiments on tweets of random public users show that a Naive Bayes Classifier with a small but recent training data set can work as well as or better than a collaborative filter, without the assumptions of the collaborative model.
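
The Naive Bayes side of the approach described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial Naive Bayes model with add-one smoothing, a simple regex tokenizer, and a "dominant topic" rule for comparing users, none of which are specified in the abstract. All names (`NaiveBayesTopicClassifier`, `dominant_topic`, `recommend`), the topic labels, and the toy tweets and friend lists are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, topics):
        self.word_counts = defaultdict(Counter)  # topic -> token counts
        self.topic_counts = Counter(topics)      # topic -> number of tweets
        self.vocab = set()
        for text, topic in zip(tweets, topics):
            tokens = tokenize(text)
            self.word_counts[topic].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(tweets)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        vocab_size = len(self.vocab)
        best_topic, best_score = None, -math.inf
        for topic, n_docs in self.topic_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[topic].values())
            for tok in tokens:
                count = self.word_counts[topic][tok]
                score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


def dominant_topic(clf, recent_tweets):
    """Most frequent predicted topic over a user's recent tweets."""
    return Counter(clf.predict(t) for t in recent_tweets).most_common(1)[0][0]


def recommend(clf, user, users, friends):
    """Suggest non-friends who share a mutual friend and a dominant topic."""
    target = dominant_topic(clf, users[user])
    recs = []
    for other, tweets in users.items():
        if other == user or other in friends[user]:
            continue
        if not (friends[user] & friends[other]):  # need >= 1 mutual friend
            continue
        if dominant_topic(clf, tweets) == target:
            recs.append(other)
    return sorted(recs)


# Hypothetical toy data: a small, "recent" labelled training set.
TRAIN = [
    ("the striker scored a late goal in the match", "sports"),
    ("great win for the team in the league final", "sports"),
    ("new phone release with a faster chip", "tech"),
    ("the software update fixes a security bug", "tech"),
]
clf = NaiveBayesTopicClassifier().fit(
    [text for text, _ in TRAIN], [topic for _, topic in TRAIN]
)

USERS = {
    "alice": ["what a goal in the final", "the team played a great match"],
    "bob": ["late goal wins the league"],
    "carol": ["new chip makes the phone faster"],
    "dave": ["the striker scored again in the match"],
}
FRIENDS = {
    "alice": {"erin"},
    "bob": {"erin"},
    "carol": {"erin"},
    "dave": {"frank"},
}

print(recommend(clf, "alice", USERS, FRIENDS))  # ['bob']
```

In this toy run, `carol` shares a mutual friend with `alice` but tweets about a different topic, and `dave` tweets about the same topic but has no mutual friend, so only `bob` is suggested. Refitting the classifier on a fresh batch of recent tweets is one way to keep the training set small yet relevant, in the spirit of the abstract.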

Keywords:
Computer science; Recommender system; Naive Bayes classifier; Social media; Classifier (UML); Blogosphere; Probabilistic logic; Artificial intelligence; Machine learning; Learning to rank; Information retrieval; World Wide Web; The Internet; Ranking (information retrieval); Support vector machine

Metrics

Cited By: 13
FWCI (Field Weighted Citation Impact): 1.38
Refs: 14
Citation Normalized Percentile: 0.85

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Complex Network Analysis Techniques
Physical Sciences →  Physics and Astronomy →  Statistical and Nonlinear Physics
Web Data Mining and Analysis
Physical Sciences →  Computer Science →  Information Systems