JOURNAL ARTICLE

@Phillies Tweeting from Philly? Predicting Twitter User Locations with Spatial Word Usage

Abstract

We study the problem of predicting the home locations of Twitter users from the content of their tweets. Using three probability models for locations, we compare the Gaussian Mixture Model (GMM) with Maximum Likelihood Estimation (MLE). In addition, we propose two novel unsupervised methods, based on the notions of Non-Localness and Geometric-Localness, to prune noisy data from tweet messages. In the experiments, our unsupervised approach improves significantly on the baselines and achieves results comparable to the supervised state-of-the-art method. For 5,113 Twitter users in the test set, our approach, using 250 or fewer selected local words, predicts home locations to within 100 miles with an average accuracy of 0.499, or, at best, an average error distance of 509.3 miles.
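The two figures reported above, accuracy within 100 miles and average error distance, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `haversine_miles` and `evaluate`, the use of the haversine great-circle formula, and the Earth-radius constant are all assumptions made for illustration.

```python
import math

# Assumption: mean Earth radius in miles; the article does not state which
# distance formula or constant was used.
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def evaluate(predicted, actual, threshold_miles=100.0):
    """Return (accuracy, average error distance) over paired locations.

    accuracy: fraction of users whose predicted home location lies within
              threshold_miles of the true one.
    average error distance: mean distance in miles between prediction and truth.
    """
    errors = [haversine_miles(p, t) for p, t in zip(predicted, actual)]
    accuracy = sum(e <= threshold_miles for e in errors) / len(errors)
    avg_error = sum(errors) / len(errors)
    return accuracy, avg_error


# Hypothetical example: both users are predicted to live in Philadelphia,
# but one actually lives in New York and the other in Los Angeles.
philadelphia = (39.9526, -75.1652)
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)
accuracy, avg_error = evaluate([philadelphia, philadelphia],
                               [new_york, los_angeles])
```

In this hypothetical example one of the two predictions falls within the 100-mile threshold, so a single large miss raises the average error distance sharply while moving accuracy by only one user. That helps explain how an accuracy near 0.5 can coexist with an average error of several hundred miles.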

Keywords:
Computer science; Gaussian; Word (group theory); Artificial intelligence; Set (abstract data type); Mixture model; Maximum likelihood; Test set; Estimation; Pattern recognition (psychology); Statistics; Mathematics

Metrics

Cited By: 133
FWCI (Field Weighted Citation Impact): 23.99
Refs: 14
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Human Mobility and Location-Based Analysis
Social Sciences → Social Sciences → Transportation
Data Management and Algorithms
Physical Sciences → Computer Science → Signal Processing
Geographic Information Systems Studies
Social Sciences → Social Sciences → Geography, Planning and Development
