JOURNAL ARTICLE

Towards detecting influenza epidemics by analyzing Twitter messages

Abstract

Rapid response to a health epidemic is critical to reducing loss of life. Existing methods mostly rely on expensive surveys of hospitals across the country, typically with lag times of one to two weeks for influenza reporting, and even longer for less common diseases. In response, several solutions have recently been proposed to estimate a population's health from Internet activity, most notably Google's Flu Trends service, which correlates search term frequency with influenza statistics reported by the Centers for Disease Control and Prevention (CDC). In this paper, we analyze messages posted on the micro-blogging site Twitter.com to determine whether a similar correlation can be uncovered. We propose several methods to identify influenza-related messages and compare a number of regression models to correlate these messages with CDC statistics. Using over 500,000 messages spanning 10 weeks, we find that our best model achieves a correlation of 0.78 with CDC statistics by leveraging a document classifier to identify relevant messages.
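
The general shape of the pipeline described above (flag influenza-related messages, aggregate them per week, then relate the weekly signal to reported statistics) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: the keyword filter is a crude stand-in for the document classifier, the keyword list and all function names are hypothetical, and the messages and reference rates are synthetic toy values rather than Twitter or CDC data.

```python
import math

# Hypothetical keyword list; the features actually used in the paper are not given here.
FLU_KEYWORDS = ("flu", "influenza", "fever", "cough")


def is_flu_related(message):
    """Crude keyword match standing in for a trained document classifier."""
    words = (w.strip(".,!?#") for w in message.lower().split())
    return any(w in FLU_KEYWORDS for w in words)


def weekly_flu_fraction(messages_by_week):
    """Fraction of each week's messages flagged as influenza-related."""
    return [
        sum(is_flu_related(m) for m in msgs) / len(msgs)
        for msgs in messages_by_week
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Synthetic toy data: four "weeks" of messages and made-up reference rates.
toy_messages = [
    ["I have the flu", "nice weather", "lunch time", "go team"],
    ["flu and fever all day", "cough cough", "new phone", "movie night"],
    ["down with influenza", "fever again", "stuck home with flu", "traffic"],
    ["feeling better", "fever gone?", "great game", "coffee"],
]
toy_reference_rates = [1.2, 2.0, 2.9, 1.1]  # invented values, not CDC statistics

fractions = weekly_flu_fraction(toy_messages)
r = pearson(fractions, toy_reference_rates)
slope, intercept = fit_line(fractions, toy_reference_rates)
print(f"weekly fractions: {fractions}")
print(f"correlation: {r:.3f}, fit: rate = {slope:.2f} * fraction + {intercept:.2f}")
```

A simple linear fit is only one of many possible regression models, and a keyword match will misfire on messages such as "fever gone?", which is the kind of error a trained classifier is meant to reduce.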

Keywords:
Computer science, The Internet, Correlation, Disease control, Population, Statistics, Medicine, World Wide Web, Environmental health, Mathematics

Metrics

Cited By: 616
FWCI (Field Weighted Citation Impact): 19.85
Refs: 27
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Influenza Virus Research Studies: Health Sciences → Medicine → Epidemiology
Data-Driven Disease Surveillance: Health Sciences → Medicine → Epidemiology
Misinformation and Its Impacts: Social Sciences → Social Sciences → Sociology and Political Science
