Tweet that Stuffy Nose
February 20, 2013
Editor’s note: Mentored learning is a donor-supported activity that provides research and employment experience for students at BYU. Such experience enhances opportunities for employment and graduate school acceptance. We are grateful to those who see the value in investing in education.
Research from Brigham Young University says posts on Twitter could actually be helpful to health officials looking for a head start on disease outbreaks.
BYU undergraduate Kesler Tanner is a co-author on the study. He also wrote the code to obtain the data from Twitter. One big issue was whether there was enough GPS data to plot the geographic location of an outbreak.
The researchers found far less data than they expected from Twitter’s feature that lets users tag tweets with a location. Just 2 percent of tweets contained GPS info, a much lower rate than what Twitter users report in surveys.
The study sampled 24 million tweets from 10 million unique users. The researchers learned that accurate location information is available for about 15 percent of tweets (gathered from user profiles and tweets that contain GPS data). That’s likely a critical mass for an early-warning system that could monitor terms like “fever,” “flu” and “coughing” in a city or state.
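As a rough illustration of what monitoring those terms by region could look like, here is a minimal Python sketch. It is not the researchers' code: the function name, the (text, region) input format and the sample tweets are all assumptions made for this example, and the term list contains only the three words quoted above.

```python
import re
from collections import Counter

# Only the three terms quoted in the article; a real system would need more.
SYMPTOM_TERMS = ("fever", "flu", "coughing")
SYMPTOM_RE = re.compile(r"\b(" + "|".join(SYMPTOM_TERMS) + r")\b", re.IGNORECASE)


def count_symptom_tweets(tweets):
    """Count symptom-mentioning tweets per region.

    `tweets` is an iterable of (text, region) pairs (a hypothetical format).
    `region` is a state or city name, or None when no location is known.
    """
    counts = Counter()
    for text, region in tweets:
        # Skip tweets that cannot be placed on a map.
        if region is not None and SYMPTOM_RE.search(text):
            counts[region] += 1
    return counts


# Made-up sample data for illustration only.
sample = [
    ("Home with a fever again", "Utah"),
    ("This flu is the worst", "Utah"),
    ("Great game last night", "Utah"),
    ("Coughing all day", None),
    ("Coughing all day", "Ohio"),
]
counts = count_symptom_tweets(sample)
print(counts.most_common())
```

Tweets without a resolvable location are dropped, which mirrors the article's point that only a fraction of tweets can be placed geographically.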
“One of the things this paper shows is that the distribution of tweets is about the same as the distribution of the population so we get a good representation of the country,” said BYU professor Christophe Giraud-Carrier. “That’s another nice validity point especially if you’re going to look at things like diseases spreading.”
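Giraud-Carrier said the gap between what users report in surveys and how rarely they actually tag their tweets fits a familiar pattern. “There is this disconnect that’s well known between what you think you are doing and what you are actually doing,” he said.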
Location info can more often be found and parsed from user profiles. Of course some people use that location field for a joke, such as “Somewhere in my imagination” or “a cube world in Minecraft.” However, the researchers confirmed that this user-supplied data was accurate 88 percent of the time. Besides the jokes, a portion of the inaccuracies arises from people tweeting while they travel.
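To give a sense of what parsing a free-text profile location involves, the following Python sketch maps a location string to a U.S. state and returns nothing for entries it cannot recognize, such as the joke locations above. The lookup table is a tiny illustrative subset and the function name is hypothetical; the study's actual parsing method is not described in this article.

```python
# Illustrative subset only; a real table would cover every state and many cities.
STATE_ABBREVIATIONS = {"UT": "Utah", "OH": "Ohio", "NY": "New York"}
STATE_NAMES = {name.lower(): name for name in STATE_ABBREVIATIONS.values()}


def resolve_state(profile_location):
    """Return a state name for a free-text profile location, or None."""
    if not profile_location:
        return None
    # Check comma-separated parts from right to left: "Provo, UT" -> "UT".
    for part in reversed(profile_location.split(",")):
        token = part.strip()
        # Abbreviations must be uppercase so that "Oh" is not read as Ohio.
        if token in STATE_ABBREVIATIONS:
            return STATE_ABBREVIATIONS[token]
        if token.lower() in STATE_NAMES:
            return STATE_NAMES[token.lower()]
    return None


for location in ["Provo, UT", "Columbus, Ohio", "Somewhere in my imagination"]:
    print(repr(location), "->", resolve_state(location))
```

A lookup like this cannot tell whether a user is traveling, which is one of the sources of inaccuracy the researchers mention.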
The net result is that public health officials could capture state-level info or better for 15 percent of tweets. That bodes well for the viability of a Twitter-based disease monitoring system to augment the confirmed data from sentinel clinics.
“The first step is to look for posts about symptoms tied to actual location indicators and start to plot points on a map,” said Scott Burton, a graduate student and lead author of the study. “You could also look to see if people are talking about actual diagnoses versus self-reported symptoms, such as ‘The doctor says I have the flu.’”
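The distinction Burton draws between a reported diagnosis and a self-reported symptom could be approximated with simple pattern matching. The Python sketch below is only a heuristic under stated assumptions: the phrase patterns, the labels and the function name are invented for this example and are not taken from the study.

```python
import re

# Symptom terms quoted earlier in the article.
SYMPTOM_RE = re.compile(r"\b(fever|flu|coughing)\b", re.IGNORECASE)

# Hypothetical cues that a clinician, not the user, made the call.
DIAGNOSIS_RE = re.compile(
    r"\b(doctor|nurse|clinic)\b.*\b(says?|said|diagnosed|confirmed)\b"
    r"|\bdiagnosed with\b",
    re.IGNORECASE,
)


def classify(text):
    """Label a tweet as 'diagnosis', 'self-reported' or 'unrelated'."""
    if not SYMPTOM_RE.search(text):
        return "unrelated"
    if DIAGNOSIS_RE.search(text):
        return "diagnosis"
    return "self-reported"


for tweet in [
    "The doctor says I have the flu.",
    "I think I have the flu",
    "Nice weather today",
]:
    print(classify(tweet), "<-", tweet)
```

Keyword rules like these will misread sarcasm and figures of speech, so they are a starting point for plotting candidate posts rather than a reliable classifier.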
The computer scientists collaborated with two BYU health science professors on the project. Professor Josh West says speed is the main advantage Twitter gives to health officials.
“If people from a particular area are reporting similar symptoms on Twitter, public health officials could put out a warning to providers to gear up for something,” West said. “Under conditions like that, it could be very useful.”
Earlier this year, this same group of researchers published a study showing that most exercise apps are based on bad info.