What the Internet Can Tell Us About Flu Season

BY Miranda Neubauer | Friday, February 1 2013

In the past few weeks, if you had a stuffy nose, felt a fever coming on or were experiencing a bad headache, it is possible that you took to Google to look up information on "flu-like symptoms." In fact, a recent Pew study found that 77 percent of Internet searchers in the U.S. start their online search for health information with a search engine. A review of Google Trends queries in the Health category for the past 90 days shows rises in terms like flu symptoms, pneumonia, bronchitis and RSV (respiratory syncytial virus). But in entering those queries, Internet searchers are doing more than checking for themselves whether they have the flu. They are also part of a new kind of public health experiment that might become increasingly useful abroad, in countries where access to the Internet is improving but access to health care is slower to arrive.

Google Flu Trends data has provided good estimates in the past about how bad a flu season really was, although its model — which tracks search queries to guess if the user has the flu — was never meant to replace statistics from the Centers for Disease Control, which tracks the number of people who show up at their doctor's office to receive treatment for the flu. As of early January, the CDC was predicting a flu prevalence of just under five percent. Google was predicting a prevalence of 10 percent — sparking that worst-flu-season-ever talk that's got everyone so concerned. Writing for Slate, Will Oremus observes that Google's Flu Trends data isn't just often right — it's right well before the CDC data is ready to share. But through January so far, it looks like this is the year Google's algorithms are going to be a little off — and it's a little too soon to use search data to decide whether it's time to break out the hazmat suit.

Counting people who show up at the doctor because they're sick is pretty simple, even if it takes more time for the numbers to come in. Mining search data, on the other hand, can be complicated. In September 2009, as swine flu fears swept the world, Google happened to be changing the model it used to predict prevalence of flu. Evaluating the new model against the old one and against the CDC's methodology, Google found that the old way it had been doing things would not have correctly predicted the early days of the swine flu outbreak.

A research paper, authored by Google employees and a CDC employee, explained the reasons in detail. Evidence suggested people were seeking out medical care more readily, which may have affected the CDC's numbers. Meanwhile, people had also changed their search behavior. And Google's new flu-prediction model looked for fewer words related to complications from the flu, but more words related to symptoms. While the original model performed well only during the second wave of that outbreak, the new one did well in both waves.

The paper notes that in the early stages of the swine flu, "the proportion of outpatient visits due to ILI captured in ILINet was slightly elevated (61%) compared with Wave 2 (43%), due to ill persons more readily seeking health care for relatively mild illness during the first weeks of pH1N1." The paper also notes that the researchers excluded a few weeks during that outbreak because of "tremendous media attention."
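To make the idea of evaluating one model against another concrete, here is a minimal sketch of scoring two sets of estimates against a reference series using mean absolute error. This is an illustration only: the metric, the function name and all of the numbers are assumptions chosen for the example, not the method or data from the Google and CDC paper.

```python
def mean_absolute_error(estimates, reference):
    """Average absolute gap between model estimates and reference values."""
    if len(estimates) != len(reference):
        raise ValueError("series must have the same length")
    return sum(abs(e - r) for e, r in zip(estimates, reference)) / len(reference)


# Hypothetical weekly flu-activity percentages, invented for illustration.
reference = [2.0, 3.0, 4.0, 5.0]   # stand-in for surveillance figures
old_model = [2.5, 3.5, 5.0, 7.0]   # stand-in for an older model's estimates
new_model = [2.0, 3.5, 4.0, 5.5]   # stand-in for a revised model's estimates

old_error = mean_absolute_error(old_model, reference)
new_error = mean_absolute_error(new_model, reference)
print(f"old model error: {old_error}, new model error: {new_error}")
```

A lower score means the estimates sat closer to the reference over the period, though the reference itself can be revised later, so a comparison like this is only as good as the figures it is checked against.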

So "how bad is flu season" has different answers depending on how you count the numbers, and both models leave room for error. This adds uncertainty to the fear Slate's Will Oremus captured when he wrote:

"[T]he really ominous chart is the one that shows the trend line for the nation as a whole. It roughly agrees with the CDC that flu activity in December was about in line with the 'moderately severe' peak in 2007-2008. But if Google is right, the CDC's snapshot came just as the outbreak was gaining steam. Since mid-December, the trend line has rocketed past that of all previous years and now towers over that of the October 2009 H1N1 pinnacle, suggesting a CDC outpatient surveillance figure of an unprecedented 8.9 percent."

All of this has people asking another question: How accurate is Google Flu Trends? On Quora, MIT computer science graduate student Keith Winstein writes that rather than predicting an epidemic the CDC might not have seen coming, it looks more like Google's model is broken again:

At this point, it appears likely that Google Flu Trends has considerably overstated this year's flu activity in the U.S. But we won't be able to draw a firmer conclusion until after the flu season has ended. I don't know why the model broke down this year but am eager to learn, when and if Google comes to a similar conclusion. For now, I suspect this episode may provide a cautionary tale about the limits of inference from "big data" and the perils of overconfidence in a sophisticated and seemingly-omniscient statistical model.

Matt Mohebbi, a Google engineer who works on Google Flu, says that it is much too early to draw any conclusion about the accuracy of Google Flu and how it reflects this year's flu season. He notes that the CDC is generally one or two weeks behind Google with its data. There are additional delays at this time because the early season coincided with the holidays, and some reporting sites have been experiencing longer than normal delays, likely due to the large number of cases. In addition, the CDC's data is often adjusted retroactively.

Mohebbi added, though, that data through January 15 from New York City, which he said has one of the best electronic surveillance systems, showed about a four to five percent increase in cases over the number in September, which he said corresponded with what Google was seeing.

And Google Flu Trends was supposed to be complementary to the CDC's data, Mohebbi said, not a predictive tool to replace it.

But some research indicates that Google Flu Trends could make a significant contribution to epidemic detection. Umair Saif is a Pakistani computer scientist who is an associate professor at the Lahore University of Management Sciences and has also taught at the Computer Science and Artificial Intelligence Laboratory at MIT. Saif leads the Dritte initiative, which is focused on using technology to aid the developing world. Last year, he helped author a research paper on how Google Flu Trends could contribute to early epidemic detection, and his team also built a system to implement that research, called FluBreaks.

"Our analysis showed that adding a layer of computational intelligence to Google Flu Trends data provides the opportunity for a reliable early epidemic detection system that can predict disease outbreaks in advance of the existing systems used by the CDC," he wrote in an e-mail to techPresident. "We present an early investigation of algorithms to translate data from services such as Google Flu Trends into a fully automated system for generating alerts when the likelihood of epidemics is quite high."

FluBreaks translates Google search query volume into epidemic alerts, Saif told techPresident. The result, he says, is "a near real-time alternative to conventional disease surveillance networks."
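As a rough illustration of what turning a stream of weekly estimates into alerts can look like, the sketch below flags any week whose value rises well above the average of the preceding weeks. It is not FluBreaks' actual algorithm, which is not described here; the function name, the window size, the threshold multiplier and the sample numbers are all assumptions made for the example.

```python
from statistics import mean, stdev


def epidemic_alerts(weekly_values, window=5, k=2.0):
    """Return indices of weeks that exceed a trailing baseline.

    A week is flagged when its value is more than k standard deviations
    above the mean of the previous `window` weeks.
    """
    alerts = []
    for i in range(window, len(weekly_values)):
        baseline = weekly_values[i - window:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if weekly_values[i] > threshold:
            alerts.append(i)
    return alerts


# Hypothetical weekly flu-activity estimates (percent), invented for illustration.
series = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0, 3.5, 4.8, 6.0]
print(epidemic_alerts(series))
```

With these sample numbers, the quiet early weeks pass without an alert and the sharp rise at the end is flagged. A detector this simple would also fire on a surge of searches driven by news coverage rather than illness, which is one reason raw search data needs further processing before it can guide decisions.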

But Saif also explained that further work was necessary to fully realize the potential of Google Flu Trends in this area. First, raw search data needs to be put through some algorithmic paces before health professionals might be able to use it in decision-making, he said, but the possibility is there.

"Second, there is also a need to develop a more detailed appreciation of how changes in population size and Internet penetration affect the ability of a system based on Google Flu Trends data to provide accurate and actionable information," he wrote.

And there are other approaches as well. Voice of America recently reported on an effort led by Boston Children's Hospital epidemiologist John Brownstein to show the prevalence of the flu by pinning it on a Flu Near You map. And a new study from Brigham Young University looks at how Twitter could help track the flu.

While no one's saying it's time to stop washing your hands, taking one look at Google and saying it's swine flu all over again is an overstatement.
