re: The Big Spill and Enviro Web Traffic [Guest Post]
By Editors | Tuesday, June 22, 2010
Josh Nelson is a blogger, activist and new media consultant. We're pleased to host his commentary here on the blog. -- the editors
techPresident's Micah Sifry published a piece on Friday entitled "The Big Spill and the Enviro Group of Ten: Why Isn't Their Web Traffic Surging?" In it, he compares Compete.com's website traffic data for ten environmental organizations in April and May of this year. Relying exclusively on this data, he concludes that their web traffic is not increasing, and goes on to speculate on what might be to blame for this phenomenon. One problem: the data the analysis is based on is extremely limited and potentially inaccurate. I appreciate Micah allowing me the opportunity to share my thoughts directly with this forum.
I'm not here to argue that Sifry's conclusions are necessarily incorrect, only that the way he came to them raised a number of red flags. From the type of data he chose to include (and exclude) to the source of that data, the analysis has several shortcomings. Some of these flaws could have been avoided by choosing different inputs. Others may be a function of trying to draw broad conclusions from a narrow set of data. I've outlined some of the more glaring problems below.
Red Flag 1: Using Compete.com Data as a Proxy for Web Traffic
All of the data Sifry incorporated in his analysis came from compete.com. Any website usage data from third-party services like this should be taken with a huge grain of salt. Since Compete relies heavily on Internet users who have downloaded and installed their toolbar, their traffic data consistently skews geek. As Social Media Explorer explains, "The sample also includes Compete toolbar users: People who have gone through the trouble of downloading a toolbar plug-in from Compete specifically to volunteer their web surfing data for measurement. There is an inherent bias in this portion of their data because it will skew geek. Generalizing a bit, only computer nerds and webmasters are going to download this toolbar."
Not only is the data inaccurate, it is also inconsistent with more reliable metrics tools like Google Analytics or Sitemeter.
For an example of this problem in action, let's look at techPresident's stats for February and March of this year, comparing what Compete tells us to what techPresident's Sitemeter account tells us.
According to Compete, techPresident traffic jumped from 39,027 visits in February 2010 to 78,901 visits in March, an increase of over 100%. According to Sitemeter, however, visits increased from 30,663 to 38,650 in the same time period, an increase of just 26%.
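For reference, the arithmetic behind those percentages is simple month-over-month change. A quick sketch in Python, plugging in the figures above, makes the size of the discrepancy between the two services plain:

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Compete's figures for techPresident, Feb -> Mar 2010
compete = pct_change(39_027, 78_901)    # ~102%
# Sitemeter's figures for the same two months
sitemeter = pct_change(30_663, 38_650)  # ~26%
print(f"Compete: +{compete:.0f}%, Sitemeter: +{sitemeter:.0f}%")
```

Two services measuring the same site, over the same period, disagree on the growth rate by a factor of roughly four.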
Sifry skimmed over this problem in his piece, writing, "It's also possible that Compete's data is just...odd." Despite this, the analysis relied exclusively on Compete's flawed data.
Red Flag 2: Using April as a Baseline
Using April 2010 as a baseline for web traffic in this analysis is problematic for at least two reasons: the disaster in the Gulf began on April 20th and Earth Day was April 22nd.
In a piece about the impact of a disaster that began on April 20th, it makes no sense to compare April to May. If you are limited to making month-to-month comparisons (as you are with Compete's free service), you'd want to compare the first full month when the spill was not a factor to the first full month in which it was. Using April as a baseline assumes the disaster didn't impact web traffic at all for the first 10 days. This is a risky assumption to make.
Additionally, as Sifry noted, "some of these groups did a big push around Earth Day, April 22nd, and thus might be showing a decline from a high-water mark." Indeed, according to the very same Compete.com data Sifry examined, eight of these ten websites (all except Izaak Walton League and Wilderness Society) experienced their highest-trafficked month of the year-to-date in April. Despite this, Sifry decided to use April's inflated numbers as the baseline for his traffic comparison. Using Compete's numbers from March, the most recent full month in which the oil spill was not a factor, May traffic was in fact up considerably for many of the organizations in question:
And while Sifry's April vs. May comparison shows a "paltry 3.3% increase, from 2.19 million to 2.27 million," using the non-inflated March numbers as a baseline, the increase is an impressive 17% in just two months:
Does this prove that web traffic for these environmental groups actually increased as a result of the spill, in contradiction to Sifry's findings? No, not necessarily. While I've eliminated the problems associated with using April as a baseline, I'm still vulnerable to most of the same flaws. I'm still using Compete's questionable data. I'm still relying on too few data points. Or perhaps traffic in March was low for some other reason, skewing the results in the other direction (a rolling average would be better). For these reasons and probably several others, this analysis is also flawed to the point that it doesn't actually tell us much.
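To illustrate the rolling-average point: a trailing average over a few months dampens a one-month spike (like an Earth Day push) instead of letting it dominate the baseline. A minimal sketch, using hypothetical monthly visit counts rather than any group's real numbers:

```python
def rolling_mean(values, window=3):
    """Trailing moving average; each point averages the last `window` months."""
    return [sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))]

# Hypothetical visits, Jan..May, with an Earth Day spike in April
visits = [100_000, 105_000, 110_000, 150_000, 125_000]
smoothed = rolling_mean(visits)
print(smoothed)
```

Comparing May to the smoothed series, rather than to raw April, would make a single inflated month far less likely to flip the conclusion.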
That is the point: while a quickie-analysis of web traffic may make for a compelling blog post, it doesn't really tell us much about the reality of the situation. It would take far more data and a much more sophisticated analysis to say with any degree of confidence how much traffic the oil spill generated for these websites. Bottom line: if you use flawed assumptions, unreliable data and an incomplete data set your analysis is going to reflect those shortcomings.
I've got a host of other concerns with the piece, which, in the interest of time, I'll only mention here briefly.
- Why would an analysis released on June 18th only use data through the end of May? Media coverage of the spill picked up considerably in June, and according to Google Trends, web searches for the phrase 'oil spill' peaked on June 4th.
- Why use only two data points for traffic data? Month-over-month changes could very easily be due to normal variation having nothing to do with the spill.
- In speculating on the causes of the groups' perceived inability to translate the spill into web traffic, Sifry takes a few unsubstantiated cheap shots at the environmental groups he is studying:
"Many of these organizations are legacy institutions that still approach the web with a "fortress" mentality, to use Beth Kanter and Allison Fine's useful image."
Many? Which of these groups, exactly, is this referring to? Is it the groups that didn't see an increase in traffic according to this particular analysis? Some other combination of the groups? The piece doesn't say.
"We're the professional environmentalists, they've been saying to the public for years--leave the issue to us!"
There is clearly some personal animosity here, but I don't see what it adds to the analysis.
- Kombiz makes an important point, but the part about much of the traffic going to social media outlets doesn't get the attention it deserves. You can't credibly complain that groups have a fortress mentality and then use unique visitors on their main website as the only metric in your analysis.
Finally, much of the metrics industry has moved past these types of limited quantitative analyses. Large nonprofits are often more interested in influence than statistics. Accordingly, some of these groups aren't especially concerned with basic metrics like website visits. I'd rather have one visitor who called Congress than 500 who didn't, for example. Unique visitors (and all quantitative metrics) represent a means to an end. Might it have been more useful to look at which groups were having success influencing legislators on issues related to the spill? Perhaps someone found a way to achieve such influence without funneling unique visitors through their primary website.
While I'm glad to see techPresident looking at the effects of the oil spill on environmental group web traffic, I'm afraid Micah's analysis drew overly broad conclusions from unnecessarily limited data. Again, I'd like to thank Micah for giving me the opportunity to share my concerns.
Full disclosure: I was a full-time employee of the National Wildlife Federation for most of 2007 and part of 2008.