
[BackChannel] Why You Should Test (Almost) Everything

BY Benjamin Simon and Jim Pugh | Monday, May 6, 2013

techPresident's Backchannel series is an ongoing conversation between practitioners and close observers at the intersection of technology and politics. Jim Pugh is the CEO of ShareProgress. He previously ran the digital analytics program at Organizing for America.
Ben Simon was formerly the director of new media campaigns at the DNC and OFA, and he is currently working as an independent consultant.

One of the best things to come out of the post-campaign coverage of OFA 2012 has been a renewed focus on analytics -- and in particular on randomized testing and experimentation -- as a crucial part of any good digital program. It’s something we’ve both been preaching for years, and we're excited to see it proliferating beyond just the largest programs.

Randomized testing is an incredibly valuable tool. It lets you use data to determine which messages resonate the most and drive your supporters to take action, rather than relying on gut instinct (which, we can attest, is often wrong). Applying the results of these tests can increase the impact of your digital program by a substantial margin.

However, it's important to recognize that there's an opportunity cost associated with any test you run. Even the simplest email subject line test requires time and effort to plan and execute. And more complicated tests take even more work -- executing a 4x3 email test (four different emails with three subject lines each) means writing four separate drafts, coding each one up separately, and analyzing a lot more response data.

There may be a credibility cost to testing as well. Many organization directors are still skeptical about the value of testing, so every experiment that takes time without yielding useful results could be a strike against the cause of testing more generally.

A good test can be well worth it -- and pay off handsomely by increasing the impact of your program. But to gain useful, actionable results, your experiment needs to provide you with enough data to see statistically significant differences between the approaches that you're testing -- ideally with 95% confidence or more.

If you're the Obama campaign, MoveOn.org, or Avaaz and have an email list of millions of people to contact, it won't be hard to collect enough responses to reach this threshold. But for smaller organizations with more limited reach, it may be much more difficult.

How can you check in advance to see if you'll reach statistical significance? Here's what you need to do:

  1. Identify what it is you're trying to maximize. If this is an email, it should be your ultimate action (donations or petition signatures, for example), rather than simply opens or clicks. If it's a webpage test, it should be whatever action you want people to take on the page.
  2. Figure out how many people you intend to reach with each different approach that you're testing.
  3. Estimate, based on past performance, what percentage of the people you're reaching you expect to take action. This percentage will be very different depending on what your action is (for example, you'll probably have a much lower action percentage when asking people to make a donation than when asking them to sign a petition).
  4. Make an educated guess about how much difference in response you might see between your approaches. Will one version get 5% more actions than another? 10%? 30%?
  5. Using your estimates from the previous steps, calculate how many actions you expect for each of your different approaches, then plug your reach and action numbers into a statistical significance calculator to estimate the expected confidence level of your results (see the sketch after this list for what that calculation looks like).

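As a rough illustration of that last step, here's a minimal Python sketch of the kind of math a statistical significance calculator does under the hood -- a standard two-proportion z-test. The function name and all of the numbers in the example (a 20,000-person reach per version, a 2% baseline action rate, a hoped-for 10% lift) are hypothetical and only there to show the shape of the calculation.

```python
import math

def expected_confidence(reach_a, actions_a, reach_b, actions_b):
    """Two-proportion z-test: returns the confidence (as a percentage) that
    the difference in action rate between approach A and approach B is real."""
    rate_a = actions_a / reach_a
    rate_b = actions_b / reach_b
    # Pooled action rate under the assumption that there is no real difference
    pooled = (actions_a + actions_b) / (reach_a + reach_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / reach_a + 1 / reach_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return (1 - p_value) * 100

# Hypothetical example: 20,000 people see each version, the baseline action
# rate is 2% (400 actions), and we're hoping for a 10% lift (440 actions).
print(f"{expected_confidence(20000, 400, 20000, 440):.0f}% confidence")
# Prints roughly 84% -- below the 95% threshold for this scenario.
```

If the confidence that comes out is below 95% even for your most optimistic guess at the lift between approaches, the test probably isn't worth the effort.
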
Is the difference between your approaches significant? If yes -- go ahead with your test! There's a good chance your results will be able to tell you which approach is best.

But if the answer's no? Then don’t run the test. Without statistical significance, it's very possible that the approach with the highest number of actions may not actually be the best one, and the test therefore doesn't provide you with useful information. The time it took to set up and run the experiment could have been more effectively spent elsewhere.
