
What's In a (Standardized) Name

BY Nancy Scola | Monday, December 22 2008

Down in DC a few weeks ago, a friend of mine had the gall to say, "you know, you're not only a politics geek, you're a real geek geek." The nerve of the guy. This post isn't going to lessen my geek rep one iota, but whatever. What I have to report is pure awesome and I don't care who knows it. This morning, I was reading Sunlight Labs director Clay Johnson's blog post about what's next for the Labs, and a throwaway mention gave me that prickly sense down the back of my neck that I get when I know I've stumbled across something powerfully good: innovations in name standardization that will streamline fundraising reports, regulatory records, and more. Gadzooks! Does it get more exciting?

To realize how neat a prospect this is, you have to know what problem it solves. Here's the Sunlight wiki where the idea is being hashed over:

Names of entities-- donors, members of congress, corporations, even governments are not called the same thing between documents or databases or even in the same document. For instance, in the case of the Federal Election Commission data files, donors can be called William Smith, Billy Smith, Billy Smith, JR. or a plethora of other names. Corporations go beyond this by having multiple names-- Lorne Michaels is not only the executive producer for Saturday Night Live, but the CEO of Broadway Video and an employee of NBC Studios, a subsidiary of General Electric.

When Jim Jones and James Jones III are one and the same person, that's a challenge to transparency: if we never learn that both names belong to the same guy, the quality of the data that powers good government drops considerably. And so, the Labs are trying to whip up algorithms and filtering techniques that boil names down to their most basic and consistent form. Once they crack that nut, they can share that knowledge with the rest of us. Sunlight has already solved some aspects of the problem. An API now publicly available pulls the names of members of Congress from a central database, so that typing "Teddy Kennedy" into a Google spreadsheet, for example, automagically resolves to "Edward Kennedy." Think that's awesome? Me too, me too.
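For the geek geeks in the audience, here is a rough sketch of what "boiling a name down" could look like. To be clear, this is not Sunlight's algorithm or its API. It is a toy illustration in Python, and the function name `normalize_name` and the tiny `NICKNAMES` and `SUFFIXES` tables are made up for this example. It lowercases the name, strips punctuation, drops generational suffixes, and swaps a leading nickname for its formal version.

```python
import re

# Hypothetical lookup tables, for illustration only.
# A real system would need far larger, carefully curated lists.
NICKNAMES = {
    "billy": "william",
    "bill": "william",
    "jim": "james",
    "teddy": "edward",
    "ted": "edward",
}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_name(raw: str) -> str:
    """Reduce a personal name to a crude, consistent matching key."""
    # Lowercase, then turn anything that is not a letter or whitespace
    # into a space. (This also splits names like O'Brien, which a real
    # system would want to handle more gracefully.)
    cleaned = re.sub(r"[^a-z\s]", " ", raw.lower())
    # Drop generational suffixes such as "Jr." or "III".
    tokens = [t for t in cleaned.split() if t not in SUFFIXES]
    if not tokens:
        return ""
    # Replace a leading nickname with its formal form, if we know one.
    tokens[0] = NICKNAMES.get(tokens[0], tokens[0])
    return " ".join(tokens)


for name in ["William Smith", "Billy Smith, JR.", "Jim Jones", "James Jones III", "Teddy Kennedy"]:
    print(name, "->", normalize_name(name))
```

With this sketch, "William Smith" and "Billy Smith, JR." both come out as the same key, as do "Jim Jones" and "James Jones III". The catch is that throwing away suffixes can also merge a father and son who really are two different people, which is part of why this is a hard nut to crack.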

My bedtime book of late has been doctor and New Yorker writer Atul Gawande's rather good Better: A Surgeon's Notes on Performance. I'm reminded here of his core argument: the most transformational changes in modern medicine are some of the simplest acts -- cutting down on hospital infection rates by getting doctors to wash their hands between patients, for example. It's often the most basic things that can be the most powerful.

*Note: Our Andrew Rasiej and Micah Sifry are senior advisors to the Sunlight Foundation, but that has little bearing on this post.