Data Mining

Nate Silver’s Sweep Is a Huge Win for ‘Big Data’

The data utopia awaits.

(Illustration: ilovecharts.tumblr.com)

Like “pivot” and “cloud computing,” “big data” is one of those startup buzzwords that gets thrown around indiscriminately–partly because it means different things depending on the intel you’re trying to unearth and partly because it sounds like the kind of futuristic jargon that opens doors. Using machine learning to analyze big data? We can practically see the pitch deck already!

As The Economist noted back in 2010, the deluge of large data sets unleashed by the digital age “makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account.”

In practice, however, the young science has been used primarily for the first of those: spotting business trends. But Nate Silver’s astounding record in predicting last night’s elections might change that, giving big data enthusiasts a concrete, laudable example of its potential on the national stage.

As one New York Times staffer told “the King of Quants”: “Obviously a great night for the president, but also a great night for you and your forecast model, which is performing pretty much perfectly right now!”

Quartz’s Christopher Mims has noted a difference between Mr. Silver’s use of statistics and the still-nebulous definition of big data. But that didn’t stop investor and entrepreneur Mark Birch from making his own prediction this morning:

You may be lulled into the idea that somehow the next four years are going to be an idyllic period of sweeping liberalization and legislative purpose.  However you are wrong.  The big story is that big data won.

Political science professor John Sides had a similar observation, calling Barack Obama’s win “a victory for the Moneyball approach to politics.”

It shows us that we can use systematic data—economic data, polling data—to separate momentum from no-mentum, to dispense with the gaseous emanations of pundits’ “guts,” and ultimately to forecast the winner.  The means and methods of political science, social science, and statistics, including polls, are not perfect, and Nate Silver is not our “algorithmic overlord” (a point I don’t think he would disagree with).

The importance of empiricism had already been cresting, as Bit.ly’s chief scientist Hilary Mason noted earlier:

And like all good empiricists, technologists are wary of crowning data itself king, as Mr. Birch noted:

Data can be corrupted, the datasets incomplete, the analysis methodologies flawed, and expansive conclusions incorrectly drawn from results.  Data is not an excuse for lazy thinking or shortcuts.  We need to dive in deep and take the time to understand fully what the numbers are truly telling us and if the numbers are even accurate.  However, I would rather that we rely on data over mere conjecture and opinions spinning.  At least one can correct for data veracity and methodology and present the process in a transparent manner.

But as even Instagram, which has become vital in everything from the Empire State Building shooting to peer pressuring the vote, attempts to redefine itself as a big data company, Mr. Silver sets an example to aspire to, in terms of both the rigor of his methods and the impact of his work.

Follow Nitasha Tiku on Twitter or via RSS. ntiku@observer.com