Do you trust big data? Try Googling the Holocaust



The advent of big data was supposed to usher in a more precise and rational world. Instead, it might be leading us into the swamp of “alternative facts.”
Data may not lie, but they can be interpreted in ways that have the same effect. Consider President Donald Trump’s persistent claim that millions voted fraudulently in the 2016 election. In a twisted way, it might be based on data: In 2012, a study found that some 2.75 million people were registered to vote in two states or more. Although there’s zero evidence that any of them actually voted twice, that doesn’t matter to Trump or his supporters.
This sleight of hand both illustrates and contributes to a bigger problem: We’re losing trust in numbers, especially statistics. Their sheer volume and variety can be overwhelming. In Politico’s recent roundup of Trump’s popularity figures, for example, the approval numbers among nine polls ranged from 36 percent to 54 percent. Add the hangover that many still suffer from the misleading presidential election predictions, and it’s not surprising that people are starting to tune out data altogether, or simply interpret them in ways that support their beliefs.
I don’t know whether this will lead to a full-blown crisis of democracy, but I think it’s already fair to place at least some of the blame on big data. Algorithms developed by companies such as Google parent Alphabet Inc. and Facebook Inc. enable partisan confirmation bias. They tailor our online environments not to the truth, but to the specific information we search for or click on. This can undermine our understanding of, and trust in, objective scientific and historical facts.
Here’s an extreme example: Dylann Roof claimed in his manifesto that it was a Google search for “black on white crime” that led him to massacre nine people in a Charleston, South Carolina, church in 2015. Think about that search term. What kinds of texts will perfectly match “black on white crime,” as opposed to, say, “statistics on crime rates by race”? Naturally, Roof got links to racist websites with their own alternative facts, just as a search for “who really killed JFK” will, more often than not, lead to conspiracy sites.
When I typed the phrase “Was the Hol” into Google, the search engine auto-completed to “Was the Holocaust real?” Of the top six search results, four were Holocaust-denying sites. That’s despite Google’s efforts to address this problem back in December, and my experience is not unique.
Such steering happens even when we’re not actively searching. The Wall Street Journal, in the lead-up to the election, kept track of the Facebook news feeds of people on the left and on the right. In one feed Clinton was evil, in the other Trump was. The Guardian experimented with placing Democratic voters in the right-wing feed and Republican voters on the left. Being in the liberal feed, one Republican voter said, was “like reading a book by a fool.” Different stories buttressed by different kinds of evidence, some of it very weak, have led to increasingly non-intersecting world views. Worse, checking what you saw on Facebook by going to Google might not help — it depends on how you phrase the question.
Despite these problems, big data companies try to maintain reputations as sources of reliable information. In a recent advertisement for Google Home, a father reads to his daughter and asks Google to look up facts about blue whales. The device is portrayed as a sort of handy encyclopedia, an extension of trustworthy household knowledge, like the modern-day equivalent of the kitchen dictionary. Let’s just hope little sweetie doesn’t ask about the Holocaust.

Cathy O’Neil is a mathematician who has worked as a professor, hedge-fund analyst and data scientist. She founded ORCAA, an algorithmic auditing company, and is the author of “Weapons of Math Destruction.”
