What this past year has taught me more than anything is this: First, you can learn a lot about statistics from politics; and second, both are misunderstood.
It has been an interesting year in politics, polling and statistics. In talking to most people, one would get the impression all three are broken. If one were to believe the polls, Brexit would not have happened and Donald Trump would not be the 45th president of the United States.
Does this mean all polls are broken, or that we are seeing the end of the era of statistics? Most probably not. I contend that Big Data and a statistically driven society are not dead; they are just not yet fully understood.
So why do we keep getting it so wrong?
One major flaw may be that society is myopically focused on causation instead of correlation. Causation may be able to answer the question of “why” at a particular point in time, but it doesn’t always translate cleanly into what may occur in the future. Another issue may be that people no longer think for themselves but instead want others to tell them what to believe. With 24/7 news and information fed to them in real time in 140-character bursts, people’s opinions change with their moods. When it comes to the results of polling and sampling, we are seeing not only an unprecedented speed of change but also a sheer magnitude of change from poll to poll.
So what lessons can be learned?
First, politics and statistics can be huge.
Given the computing power available today, we now have extremely large sets of data that can be analyzed computationally to reveal patterns, trends and associations. With Big Data it may no longer be necessary to sample the population, since we may have access to the entire population. In the simplest terms, this is what happened on Election Day: instead of sampling and getting a poll, we counted everyone who voted and got a totally different result.
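To make the sample-versus-population point concrete, here is a minimal sketch of how much a poll of 1,000 people can wobble from sampling alone. The 51 percent share, the sample size and the function names are illustrative assumptions, not real election data or any pollster's actual method.

```python
import math
import random

def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a simple random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def simulate_polls(true_share, sample_size, num_polls, seed=0):
    """Draw repeated simple random samples and return each poll's estimate."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(num_polls):
        # Each respondent backs the candidate with probability true_share.
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return estimates

TRUE_SHARE = 0.51    # hypothetical "whole population" result
SAMPLE_SIZE = 1000   # hypothetical poll size

polls = simulate_polls(TRUE_SHARE, SAMPLE_SIZE, num_polls=200)
moe = margin_of_error(0.5, SAMPLE_SIZE)
wrong_side = sum(estimate < 0.5 for estimate in polls)

print(f"Margin of error at n={SAMPLE_SIZE}: +/-{moe:.1%}")
print(f"Poll estimates ranged from {min(polls):.1%} to {max(polls):.1%}")
print(f"{wrong_side} of {len(polls)} polls showed the eventual winner behind")
```

In a race this close, a fair share of perfectly honest polls put the eventual winner behind. Counting everyone who votes removes that sampling noise, though not other sources of error.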
But Big Data is bigger than that and offers even more opportunities. We are finding that the more data is used, the less one needs to worry about the precision of the data, and the greater the chance of finding new insights. This is unlocking a new era of data-driven psychological profiling, in which this information is used to structure messaging that changes moods and behaviors.
One does not need to go far to see this Big Data analytics revolution at work. Cambridge Analytica played a big role this past year by using huge data sets with cutting-edge analytics to drive results. Ted Cruz was the first to get on board, and the firm delivered a data-driven win in Iowa. Later, in the general election, Mr. Trump, with his chief strategist Steve Bannon on the board of Cambridge Analytica, used similar techniques. As opposed to pollsters who tried to answer “why,” this new method looked agnostically at massive amounts of data over time to figure out which voters needed to be messaged in order to influence them.
Where polling would take over a week to come up with a pre-defined question like, “What’s your opinion on immigration?” and then attempt to slice and dice the answers by pre-defined voting groups, Big Data took a different approach. Instead of running a sample poll, this new method would harvest data in real time from Facebook and other social media sites, as well as from any database its practitioners could get hold of. Then, without any pre-defined opinions, it would look at correlations in trends of behaviors and find that people in a targeted area have a higher probability of being an “apathetic traditionalist.” Instantly, the messaging goes out against President Obama’s executive orders on immigration and sanctuary cities. In addition, the data insights and likelihoods show that the message should carry a tone of confidence and firmness, with, I guess, some added ramblings and a recommendation for the word “huge.”
Second, when it comes to politicians and statistics, they both lie.
Most people have a natural confirmation bias and interpret events and actions not in terms of numbers that have a certain probability, but as evidence confirming their worldview. The pollsters (and media) keep presenting absolute results in favor of a good story. And the biggest issue is that everyone seems to be wrong all the time.
It is not that the polls are always wrong, but the main issue is that they are misrepresented, misunderstood and usually late. People want a winner and loser or a single answer, and struggle with probabilities and the chance that their answer may be one of many. For them, it is politics and statistics as usual. Quoting Viktor Mayer-Schönberger and Kenneth Cukier from the book “Big Data: A Revolution That Will Transform How We Live, Work, and Think,” “The obsession with exactness is an artifact of the information-deprived analog era.”
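One way to read a poll as a probability rather than a verdict is sketched below. It turns a headline share in a two-way race into a rough chance that the candidate is really ahead. The poll shares, the sample size and the function name are hypothetical, and the normal approximation here accounts only for sampling error, not for turnout, late deciders or people who decline to answer.

```python
from math import sqrt
from statistics import NormalDist

def chance_of_leading(poll_share, sample_size):
    """Rough probability that true support exceeds 50%, given one poll.

    Assumes a simple random sample and a normal approximation centered
    on the poll result; this is a simplification for illustration.
    """
    std_error = sqrt(poll_share * (1 - poll_share) / sample_size)
    return 1 - NormalDist(mu=poll_share, sigma=std_error).cdf(0.5)

SAMPLE_SIZE = 1000  # hypothetical poll size

for share in (0.50, 0.51, 0.52, 0.54):
    chance = chance_of_leading(share, SAMPLE_SIZE)
    print(f"Poll shows {share:.0%}: roughly {chance:.0%} chance of truly leading")
```

A candidate polling at 52 percent is favored, not certain, and the headline "winner" still loses a meaningful share of the time.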
This is new territory and may require a new way of thinking and looking at polls. Big Data analytics has had clear wins in the fields of politics and business. It promises to provide insights for those who seek patterns from vast data banks. The biggest thing it has shown is that there may be a new way of looking at things: a messier, less exact but more realistic view of the future. It is a view that, if harnessed correctly, lets some people create or shape their destiny instead of having to just accept it.
• Eric Wilson is executive director of the Kentucky 9/12 Project and co-author of “We Surround Them: Our Journey From Apathy to Action” (Georgetown Grassroots, 2011).