The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 

And the award goes to… Predictive Analytics!

Posted by Frances Whiting

Wed, Feb 22, 2017


It doesn’t take a data scientist to predict some of what will happen at Sunday’s Oscars: beautiful people will wear expensive clothes, there will be a handful of bad jokes, a few awkward speeches, and most likely some tearful and touching ones as well. But predicting the actual award outcomes takes a bit more analysis, and as a quick search suggests, there’s no shortage of that online.

These predictions come at an interesting time in the context of recent world events. In 2016 a few world events shook the predictive analytics world (and beyond) with outcomes so unexpected that even the most respected pollsters failed to predict them. And while many of the unanticipated polling outcomes occurred within politics (think Brexit and the U.S. presidential election), the implications for predictive analytics are also relevant to the market research industry.

As CMB’s president and co-founder, Anne Bailey Berman, recently said in Research Business Report’s Predictions issue, “the market research industry will face many of the same questions regarding surveys and predictive analytics that are facing pollsters and data scientists in the aftermath of the election.”

Let’s bring it back to Sunday's Academy Awards. Since people love to predict the winners of categories like “Best Picture” and “Best Actress,” the awards show offers pollsters a chance to reflect on what went wrong in 2016 and to test refined predictive models in a much lower-stakes context than a presidential election.

For example, popular polling site FiveThirtyEight has an ongoing tracker for Oscar winners. Typically, FiveThirtyEight bases its Oscar prediction model on the outcomes of the guild and press prizes that precede the Academy Awards: it watches who wins awards like the Golden Globes and the Screen Actors Guild Awards, then works out how predictive each of those awards actually is.

First, they look at historical data and pull all guild and press winners from the last 25 years, assuming these winners are representative of the Academy’s thinking. Based on the percentage of those winners that went on to win the corresponding Oscar, they assign each award a base rate (e.g., if 17 of the last 25 Golden Globe winners for Best Supporting Actor also won the Academy Award, the two agree 68% of the time).

Then they turn each agreement percentage into a score by squaring the fraction and multiplying by 100. In doing this, weak scores get weaker and strong scores stay strong. FiveThirtyEight’s analysts then consider other factors, such as whether the award is voted on by people who are also part of the Academy Award electorate, or whether a nominee loses a precursor prize; both factors adjust each prize’s score.
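FiveThirtyEight hasn’t published code for this, but the squaring step described above can be sketched in a few lines of Python. The award names and win counts below are made-up examples for illustration, not real tallies.

```python
# Illustrative sketch of the squaring step described above.
# Agreement rates below are made-up examples, not real award tallies.

def award_score(rate: float) -> float:
    """Turn a historical agreement rate (0-1) into a 0-100 score.

    Squaring before scaling means weak predictors fall off quickly
    while strong predictors keep most of their weight.
    """
    return (rate ** 2) * 100

# Hypothetical precursor awards and how often their winner also won the Oscar
agreement = {
    "Golden Globe": 17 / 25,    # 68% agreement -> score ~46
    "SAG Award": 20 / 25,       # 80% agreement -> score 64
    "Critics' Choice": 12 / 25, # 48% agreement -> score ~23
}

for award, rate in agreement.items():
    print(f"{award}: {rate:.0%} agreement -> score {award_score(rate):.1f}")
```

Notice how the 48% predictor ends up with barely a third of the 80% predictor’s score, even though its raw agreement rate is more than half as large.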

After reviewing FiveThirtyEight’s predictive modeling, I've learned that even low-stakes polling for events like award shows depends on historical voting patterns and past outcomes. But is there danger in relying too much on historical data? If there’s one thing the 2016 U.S. presidential election taught us, it’s that predictive models are susceptible to failure when they place too much weight on historical outcomes and trends.

The main problem with the predictive polls in 2016 was that they weren’t fully representative of the actual voting population. Unlike in previous elections, a large number of voters who turned out to cast their ballots on Election Day had been missed by the predictive polls throughout the campaign. Ultimately the polls failed to accurately predict the actions of these “anonymous voters,” perhaps in large part because they failed to account for the changing cultural, demographic, and economic contexts shaping people’s decisions today. But that’s an exploration for another time. The point is, the 2016 predictive polls, based largely on historical trends, misrepresented the actual voting population.

Like the actual 2016 voting population, Academy members who vote on the Oscars are generally anonymous and can't be polled in advance of the event. This anonymity forces pollsters to get creative and base their predictive models on historical guild and press prize outcomes. And as market researchers and political pollsters know, even when voters are polled before the vote, there’s no guarantee they will actually act accordingly.

This leaves us researchers with a serious conundrum: how can we get into anonymous respondents’ heads and predict their actual decisions/voting behaviors without relying too much on historical data?

One solution might be to emphasize behavioral data, information gathered from consumers’ actual commercial behaviors, over their stated preferences and beliefs. For Oscar predictions, behavioral data might include:

  • Compiling social media mentions and search volume (via Google or Bing) for particular movies, actors, actresses, directors, etc.
  • Considering the number of social media followers nominees have and levels of online engagement
  • Tracking box office sales, movie downloads, and movie reviews

The surprising outcomes of the 2016 presidential election and Brexit taught us that there was a huge cohort of unaccounted-for voters, voters who did indeed turn out on voting day, whose turnout upended traditional predictive models.

If pollsters hadn’t relied solely on historical data and had instead used an integrated approach that included current behavioral data, perhaps their predictions would have been more successful. Plenty of voters on all sides voiced their opinions on traditional and nontraditional platforms, and capturing and accounting for that myriad of voices was a missed opportunity for pollsters.

Though the Oscars are a much lower-stakes scenario, hopefully researchers continue to learn from 2016 and expand their modeling practices to include a combination of measures. Instead of a singular approach, researchers should consider combining historical trends with current behavioral data.
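As a closing illustration of what “combining measures” could look like, here is a minimal sketch that blends a historical-trend score with a behavioral score using a simple weighted average. The 60/40 weighting, nominee names, and all numbers are invented for illustration; a real model would calibrate these against data.

```python
# Minimal sketch of blending historical and behavioral signals.
# The 60/40 weighting and all scores below are invented for illustration.

def blended_score(historical: float, behavioral: float,
                  w_hist: float = 0.6) -> float:
    """Weighted average of two 0-100 scores; w_hist goes to the
    historical-trend score, the remainder to the behavioral one."""
    return w_hist * historical + (1 - w_hist) * behavioral

# Hypothetical nominees: (historical score, behavioral score built
# from e.g. normalized search volume, social mentions, box office)
nominees = {
    "Nominee A": (46.2, 80.0),  # weak history, strong current buzz
    "Nominee B": (64.0, 35.0),  # strong history, little buzz
}

for name, (hist, behav) in nominees.items():
    print(f"{name}: blended score {blended_score(hist, behav):.1f}")
```

The point isn’t the particular formula; it’s that a current behavioral signal can pull a prediction away from what history alone would say.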

Interested in learning more about predictive analytics? Check out Dr. Jay’s recent blog post on trusting predictive models after the 2016 election.

 Frances Whiting is an Associate Researcher at CMB who is looking forward to watching the 89th Academy Awards and the opportunity to try her hand at predictive analytics!

Topics: television, predictive analytics, Election