And the winner is!

Michael Wagstaff • 21 December 2020

How we did using sentiment analysis to predict Strictly outcomes


Strictly Come Dancing ended on Saturday with the grand final. We used sentiment analysis to predict that Bill Bailey would win the Glitterball, which he did. Champagne all round.


We also tweeted an updated sentiment score after each round of dancing which showed Bill stretching his lead as the show progressed.


But it wasn't just for the final that we used sentiment analysis on Strictly. Over the last eight weeks we have analysed tweets to predict the outcome of each show. So how did we do?

How accurate is sentiment analysis?

Overall, we think our Strictly Sentiment model performed really well. We got 13 out of 15 measurable outcomes correct - an 87% success rate has got to be very good in anyone's book.


Sentiment analysis is the process of determining whether some text is positive, negative or neutral in tone. To extract sentiment from tweets we built a supervised machine-learning sentiment model. To train the model we manually labelled thousands of Strictly tweets from the launch show as negative, neutral or positive and fed these into it. We paid particular attention to teaching the machine how to deal with nuance and ambiguity. For example, the sentiment behind the word 'poor' in the sentence 'Poor Jacqui having to dance with Anton' is very different from 'That dance was poor Jacqui'.
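To give a flavour of how a supervised sentiment model like this works, here is a minimal from-scratch sketch using a Naive Bayes classifier over word unigrams and bigrams. It is an illustration, not our production model: the handful of labelled tweets below stand in for the thousands of hand-labelled launch-show tweets, and the bigram features show how context separates the sympathetic 'Poor Jacqui' from the critical 'was poor'.

```python
# Illustrative sketch: multinomial Naive Bayes sentiment classifier.
# Bigrams let the model distinguish "poor jacqui" (sympathy) from
# "was poor" (criticism of the dance).
import math
from collections import Counter, defaultdict

def features(text):
    words = text.lower().split()
    # unigrams plus adjacent-word bigrams
    return words + [f"{a} {b}" for a, b in zip(words, words[1:])]

class NaiveBayesSentiment:
    def fit(self, tweets, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for tweet, label in zip(tweets, labels):
            for f in features(tweet):
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, tweet):
        scores = {}
        total = sum(self.class_counts.values())
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # class prior
            n = sum(self.feature_counts[label].values())
            for f in features(tweet):
                # Laplace smoothing so unseen features don't zero things out
                score += math.log((self.feature_counts[label][f] + 1) /
                                  (n + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

train = [
    ("Poor Jacqui having to dance with Anton", "neutral"),
    ("That dance was poor Jacqui", "negative"),
    ("Bill was absolutely brilliant tonight", "positive"),
    ("What a dreadful dance", "negative"),
    ("Loved every second of that samba", "positive"),
]
model = NaiveBayesSentiment().fit([t for t, _ in train], [l for _, l in train])
print(model.predict("That was brilliant"))
```

In practice a production model would use far richer features and more training data, but the principle is the same: hand-labelled examples teach the classifier which word patterns signal each sentiment.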


As the 87% success rate shows, that training effort paid off.


Each week we grabbed thousands of tweets about each contestant. We then ranked the contestants on sentiment and gave points depending on their rank. This is the same method the BBC uses to give each contestant a score based on the public vote. We then combined the ranked sentiment score with the judges' score to predict which two contestants would be in the dance-off. The prediction was tweeted one hour before the results show on Sunday. The exception was the semi-final, when we tweeted the prediction a few minutes after the show: for the semi-final we moved to analysing the tweets in real time in preparation for the final.
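The combination step can be sketched as follows. The numbers are made up for illustration, and the article does not spell out the exact point scheme, so we assume the BBC-style rank-points method it mentions: rank on sentiment, rank on judges' score, add the two sets of points, and predict the two lowest totals for the dance-off.

```python
# Illustrative sketch of the weekly prediction step.
# Sentiment shares and judges' scores below are invented for the example.
sentiment = {"Bill": 0.62, "HRVY": 0.55, "Jamie": 0.31,
             "Maisie": 0.49, "Ranvir": 0.40}
judges = {"Bill": 27, "HRVY": 29, "Jamie": 21, "Maisie": 28, "Ranvir": 24}

def rank_points(scores):
    """BBC-style rank points: lowest score gets 1, highest gets the most."""
    ordered = sorted(scores, key=scores.get)
    return {name: points for points, name in enumerate(ordered, start=1)}

combined = {
    name: rank_points(sentiment)[name] + rank_points(judges)[name]
    for name in sentiment
}
# Predicted dance-off: the two contestants with the lowest combined totals.
dance_off = sorted(combined, key=combined.get)[:2]
print(dance_off)
```

With these invented numbers Jamie and Ranvir land in the predicted dance-off, since they trail on both measures.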


Out of 14 dance-off places we got 12 right. The two times we got it wrong were both when we put JJ Chalmers in the dance-off. We put this down to JJ having a significant offline following. This was the one weakness of the model.


For the final, our analysis predicted that Bill would be the winner. We tweeted this prediction just after the public vote closed and 10 minutes before the live announcement on TV. Obviously, we were really pleased when Bill was announced the winner, but doubly pleased when we saw that two online viewer polls had HRVY ahead of Bill.


Our sentiment model had outperformed an established online polling method. Fab-u-lous!


Highs and lows

There have been a lot of high points for us:


  • Correctly predicting 13 out of 15 outcomes.
  • Building a sentiment model that is highly accurate and that outperformed established online polls.
  • The model was sensitive to movements in popularity - Maisie going low then high, Ranvir going in reverse.
  • Undertaking analysis in real time was a definite squeaky bum time win.
  • Seeing traffic to our website sky-rocket at the weekend and when the weekly post analysing the results was published.


But it's not all been good:


  • The model was less effective for contestants whose following was more offline than online.
  • People tweeting about the show were younger than those who watch it (although the audience data on which we made this comparison in a previous post is a few years out of date).


Can sentiment and text analysis be used more in market research?

Yes it can.


The most obvious example is analysing customer feedback, whether through a brand's own feedback channels, open-ended survey responses or social media. Sentiment analysis can be used to keep track of how a brand is performing with the public or potential buyers. Topic analysis (making sense of unstructured text) can be used to understand what consumers are talking about. Gaining valuable Voice of the Customer insights from the application of sentiment and text analysis is a clear business advantage for brands.


We use sentiment analysis in combination with topic analysis to identify themes and subjects from large unstructured data sets. We have posted quite a few examples of topic analysis in the Scribbles section of our website where we gauge public opinion on events and issues. For example, we produced an analysis of the nation's mental health during lockdown restrictions by analysing 15,000 tweets.


We also undertake sentiment and topic analysis of ratings and review sites, at both brand and product level. We scrape sites such as Trustpilot and Amazon to identify strengths and weaknesses of a brand's product offerings and those of its rivals. This analysis can also feed directly into new product development by identifying gaps in product features (compared with rivals) and opportunities in the market based on unmet customer needs.


Strictly Sentiment leaderboard

Our final Strictly Sentiment leaderboard showed a win for Bill. He really does love to boogie on a Saturday night.



And so do we.



Strictly Sentiment score


The Strictly Sentiment score is derived from a sentiment analysis of tweets. Using natural language processing and machine learning we classify each tweet as positive, neutral or negative. We only collect tweets made during the show.


We then add up all the positive tweets for each contestant and assign a score between 1 and 100. This score is based on the relative distribution of positive tweets. We do this to make it easier to compare and contrast Strictly Sentiment scores.
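One plausible reading of that scaling step (the exact formula is not spelled out above) is to rescale each contestant's positive-tweet count so the leader scores 100, keeping every score in the 1 to 100 range. The counts below are invented for illustration.

```python
# Illustrative sketch of scaling positive-tweet counts to a 1-100 score.
# Tweet counts are made up; the leader is pinned to 100 by assumption.
positive_tweets = {"Bill": 4210, "HRVY": 3890, "Jamie": 1550, "Maisie": 3320}

def sentiment_scores(counts):
    top = max(counts.values())
    # Scale relative to the leader; floor at 1 so every score is 1-100.
    return {name: max(1, round(100 * n / top)) for name, n in counts.items()}

print(sentiment_scores(positive_tweets))
```

Scaling this way makes week-on-week scores directly comparable even when the total volume of tweets varies between shows.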
