Okay, before we get going on Matchday 2, let’s review how things went in MD1 for our prediction model from Charlie and Mitchel:
Matches like Villarreal’s from last week do their model no favors. Villarreal had a 49% likelihood of winning, they won the xG battle 1.1 to 0.3, and they just never quite found their breakthrough, so the match ended in a draw. Technically, that’s a loss for the model, but from my perspective the model accurately predicted how the game was going to go. Granada never looked like winners in that one, and Villarreal played well enough to be significant favorites.
It’s also worth noting that football is, to a degree, a game of chance. A single ball could hit a post or take a deflection and change a whole match. So, for me, when I look at their model and see a match end in a draw, I don’t automatically count it as a loss for the model. I see their result predictions last week more the way we’d see a football team’s record: 4 wins, 5 draws, 1 loss (Valencia beating Getafe). In all five of those draws the model had one team winning as the most likely result, but the ‘most likely scoreline’ predicted two of them as draws anyway.
So in general, I feel like the model painted a very solid picture of how matchday one was going to go. Only five teams won last week, and they picked four of them. Only one team they thought most likely to win actually lost, and that came in the prediction with the smallest percentage difference their model made.
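For anyone who wants to keep the same kind of scorecard at home, here is a minimal sketch of the win-draw-loss tally described above. The function name `model_record` and the "Team A" style placeholders are mine, not anything from Charlie and Mitchel. Only the Villarreal draw and the Getafe pick that Valencia beat come from last week's results; the other rows are stand-ins that fill out the 4-5-1 record.

```python
from collections import Counter


def model_record(results):
    """Tally a prediction record the way we'd read a team's record.

    `results` is a list of (predicted_winner, actual_winner) pairs,
    where actual_winner is None if the match ended in a draw.
    Returns a (wins, draws, losses) tuple.
    """
    record = Counter()
    for predicted, actual in results:
        if actual is None:
            record["draws"] += 1    # match drawn: not counted as a miss
        elif actual == predicted:
            record["wins"] += 1     # the model's favorite won
        else:
            record["losses"] += 1   # the model's favorite lost
    return record["wins"], record["draws"], record["losses"]


# Matchday 1, with placeholders for the matches not named in the post.
md1 = [
    ("Villarreal", None),        # favored, drew with Granada
    ("Getafe", "Valencia"),      # the one outright miss
    ("Team A", "Team A"),        # hypothetical placeholder rows below
    ("Team B", "Team B"),
    ("Team C", "Team C"),
    ("Team D", "Team D"),
    ("Team E", None),
    ("Team F", None),
    ("Team G", None),
    ("Team H", None),
]

print(model_record(md1))
```

Counting draws in their own column, instead of lumping them in with the losses, is the whole point: it separates "the favorite lost" from "the favorite didn't quite get over the line."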
The week one model was based on data from last season, and struggled to account for player additions and departures. As the season progresses, the model will lean more and more heavily on data from this season, which should give us more accurate results. There’s also a home field advantage component that wasn’t present in week one that will continue to develop as the year goes along, and that is reflected in what you see below.
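A quick aside on that shift from last season's data to this season's: I don't know how Charlie and Mitchel actually weight the two, so treat the following purely as an illustration of the idea. The function `blended_rating`, the straight-line weighting, and the ten-match cutoff are all my own assumptions.

```python
def blended_rating(last_season, this_season, matches_played, full_weight_after=10):
    """Blend two ratings for a team, trusting this season more as it goes on.

    Assumption (not the authors' method): the weight on this season's
    data grows linearly with matches played and reaches 100% after
    `full_weight_after` matches.
    """
    weight_now = min(matches_played / full_weight_after, 1.0)
    return (1 - weight_now) * last_season + weight_now * this_season


# A hypothetical team rated 1.0 last season and 2.0 so far this season.
for played in (0, 5, 20):
    print(played, blended_rating(1.0, 2.0, played))
```

With no matches played the rating is all last season; halfway to the cutoff it is an even split; past the cutoff it is all this season.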
Once again, Villarreal is a favorite, though this time by a narrower margin, as we go on the road to face Espanyol. At some point we’ll need to get Charlie on a podcast or something to talk about these ‘most likely scorelines’ because for the second week in a row the goal totals are just unrealistically low. I look forward to seeing how he and Mitchel address that as the season goes on. You’ll notice the home field component making a difference immediately. Last week, only three home teams were favored to win as the most likely result. This time around six different home teams are favored to win, and only Real Madrid, Barcelona, and Sevilla are notably heavy road favorites.
As ever, give Charlie (@analyticsLaLiga) and Mitchel (@msocanalytics) follows on Twitter to see all the great work they do with football and data. To see how their prediction model works, click here.