Editor’s note: As the only SB Nation site that covers a La Liga team that wasn’t invited to the European Super League, I have felt a certain obligation to represent the ‘other’ clubs in Spain as a whole. This is why you see interviews on the site with fans from clubs like Huesca, Betis, Granada, Real Valladolid, and Celta Vigo. In that spirit I want to introduce you to our newest feature: model-based predictions for every La Liga match.
To do this, we are using a model developed by Mitchel Green of the Michigan Soccer Analytics Society and adapted for La Liga by Charlie Tuley, who has written for our site before. The model will update every week and grow stronger as the season feeds it more data. Below, Mitchel and Charlie describe how the model works, and at the bottom you can see this week’s predictions.
How the model works
The new club season is upon us this weekend, and with it come lots of questions. How will Barcelona perform without Messi? Will Atletico Madrid be able to retain their title? How will the promoted teams perform? The answers will reveal themselves over the course of the next nine months, but is there a way to get ahead of the curve in predicting them? With the match prediction model I’ve created, there might be.
I created this match prediction model last January and used it to try to predict results from the Premier League, Bundesliga, and MLS (you can see weekly updates for those leagues on my Twitter @msocanalytics). It has performed fairly well so far, with accuracy in the 48%-51% range in the leagues I’ve looked at. This season, I’m branching into La Liga with the help of Charlie Tuley (@analyticslaliga on Twitter). He’s much more knowledgeable about the league than I am, and he’ll be able to help me tweak the model throughout the season.
The model is calculated on a weekly basis, so at the moment we unfortunately won’t be able to see how it thinks the entire season will go, but we will be able to see which teams are trending up and down. We’re starting out the season based on last year’s expected goals data taken from fbref.com. One issue with this early on is that certain teams will have had coaching and personnel changes that the model has not caught on to yet. Another is that we don’t know how the promoted teams’ expected goals tallies will translate. I’ve tried to estimate that from previous seasons of promoted teams, but our data only goes back a couple of years.
These problems shouldn’t be an issue a few weeks into the season, but they could throw up some wacky projections early on: see Espanyol as strong favorites away to Osasuna. You’ll also notice a lot of draws as projected scorelines. This likely stems from the fact that the average non-penalty expected goals per 90 of a team in La Liga last season was about 1.07, which means the league as a whole was very solid defensively (or bad offensively, you decide which).
As of right now, the model uses a five-week average of expected goals to project the next week’s results. I’ve found that this works well at capturing a team’s short-term form, and it performs fairly well over the long term. Using a Poisson distribution, we take each team’s expected goals for and against for the week to determine the chance of each result. Over the course of the season we may implement new factors as we get more data, such as finishing quality, home field advantage, and opposition adjustments. But for now the model will stay simple.
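For readers who want to see the idea in code, here is a minimal Python sketch of a five-week average feeding a Poisson calculation. It is an illustration and not the model’s actual code: the function names, the sample xG numbers, and in particular the choice to average a team’s expected goals for with its opponent’s expected goals against are all assumptions made for the example.

```python
from math import exp, factorial


def rolling_mean(values, window=5):
    """Average of the most recent `window` values (fewer if not available)."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow Poisson(rate)."""
    return exp(-rate) * rate ** k / factorial(k)


def match_probabilities(home_rate, away_rate, max_goals=10):
    """Win/draw/loss chances and the single most likely scoreline.

    Assumes the two teams' goal counts are independent Poisson variables
    and ignores scorelines above `max_goals` (a negligible tail at these rates).
    """
    home_win = draw = away_win = 0.0
    best_score, best_prob = (0, 0), 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
            if p > best_prob:
                best_score, best_prob = (h, a), p
    return {
        "home_win": home_win,
        "draw": draw,
        "away_win": away_win,
        "most_likely_score": best_score,
    }


# Hypothetical xG figures for the last five matches (not real data).
home_xg_for = [1.3, 0.9, 1.6, 1.1, 1.4]
home_xg_against = [0.8, 1.2, 1.0, 0.7, 1.1]
away_xg_for = [0.9, 1.0, 0.7, 1.2, 0.8]
away_xg_against = [1.1, 1.4, 0.9, 1.3, 1.0]

# Assumed way of combining attack and defence: a simple average of the two.
home_rate = (rolling_mean(home_xg_for) + rolling_mean(away_xg_against)) / 2
away_rate = (rolling_mean(away_xg_for) + rolling_mean(home_xg_against)) / 2

print(match_probabilities(home_rate, away_rate))
```

Under these assumptions, plugging in the league-average figure of 1.07 for both teams makes 1-1 the single most likely scoreline, with a draw coming up roughly 30% of the time, which is consistent with a low-scoring league producing plenty of projected draws.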
This week’s predictions:
Editor’s note: Mitchel and Charlie are off to a good start, predicting a Villarreal win to start the season!