The Power Of Goals recently blogged about using the Poisson distribution to predict the outcome of football matches. I have been evaluating the predictive ability of the Poisson for the English Premier League (EPL) this season so I thought I would share my experiences too.
For anyone who is unaware, the number of goals scored by each team in a football match roughly follows a Poisson distribution. As Figure 1 shows, the fit is not exact: the Poisson distribution underestimates the likelihood of no goals being scored, overestimates the likelihood of one, two or three goals, and underestimates again from four goals upwards. The difference between the Poisson and what is observed in the EPL is reasonably small, though, so a small fudge factor is enough to bring the two into line.
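For reference, the Poisson distribution gives the probability of a team scoring exactly k goals from a single parameter, its expected number of goals λ. The symbols k and λ are standard textbook notation rather than anything taken from my script:

```latex
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```

The mean and the variance of the distribution are both equal to λ, so one number per team per match is all the model needs.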
Figure 1: Poisson Distribution vs Observed
To carry out the predictions I have written a script in R that scrapes the Premier League table directly from the BBC’s website. The script then calculates attack and defence coefficients for each team by comparing their goals scored and conceded, home and away, with the overall EPL averages. The predicted number of goals each team scores in a particular match can then be calculated by scaling the EPL’s average goals by the two teams’ attack and defence coefficients. These predicted goals are fed into the Poisson distribution to generate a probability matrix covering each possible scoreline (Table 1). Summing the relevant probabilities then gives the probability that the match will end as a home win, draw or away win.
Goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
0 | 1.96 | 4.08 | 4.24 | 2.94 | 1.53 | 0.64 | 0.22 | 0.07 | 0.02 |
1 | 3.63 | 7.56 | 7.86 | 5.45 | 2.83 | 1.18 | 0.41 | 0.12 | 0.03 |
2 | 3.36 | 7.00 | 7.27 | 5.04 | 2.62 | 1.09 | 0.38 | 0.11 | 0.03 |
3 | 2.08 | 4.32 | 4.49 | 3.11 | 1.62 | 0.67 | 0.23 | 0.07 | 0.02 |
4 | 0.96 | 2.00 | 2.08 | 1.44 | 0.75 | 0.31 | 0.11 | 0.03 | 0.01 |
5 | 0.36 | 0.74 | 0.77 | 0.53 | 0.28 | 0.12 | 0.04 | 0.01 | 0.00 |
6 | 0.11 | 0.23 | 0.24 | 0.16 | 0.09 | 0.04 | 0.01 | 0.00 | 0.00 |
7 | 0.03 | 0.06 | 0.06 | 0.04 | 0.02 | 0.01 | 0.00 | 0.00 | 0.00 |
8 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 |
Table 1: Example Goal Probabilities (%)
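The steps described above can be sketched roughly as follows. This is a minimal Python illustration rather than my actual R script: the function names, the league average and the coefficients are all hypothetical. The two means used for the matrix (about 1.85 and 2.08) are back-calculated from the ratios of neighbouring cells in Table 1, and treating the rows as the home team is an assumption.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


def coefficient(team_goals_per_game, league_goals_per_game):
    """Team rate relative to the league rate (one plausible definition)."""
    return team_goals_per_game / league_goals_per_game


def expected_goals(league_avg, attack, opposition_defence):
    """Scale the league average by one team's attack and the other's defence."""
    return league_avg * attack * opposition_defence


def score_matrix(home_mean, away_mean, max_goals=8):
    """matrix[h][a] = P(home scores h, away scores a), goals independent."""
    return [
        [poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def outcome_probabilities(matrix):
    """Sum cells below, on and above the diagonal: home win, draw, away win."""
    home = draw = away = 0.0
    for h, row in enumerate(matrix):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical inputs, purely to illustrate the scaling step.
example_mean = expected_goals(league_avg=1.5, attack=1.2, opposition_defence=0.9)

# Means inferred from Table 1; which side is "home" is an assumption.
matrix = score_matrix(home_mean=1.85, away_mean=2.08)
home, draw, away = outcome_probabilities(matrix)
print(f"0-0: {matrix[0][0]:.2%}  1-1: {matrix[1][1]:.2%}")
print(f"home {home:.1%}  draw {draw:.1%}  away {away:.1%}")
```

Because the matrix stops at eight goals per team, the three outcome probabilities sum to very slightly less than 100%.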
Since the predictions are based on past performance this season, I waited until week five of the EPL to start testing the model so I had at least a month’s worth of previous results to work with. The first week went well, with the model correctly predicting the outcome of six of the ten matches that weekend. Table 2 shows the predicted outcome of each match and the probability (%) the model assigned to that outcome. From this I also calculated the corresponding odds and compared them with those available from Betfair.
Home | Away | Prediction | Probability (%) | Odds | Betfair | Result |
Swansea | Everton | HOME | 56.3 | 1.78 | 3.35 | AWAY |
Chelsea | Stoke City | HOME | 63.4 | 1.58 | 1.39 | HOME |
Southampton | Aston Villa | AWAY | 49.2 | 2.03 | 3.1 | HOME |
West Brom | Reading | HOME | 41.1 | 2.43 | 1.82 | HOME |
West Ham | Sunderland | HOME | 35.7 | 2.80 | 2.24 | DRAW |
Wigan | Fulham | AWAY | 40.1 | 2.49 | 3.25 | AWAY |
Liverpool | Man Utd | AWAY | 75.6 | 1.32 | 2.82 | AWAY |
Newcastle | Norwich | HOME | 82.9 | 1.21 | 1.84 | HOME |
Man City | Arsenal | AWAY | 37.1 | 2.70 | 1.78 | DRAW |
Tottenham | QPR | HOME | 41.1 | 2.43 | 1.51 | HOME |
Table 2: EPL Week 5 Predictions
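The Odds column in Table 2 is consistent with decimal odds taken as the reciprocal of the predicted probability. The sketch below recomputes those odds and the week's hit rate from the table rows. The variable and function names are illustrative and not from my R script.

```python
# (home, away, predicted outcome, probability %, actual result) from Table 2
week5 = [
    ("Swansea", "Everton", "HOME", 56.3, "AWAY"),
    ("Chelsea", "Stoke City", "HOME", 63.4, "HOME"),
    ("Southampton", "Aston Villa", "AWAY", 49.2, "HOME"),
    ("West Brom", "Reading", "HOME", 41.1, "HOME"),
    ("West Ham", "Sunderland", "HOME", 35.7, "DRAW"),
    ("Wigan", "Fulham", "AWAY", 40.1, "AWAY"),
    ("Liverpool", "Man Utd", "AWAY", 75.6, "AWAY"),
    ("Newcastle", "Norwich", "HOME", 82.9, "HOME"),
    ("Man City", "Arsenal", "AWAY", 37.1, "DRAW"),
    ("Tottenham", "QPR", "HOME", 41.1, "HOME"),
]


def decimal_odds(probability_pct):
    """Fair decimal odds implied by a probability given in percent."""
    return 100.0 / probability_pct


# Count the matches where the predicted outcome matched the result.
correct = sum(1 for _, _, predicted, _, actual in week5 if predicted == actual)
accuracy = correct / len(week5)

for home, away, predicted, pct, actual in week5:
    print(f"{home} v {away}: {predicted} at {decimal_odds(pct):.2f}, result {actual}")
print(f"{correct} of {len(week5)} correct ({accuracy:.0%})")
```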
Since then, the Poisson model has correctly predicted between 30% and 60% of matches each week (Figure 2). So far, the average accuracy is 46%, which is 13 percentage points higher than the 33% we could expect from guessing at random among the three possible results.
Figure 2: Weekly Performance of Poisson Predictive Model
I am hopeful the model’s success rate will improve over the course of the season as it gets more data to work with. There are also further improvements that can be made. For example, the model currently treats the goals scored by each team as independent events. However, the two may well be correlated, as it seems intuitive that the more goals one team scores, the less likely the opposition is to score. At the moment, though, I wouldn’t place too much faith in the Poisson model.
richard vadoret - August 13, 2013
Hi Martin ,
Your graph in Figure 1 shows that your Poisson distribution can predict under/over 2.5 goals in the final score with good accuracy, doesn’t it?
thanks
richard
Adarsh - October 11, 2014
Try poisson distribution on France Ligue 2… quite accurate…
Martin Eastwood - October 15, 2014
Cool, will take a look!