I finally wrote about the presentation I gave at the Opta Pro 2018 Forum....
I wrote up the poster presentation I gave at the 2017 Opta Pro Forum for the Opta Pro blog. It looks at using machine learning to quantify footballers' decisions....
My Twitter feed seems to be increasingly taken up with discussions of Expected Goals in football, yet there always seems to be something important missing from the discussion, and that's uncertainty...
My last article on expected goals introduced the concept of using exponential decay to estimate the probability of scoring based on the shooter’s distance from the goal. The article received lots of feedback (thanks everyone!!), with a couple of common comments standing out that I wanted to address.
In my last article on expected goals, I showed how to incorporate the distance from goal along the Y axis into the expected goal model using Pythagoras’ Theorem. This all worked pretty well, giving us an R-squared value of 0.95. However, while the R-squared value was good, there was still a flaw in the model that we need to fix.
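As a rough illustration of the idea described above (not the model from the article itself), the sketch below combines the two distances with Pythagoras' Theorem and feeds the result into an exponential decay curve. The function names, the coordinate convention (`x` measured out from the goal line, `y` sideways from the centre of the goal) and the coefficients `a` and `b` are all assumptions made for the example, not fitted values.

```python
import math


def shot_distance(x, y):
    """Straight-line distance to the centre of the goal.

    x: distance out from the goal line (assumed convention)
    y: sideways offset from the centre of the goal (assumed convention)
    """
    # Pythagoras: hypotenuse of the right triangle with sides x and y
    return math.hypot(x, y)


def expected_goal(distance, a=1.0, b=0.1):
    """Exponential decay of scoring probability with distance.

    a and b are placeholder coefficients, not values fitted to real shots.
    """
    return a * math.exp(-b * distance)


# A shot taken 12 units out and 5 units wide of centre
d = shot_distance(12, 5)
xg = expected_goal(d)
```

With real shot data, `a` and `b` would be estimated by fitting the curve to observed conversion rates at each distance.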
Expected goals are one of the hot topics in the football analytics community at the moment, and I’ve previously written a number of articles discussing how to calculate them. If you haven’t read those pieces yet, it’s probably worth taking a quick look to set the context for the rest of this article.
When I introduced my Expected Goals model a few weeks back, a number of people commented on the bump in the curve where I had included penalty shots in the data set used to fit the model...
Since my last post about how to calculate expected goals, one question has come up more than any other, and that is about the correlation between expected goals and actual goals...
It seems that everybody has their own expected goals model for football nowadays, but they all seem to be top secret and all appear to give different results. So I thought I'd post a quick example of one technique here to try and stimulate a bit of chat about the best way to model them.