I am a data scientist with a love of numbers, so what better combination than football and statistics? Pena.lt/y is a place for me to write down my thoughts on football and its mathematical analysis. I don’t want it to just be about me though, so please let me know what you think and join the discussion!
I'm a bit late to the party and most of the interesting web domains have already been taken so I ended up using a domain hack to try and spell the word penalty – pena.lt/y.
Yes, the blog is intended to be educational, and hopefully the articles here can help generate further discussion about how best to analyse football, so I am all for sharing information. There are a few minor conditions though:
There is a huge amount of data recorded for professional football but sadly a lot of it is not freely available or easy to access. Because of this, most of the data I use is scraped from various websites across the internet and painstakingly joined together. Unfortunately, this means that I cannot share the data I collect as it would probably infringe somebody's copyright somewhere, or at least annoy them...
There is probably a very long blog post to be written at some point about the data science stack I use, but the succinct version is R, Python, PostgreSQL, MongoDB and Amazon Web Services.
You are welcome to get in touch but I warn you in advance that I'm very busy!
Maybe. Get in touch and we can discuss your ideas further to see how feasible they are.