Analysing Football Teams Using Cluster Analysis and Principal Component Analysis

Posted by Martin Eastwood on August 30, 2013

The amount of football data available is growing rapidly – with every passing week of the season more matches are played and even more data gets collected. This is great as it allows us to increase our understanding of the game but it also means we quickly end up with more information than could ever be analysed manually.

Instead, we can use techniques such as cluster analysis and principal component analysis (PCA) to critically analyse these large sets of football data to identify important patterns and relationships that can help explain a team’s performances.

Find out more by reading the full article on the Onside Analysis blog here.

About Martin Eastwood

Martin is a football fan and data scientist. In his spare time he likes to combine the two and write about the mathematical analysis of football.


There are 3 Comments

  1. - September 14, 2013

    Hi Martin,

    First of all, great blog!
    I was wondering how you computed the bookie’s probabilities to determine the PRS? Let’s say the odds are 4, 3 and 1 for home win, draw and away win. Did you compute the probability of a home win as 1/4, or as 4/9 to ensure the probabilities sum to one? And why?

    Humphrey

  2. - September 15, 2013

    Hi Martin,

    I mean of course (1/4)/(1/4+1/3+1/1) to ensure the sum over the probabilities equals one?

    Humphrey

    • Martin Eastwood
      - September 15, 2013

      Hi Humphrey – the simplest way to get the bookies’ odds to add up to one is to convert them to implied probabilities (one divided by the decimal odds), add those together, and then divide each of the home, draw and away probabilities by that total to rescale them.

      The idea is that this removes the overround, but you are of course presuming that the overround is applied equally to the three outcomes, which may or may not be the case.
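The rescaling discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not the code used for the blog's analysis: it assumes the odds are quoted as decimal odds, reuses Humphrey's example values of 4, 3 and 1, and the function name `implied_probabilities` is made up for this sketch.

```python
def implied_probabilities(decimal_odds):
    """Rescale bookmaker odds into probabilities that sum to one.

    Assumes decimal odds and that the overround is spread
    proportionally across all outcomes, which may not hold in practice.
    """
    # Raw implied probability of each outcome is 1 / decimal odds.
    raw = [1.0 / odds for odds in decimal_odds]
    # The raw values sum to more than one; the excess is the overround.
    total = sum(raw)
    # Dividing by the total rescales the probabilities to sum to one.
    return [p / total for p in raw]


# Humphrey's example: home win, draw, away win.
odds = [4.0, 3.0, 1.0]
probs = implied_probabilities(odds)
overround = sum(1.0 / o for o in odds) - 1.0

print("home, draw, away:", [round(p, 4) for p in probs])
print("overround:", round(overround, 4))
```

For these example odds the home-win probability works out as (1/4)/(1/4 + 1/3 + 1/1) = 3/19, which matches the expression in Humphrey's follow-up comment.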
