I've long been critical of radar plots for visualizing football data, and a reader of this blog recently challenged me to come up with a better alternative. So here we go.
First of all, why are radar plots so bad?
Well, as Luke Bornn's recent tweet so excellently shows, the shape of the plot is controlled by the order of the variables - swap a few of them around and suddenly the shape you are looking at is completely different, making visual comparisons difficult and prone to error. Essentially, the shape you are looking at and judging is meaningless as it's determined purely by the author's choice of layout.
A reminder, blatantly plagiarized from @stat_sam, of why radar plots are misleading. Eye focuses on area, not length. pic.twitter.com/Dk3gcn1GD1
— Luke Bornn (@LukeBornn) 17 May 2017
Another major issue is that the area of the radar's shape does not increase linearly with the values. If you double a value on a bar chart then the bar representing it doubles in area too. This is a good thing as it means we can compare data easily.
However, double the values in a radar plot and the area of the shape quadruples, because the area grows with the square of the values. This magnifies small differences, making them appear much bigger than they really are and distorting any visual comparisons we make. This is a bad thing, a very bad thing.
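To put some numbers on both complaints, here is a minimal Python sketch. It assumes the radar has equally spaced axes and that the shape is the polygon joining each value on its axis; the scores are made up purely for illustration and `radar_area` is my own helper name, not part of any plotting library.

```python
import math

def radar_area(values):
    """Area of the polygon a radar plot draws for `values` on equally spaced axes."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n)  # sine of the angle between adjacent axes
    # Each pair of neighbouring axes forms a triangle of area 0.5 * r1 * r2 * sin(angle).
    return 0.5 * wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

values = [9, 2, 8, 3, 9, 2]        # hypothetical metric scores
reordered = [9, 9, 8, 3, 2, 2]     # the same scores, just in a different axis order
doubled = [2 * v for v in values]

# Doubling every value quadruples the area rather than doubling it.
print(radar_area(doubled) / radar_area(values))

# Identical data, different axis order, different area.
print(radar_area(values), radar_area(reordered))
```

The second print is the one that bothers me most: nothing about the player has changed, yet the reordered shape is noticeably larger.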
Neil Charles showed some player comparisons at the Opta Pro Forum a couple of years back. These were based on bullet graphs but with the central bar replaced with a strip plot to try and highlight the distribution of the data.
Gylfi, because he's topical. First four boxes raise serious questions (which others inc. @mixedknuts have answered in detail) #Swans #EFC pic.twitter.com/yzowruuywW
— Neil Charles (@neilcharles_uk) July 13, 2017
I remember being really impressed with these at the time but always felt that the central strip plot didn't quite work as it was one-dimensional and the overlap of the points obscured the true distribution of the data. If you look at the example above you can see it all kind of merges into one grey line down the centre of the plot.
Another idea that has been gaining a little bit of traction recently is the joyplot. Joyplots are great as they show the distribution of the data really clearly, provided they are plotted at a reasonable size. However, as soon as you shrink them to fit a few onto a page, or overlap them, the data gets obscured, making visual comparisons tricky.
Crosby, McDavid, and the Greatest Player in the World... pic.twitter.com/VbZAm372ck
— Ryan Stimson (@RK_Stimp) July 14, 2017
After much consideration I settled on going with a swarm plot. The swarm plot takes the central strip used by Neil Charles but instead of flattening it into a single dimension where points can obstruct each other, it applies just enough jitter to each point to separate them and prevent any overlaps.
This can create a chart that looks a little like a swarm of bees, hence the name, but importantly it also means that you can clearly see the distribution of the underlying data without any overlapping points obscuring each other.
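For anyone curious how the points get separated, here is a rough sketch of the idea in plain Python. It is a simplified layout and not necessarily what any particular plotting library does: points are placed in value order, each one as close to the centre line as it can go without touching a marker that is already there. The function name `swarm_offsets` and the assumption that values and offsets share the same units as the marker `diameter` are mine.

```python
def swarm_offsets(values, diameter):
    """Return a sideways offset for each value so that no two markers overlap.

    Simplified beeswarm layout: points are placed in value order, each one
    as close to the centre line as possible without touching a placed marker.
    """
    placed = []                        # (value, offset) of markers already positioned
    offsets = [0.0] * len(values)
    for i in sorted(range(len(values)), key=lambda k: values[k]):
        v = values[i]
        step = 0
        while True:
            # Try 0, then +d/2, -d/2, +d, -d, ... moving away from the centre line.
            if step == 0:
                candidates = [0.0]
            else:
                candidates = [step * diameter / 2, -step * diameter / 2]
            free = [c for c in candidates
                    if all((v - pv) ** 2 + (c - po) ** 2 >= diameter ** 2
                           for pv, po in placed)]
            if free:
                offsets[i] = free[0]
                placed.append((v, free[0]))
                break
            step += 1
    return offsets

# Hypothetical metric values; the clustered ones get pushed sideways.
print(swarm_offsets([0.10, 0.11, 0.12, 0.12, 0.30], diameter=0.05))
```

Plot each value against its offset and you get the swarm: isolated points stay on the centre line, while crowded regions fan out in proportion to how many players sit there.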
I kept Neil Charles' approach of coloring the data by quintiles so you can quickly see what group the player falls within. For example, a player in the red section is in the bottom 20% of players for that particular metric (bad), a player in the yellow is in the middle 20% (average), and a player in the dark green is in the top 20% (yay!).
I've also added a player rating, which is just the average of all the player's quintiles. I'm not fully convinced of the merits of this, but I've shown this style of chart to a few people and pretty much everybody asked what the player's average was, so I've included it on the chart.
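The quintile and rating logic is simple enough to sketch in a few lines of Python. This is an illustration under my own assumptions rather than the exact code behind the chart: it assumes higher is better for every metric, the metric names and numbers are invented, and the two in-between colours are guesses since only red, yellow and dark green are described above.

```python
from bisect import bisect_left

# Red, yellow and dark green follow the description above;
# the two in-between colours are assumptions.
QUINTILE_COLOURS = {1: "red", 2: "orange", 3: "yellow", 4: "light green", 5: "dark green"}

def quintile(value, all_values):
    """Quintile (1 = bottom 20%, 5 = top 20%) of `value` among `all_values`."""
    ranked = sorted(all_values)
    below = bisect_left(ranked, value)       # number of players strictly below
    return below * 5 // len(ranked) + 1      # integer maths avoids rounding surprises

def player_rating(player, league):
    """Average of the player's quintiles across every metric in `player`."""
    quintiles = [quintile(player[m], [p[m] for p in league]) for m in player]
    return sum(quintiles) / len(quintiles)

# Hypothetical league of ten players with two made-up metrics.
league = [{"shots": s, "passes": 10 * s} for s in range(1, 11)]
player = {"shots": 10, "passes": 10}         # best for shots, worst for passes

print(QUINTILE_COLOURS[quintile(player["shots"], [p["shots"] for p in league])])  # dark green
print(player_rating(player, league))                                              # 3.0
```

The example also shows why I'm wary of the rating: a player who is elite at one thing and poor at another averages out to exactly the same number as somebody who is middling at everything.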
As ever, let me know what you think.