Does data do what we think it does?


How clever do you need to be to work in football analytics?

Some recent Twitter threads discussed the fact that Liverpool employ numerous Harvard- and Oxbridge-educated people within their analytics department. Most advertised jobs now want at least a first-class degree in data science (but don't expect more than £18k a year), and the Opta Pro lineup features some of the brightest young minds doing incredibly complex things.

Don't worry, my learned reader, this won't be an anti-intellectual rant. I love football data and I think it can be useful, though most of it is just fun. Fun is good, though; this isn't a criticism.

I also think we are already at or near the limits of what can be usefully done with the publicly available data.

Decontextualised numbers - like how many tackles a midfielder makes, or pass completion rates - are nice to discuss but ultimately irrelevant. If Gana is winning the ball back a massive amount, it tells me both that he is very good at it and that Everton have a gameplan which sees a lot of changes of possession in the middle of the field. If Everton had a number 10 who held on to the ball more, I've no doubt Gana would drop down the tackling league table. But that wouldn't mean he was worse.
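To make that concrete, here is a rough sketch of the same midfielder under two different gameplans. Every number is made up for illustration, and the function name and the "opposition possessions through midfield" measure are my own assumptions, not something a data provider publishes in this form.

```python
def tackles_per_100_chances(tackles_per_game, opp_midfield_possessions_per_game):
    """Tackles per 100 opposition possessions through the player's zone.

    Both inputs are hypothetical per-game averages.
    """
    return 100 * tackles_per_game / opp_midfield_possessions_per_game


# Gameplan A: lots of turnovers in midfield, so lots of chances to tackle.
open_game = tackles_per_100_chances(4.0, 40)

# Gameplan B: the number 10 keeps the ball, so fewer chances to tackle.
controlled_game = tackles_per_100_chances(2.5, 25)

# Raw tackles fall from 4.0 to 2.5 per game, but the rate is identical.
print(open_game, controlled_game)
```

The raw count drops by more than a third, yet the player is winning the ball at exactly the same rate whenever he gets the chance. Even this adjusted figure needs context, of course, but it shows how much of a raw number can come from the gameplan rather than the player.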

Better are stats like passes to the final third or deep completions, but even those need context. Some deep midfielders have a tactical job to progress the ball forward (like Fred at Shakhtar). Other teams use their pivot the way Portugal use Carvalho, who tends to play lots of his passes out to the fullbacks, who then progress the ball upfield.

That doesn't mean that Fred is better than William Carvalho, he just has a different function.

I think this is well understood by people who read blogs like this.

These types of single-figure stats, along with lots of other useful graphs, data, etc., are available commercially.

So if this is available commercially, then surely all these PhDs must be working on things that aren't commercially available.

Like what though? I'm presuming, given the intellectual calibre of the people being recruited, that it is something us norms wouldn't be able to do. Presumably big data research projects. Perhaps spatial modelling to help players anticipate where gaps will appear? Vision tracking to see if increasing the amount midfielders look about makes them better?

Given you can probably recruit a top-quality team of researchers for the cost of a third-choice goalkeeper, this is probably a good investment for the really rich clubs. If their findings help one more youth team prospect reach the top 0.0001% of players (rather than one of those useless 0.001% of players who play Sunday League), then it can be worth tens of millions to the club.

The question will always be: how transferable is this type of cutting-edge academic research to the actual field of play?

If our cutting-edge research analysing every shot ever taken shows us long shots are generally bad, what do we do with that information? We can train more on creating shots with a higher xG and feed back to the players that speculative shots are bad. But what if we have an excellent long-range shooter on our books? How do we know he is an excellent long-range shooter? Does his presence change the advice? Can we ever really advise on these things, or should we concentrate more on creating a team where players trust their instinct?
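The "how do we know" question is harder than it looks, because nobody takes that many long shots. Below is a minimal sketch of one way to frame it: if a perfectly average shooter took the same number of long shots, how often would he score at least as many just by luck? The 3% average conversion rate and the player's record are numbers I have invented for illustration, and treating every long shot as an identical coin flip is a big simplification.

```python
from math import comb


def prob_at_least(goals, shots, conversion_rate):
    """Chance of scoring at least `goals` from `shots` attempts,
    if every shot independently goes in with probability `conversion_rate`
    (a simple binomial model)."""
    return sum(
        comb(shots, k) * conversion_rate**k * (1 - conversion_rate) ** (shots - k)
        for k in range(goals, shots + 1)
    )


# Hypothetical inputs, not real data.
AVERAGE_LONG_SHOT_RATE = 0.03  # assumed league-wide conversion from distance
player_goals, player_shots = 4, 60  # assumed record of "our" long-range shooter

luck = prob_at_least(player_goals, player_shots, AVERAGE_LONG_SHOT_RATE)
print(f"Chance an average shooter matches this record: {luck:.1%}")
```

If that chance comes out as anything other than tiny, a season or two of shots can't separate a genuinely special striker of the ball from an ordinary one on a good run, which is rather the problem with giving advice based on it.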

I'm conflicted on this. I love data, but (complicated sentence alert) do we have the data that shows that it is the actual data analysis leading to better decisions being made rather than the mindset that makes data part of the discussion?

I reckon the process of simply thinking carefully about decisions is probably where most of the value is, and that by involving data people in the discussions you are including people who, by their very nature, like to think things through and justify them.

But I don't have the data to prove it.





