What the heck do pass attempts tell us, anyway?

By Jared Young (@jaredeyoung)

A long time ago, in a galaxy far far away, a sport called baseball ruled the American sports universe. I was obsessed with the sport myself. Every day I sat at the kitchen table and pored over the box scores in the daily paper. They were the best way of connecting to my beloved Pirates. So today, as MLS and USMNT games roll in, I check out the box scores on MLSSoccer.com and Whoscored. It's a way to get a sense of the game, to connect with a game I didn't see.

But what are these box scores actually telling me? Even as soccer analytics relies more and more on data scientists for insights, I feel like the box scores are still a bit of a mystery. Take total passes attempted: what does that number tell us about a game? One MLS game this year had just 660 passes attempted combined, while another had 1,189. What should I make of such a big swing? Does it matter? Those games probably looked dramatically different, and my guess is fans would want to see more passing, not less. Wanting to know more about soccer box scores, I decided to find out what lies behind the statistic that is total passes.

First, here is the distribution of total passes attempted per game so far this year in MLS.

x-axis = total passes in a game. y-axis = count of games played in that range in 2017.

This distribution would make Carl Friedrich Gauss proud. The most likely outcome is for a game to have between 900 and 949 passes attempted by both teams combined. Now let's start with a simple hypothesis: games with more passes occur because teams attempt easier passes, either by choice or due to a lack of defensive pressure. Here's the passing accuracy by game plotted against total passes.

Not surprisingly, the higher the pass completion rate, the more passes are attempted. Now let's look at what types of passes result in a higher probability of success. Looking at the total distance of each pass doesn't reveal much, but focusing on just the vertical distance is interesting.

The above reveals that the average vertical distance of the passes (i.e. how much closer each pass got the ball to the opponent's goal line) has a fairly big impact on pass completion rate, and that carries over to the key metric, total passes.

Fifty percent of the variance in pass attempts can be explained simply by how direct the passes were. We know direct passes are more difficult, but why would that substantially lower the overall volume of passes? The answer lies in the fact that direct passing results in more restarts. Restarts are passes that follow either the ball going out of bounds or a foul: goal kicks, free kicks, corner kicks and throw-ins.

The above plot shows the dramatic impact that restarts have on the number of passes in a game. In low-passing games, one in seven passes follows a restart. In high-passing games, that number drops to one in fifteen. This matters so much because the average time between kicks, when the second kick is a restart, is 32 seconds.

Let’s put that into context. The fewest restarts in an MLS game this year was 60. The most was 116. At roughly 32 seconds per restart, those 56 extra restarts make the difference in dead ball time between the two games almost THIRTY MINUTES! That’s not only a big difference in the number of passes being made, that’s a big difference in entertainment.
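For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes every restart costs the 32-second average quoted above, which is a simplification since individual restarts vary; the function and constant names are just illustrative.

```python
# Figures taken from the article; names are illustrative only.
SECONDS_PER_RESTART = 32   # average gap between kicks when the second kick is a restart
FEWEST_RESTARTS = 60       # lowest restart count in an MLS game this year
MOST_RESTARTS = 116        # highest restart count in an MLS game this year


def dead_ball_minutes(restarts, seconds_per_restart=SECONDS_PER_RESTART):
    """Estimate dead ball time in minutes, assuming every restart takes the average."""
    return restarts * seconds_per_restart / 60


gap_minutes = dead_ball_minutes(MOST_RESTARTS) - dead_ball_minutes(FEWEST_RESTARTS)
print(f"{gap_minutes:.1f} minutes")  # prints "29.9 minutes"
```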

So now, at least, I have learned that total passes is an indirect indication of how direct the passing was in the game. The fewer the passes, the more direct the game; the more passes, the less direct. Direct passing results in fewer passes because it leads to more balls out of bounds and more free kicks.

Unfortunately, combining restarts and vertical passing distance in an attempt to predict total passes doesn't get us much closer to solving the puzzle. The R-squared of that little model is just 0.51, barely above what we knew about the impact of direct passing alone. This is because direct passing and restarts are so intimately connected.
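To see why a second, closely related predictor adds so little, here is a rough sketch on made-up numbers. Nothing in it comes from the MLS data: the synthetic games, the coefficients and the function names are all assumptions, picked only so that vertical distance alone explains about half the variance. It uses the standard identity for the R-squared of a two-predictor linear regression written in terms of the three pairwise correlations.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_squared_one(y, x):
    """R-squared of a simple linear regression of y on x."""
    return pearson(y, x) ** 2


def r_squared_two(y, x1, x2):
    """R-squared of a linear regression of y on x1 and x2 (with intercept)."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


# Synthetic games (NOT real MLS data): restarts track vertical distance closely,
# and total passes depend on vertical distance plus a lot of noise.
rng = random.Random(0)
vertical = [rng.gauss(6.0, 1.0) for _ in range(300)]
restarts = [85 + 8 * (v - 6.0) + rng.gauss(0, 3) for v in vertical]
passes = [925 - 70 * (v - 6.0) + rng.gauss(0, 70) for v in vertical]

r2_vertical = r_squared_one(passes, vertical)
r2_both = r_squared_two(passes, vertical, restarts)
print(f"vertical only: {r2_vertical:.2f}, vertical + restarts: {r2_both:.2f}")
```

Because the second predictor mostly repeats what the first already says, the two R-squared values come out nearly identical, which is the same pattern the article's model shows.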

Two final questions. The first is whether direct passing is the result of a more aggressive defense or just the style of the team in possession. Defense does not appear to be forcing a style of play. There is no relationship between the number of defensive actions (even in different parts of the field) and vertical passing, total passing, or restarts.

Lastly, a question on your mind might be whether total passes are related to scoring. Does more passing equal more scoring? The answer is no. In fact, the correlation could not be any closer to zero. Looking at total passes tells you nothing about the final score or how aggressive the defense was, but it can tell you about the style of passes and how often the ball was dead. And, subjectively speaking, it might tell you whether you would have enjoyed watching the game in person. May you now attack those box scores with awareness and fervor!