The Next Level of xG: Expected Possession Goals

By Cheuk Hei Ho (@tacticsplatform), Eliot McKinley (@etmckinley), and Jamon Moore (@jmoorequakes)

Using xPG variants to assess risk-and-reward of the game

We introduced Expected Possession Goals (xPG) in two recent articles. xPG groups and rates the outcome of a possession, and it grew from the idea that every action in a possession connects to create a shot. Here, we’re introducing new xPG variants, extensions of the original xPG definition that assess the risks and rewards inherent in a soccer possession.

xPG rates a group of uninterrupted events (or events separated by an interruption lasting less than two seconds) based on where the ball travels. It assumes the purpose of the possession is to move the ball within shooting distance.
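One possible reading of this grouping rule is sketched below. The event fields (`team`, `time`) and the way an interruption is measured (time between two consecutive actions by the same team) are assumptions made for illustration; the article does not specify the event format or the exact chaining logic.

```python
MAX_INTERRUPTION_SECONDS = 2.0  # threshold stated in the article


def group_possessions(events):
    """Split time-ordered events into possessions.

    Each event is a dict with hypothetical keys "team" and "time" (seconds).
    Assumption: opposing-team events do not end a possession if the original
    team acts again less than two seconds after its previous action. Those
    brief interrupting events are simply skipped in this sketch.
    """
    possessions = []
    current = []
    i = 0
    while i < len(events):
        event = events[i]
        if not current or event["team"] == current[-1]["team"]:
            current.append(event)
            i += 1
            continue
        # The other team acted: find the next event by the possessing team.
        j = i
        while j < len(events) and events[j]["team"] != current[-1]["team"]:
            j += 1
        resumed = (
            j < len(events)
            and events[j]["time"] - current[-1]["time"] < MAX_INTERRUPTION_SECONDS
        )
        if resumed:
            i = j  # brief interruption, keep the same possession going
        else:
            possessions.append(current)
            current = []  # the event at index i starts a new possession
    if current:
        possessions.append(current)
    return possessions


example_events = [
    {"team": "A", "time": 0.0},
    {"team": "A", "time": 3.0},
    {"team": "B", "time": 4.0},   # brief interruption
    {"team": "A", "time": 4.5},
    {"team": "A", "time": 8.0},
    {"team": "B", "time": 9.0},
    {"team": "B", "time": 12.0},
    {"team": "A", "time": 20.0},
]
grouped = group_possessions(example_events)
```

Here the touch by team B at 4.0 seconds does not break team A's possession, because A is back on the ball 1.5 seconds after its previous action.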

For xPG, we use a non-shot xG (NSxG) model that divides a pitch into 162 zones (zNSxG). Each zone is assigned a value based upon the mean xG of MLS shots taken within that zone since the start of the ASA era. This forms the Positive NSxG map. Additionally, we invert the map and negate all non-zero values to create the Negative NSxG map. Each action in a game (e.g. passes, dribbles, shots) is assigned a Positive and a Negative NSxG value based upon the zone where the action started. These two maps allow us to assign weighted values to all actions by the team possessing the ball, not just to shots. Put another way, the Positive NSxG is the value of the action to the offensive team, and the Negative NSxG is the potential cost of the action if the ball were turned over to the defensive team.

Note: zones containing 0.000 indicate that there are no shots from that zone in the ASA data set and thereby NSxG actions in that zone will receive no value, because the probability of a goal from that position for the attacking team (Positive NSxG) or, should a turnover occur, for the defending team (Negative NSxG), is considered to be zero.
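As a rough sketch of the two maps, the snippet below builds a Negative NSxG map from a Positive one and looks up both values for an action. The 18 x 9 grid shape, the 0-100 coordinate range, the reading of "invert" as a 180-degree rotation, and the single non-zero zone value are all assumptions for illustration; the article states only that there are 162 zones.

```python
import numpy as np

# Assumed layout: 18 zones along the pitch length x 9 across the width = 162.
N_LENGTH, N_WIDTH = 18, 9
PITCH_LENGTH, PITCH_WIDTH = 100.0, 100.0  # assumed coordinate range


def build_negative_map(positive):
    """Rotate the Positive map 180 degrees and negate the non-zero values."""
    rotated = positive[::-1, ::-1]
    return np.where(rotated != 0, -rotated, 0.0)


def zone_index(x, y):
    """Map pitch coordinates to a (row, col) zone, clamping the far edges."""
    row = min(int(x / PITCH_LENGTH * N_LENGTH), N_LENGTH - 1)
    col = min(int(y / PITCH_WIDTH * N_WIDTH), N_WIDTH - 1)
    return row, col


def action_values(x, y, positive, negative):
    """Return (Positive NSxG, Negative NSxG) for an action starting at (x, y)."""
    row, col = zone_index(x, y)
    return float(positive[row, col]), float(negative[row, col])


# Hypothetical map with one non-zero zone in front of the opponent's goal.
positive_map = np.zeros((N_LENGTH, N_WIDTH))
positive_map[17, 4] = 0.3
negative_map = build_negative_map(positive_map)

near_own_goal = action_values(2.0, 50.0, positive_map, negative_map)
near_opp_goal = action_values(98.0, 50.0, positive_map, negative_map)
```

With this toy map, an action in front of the team's own goal has no Positive value but a Negative value of -0.3, while the same action in front of the opponent's goal has a Positive value of 0.3 and no Negative value.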

We developed Positive xPG and Negative xPG because a possession generally serves two key purposes:

  1. when the possessing team advances the ball, its probability of scoring (its Positive xPG) increases.
  2. at the same time, as the possessing team moves the ball away from its own goal, the risk that a turnover immediately results in a shot for the other team decreases. This is expressed in Negative xPG.

Let’s take a look at an example possession.

actions1.png

The diagram above shows the same possession on both Positive NSxG (scoring probability) and Negative NSxG (risk) maps beginning with a goal kick (successful pass) out to the right back (1), a successful pass to the defensive midfielder (2), a successful dribble by the DM past a defender (3) and then an unsuccessful pass to the right midfielder (4). Actions (1), (2), and (3) add up to 0.009 Positive xPG and -0.197 Negative xPG for the team. Action (4) gets no value since it was unsuccessful.

In this case, the possession did not result in any shot, and Expected Goals (xG) would not assign the possession any value. The same goes for Expected Goal Chain (xGChain) and Expected Buildup (xB), which are also derived from xG. In contrast, our xPG model gives the possession the same value whether it results in a shot or not. xPG measures how likely the possession is to create a shot, and the risk if the ball were turned over during the possession, while xG measures how likely a shot is to become a goal.

Next, we define possessions that end in a shot as Successful xPG. Successful xPG differs from xG in that the value of the entire possession (its Positive xPG) is assigned to every involved player, which includes, but is not limited to, the xG of the shot. The sequence above was not Successful xPG because it did not result in a shot, but if it had, the value of each of the actions in the sequence would have been assigned to each involved player. As such, Successful xPG resembles xGChain, but it is different: xGChain gives only the shot xG value to all the players in the chain, whereas Successful xPG gives the NSxG values of all actions in the chain to each player involved.

Let’s take a look at a Successful xPG example:

Note: Successful xPG only uses the Positive NSxG map.

The diagram above shows a series of four consecutive successful actions, beginning with a ball-winning defensive action (1) by the left midfielder and ending with a shot (4) in the yellow zone by the striker. The first three actions (defensive action, successful dribble, and successful pass) total 0.052 Positive xPG. Again, we are using the value from the zone where each action took place to calculate xPG. Any possession sequence that ends in a shot receives the possession value in Successful xPG in addition to Positive xPG. The shot uses ASA’s coordinate xG model (rather than xPG’s zonal model), so the actual value may vary from the 0.111 seen here just outside the six-yard box (the yellow shaded box). For simplicity, let’s say the shot’s xG was 0.100. If we add the values of the tackle (0.008), the dribble (0.016), and the pass (0.028) to the shot’s 0.100, the total value is 0.152 for both Positive xPG and Successful xPG. That 0.152 is credited to each player who performed at least one of the four actions. In this example, actions (1), (2), and (3) were all performed by the same player, the left midfielder, who would get the 0.052 from those actions plus the 0.100 shot xG. The same values would also be given to the Positive xPG and Successful xPG of the striker who took the shot.
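The arithmetic in this example can be written out as a short sketch. The action values are the ones quoted above; the record layout (`player`, `type`, `value`) and the function name are hypothetical, not part of any ASA code.

```python
# Action values from the worked example; the dict layout is an assumption.
possession = [
    {"player": "left midfielder", "type": "tackle", "value": 0.008},
    {"player": "left midfielder", "type": "dribble", "value": 0.016},
    {"player": "left midfielder", "type": "pass", "value": 0.028},
    {"player": "striker", "type": "shot", "value": 0.100},  # simplified shot xG
]


def credit_possession(actions):
    """Give every involved player the full possession value.

    Positive xPG is always credited. Successful xPG is credited only when
    the possession ends in a shot.
    """
    total = sum(a["value"] for a in actions)
    ends_in_shot = actions[-1]["type"] == "shot"
    credit = {}
    for player in {a["player"] for a in actions}:
        credit[player] = {
            "positive_xpg": total,
            "successful_xpg": total if ends_in_shot else 0.0,
        }
    return credit


credit = credit_possession(possession)
for player, values in sorted(credit.items()):
    print(player, round(values["positive_xpg"], 3), round(values["successful_xpg"], 3))
```

Both the left midfielder and the striker end up with 0.152 in each column, which is the point of the shared-credit design: a player's involvement anywhere in the chain earns the whole chain's value.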

Finally, we quantify the materialized risk of a possession that ends with a failed pass or dribble as Mistake xPG. Mistake xPG is determined by the total Positive xPG of the following opponent possession. While the other xPG values (Positive, Negative, and Successful) are shared by all players involved in a possession, Mistake xPG is assigned only to the player who lost possession. This allows us to identify players who make the most damaging errors.

Let’s look at a Mistake xPG example:

The image at left is the same offensive possession we used above. The image at right is the subsequent possession by the opposing team that took possession on the failed pass across midfield.

Let’s revisit our earlier Positive xPG example. The possession on the Positive NSxG map ends with an unsuccessful pass from the defensive midfielder (the red arrow in the image on the left). For Mistake xPG, the xPG values of the opponent’s ensuing possession, overlaid on the Negative NSxG map, are assigned to the player who made the “mistake.” The origin of the defensive action (1) has no Negative xPG value (0.000), but it is followed by actions (2), (3), and (4), each of which is successful, for a total of -0.027 xPG. The possession ends with an unsuccessful cross, so the Mistake xPG stops accruing. The defensive midfielder is assigned the Mistake xPG of -0.027. During this possession, the opponent is accruing its own Positive xPG and Negative xPG (not shown).
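A minimal sketch of this assignment follows. The possession and action records are hypothetical, and the split of the 0.027 total across actions (2) to (4) is invented for illustration, since the article gives only the sum. The sign convention (negative, as in the example above) is also an assumption.

```python
from collections import defaultdict


def mistake_xpg(possessions):
    """Charge each turnover to the single player who lost the ball.

    possessions: time-ordered list of dicts with hypothetical keys "team"
    and "actions"; each action has "player", "type", "success", "value".
    The charge is the summed value of the opponent's successful actions in
    the very next possession, recorded as a negative number.
    """
    totals = defaultdict(float)
    for current, following in zip(possessions, possessions[1:]):
        last = current["actions"][-1]
        lost_ball = not last["success"] and last["type"] in ("pass", "dribble")
        if lost_ball and following["team"] != current["team"]:
            gained = sum(a["value"] for a in following["actions"] if a["success"])
            totals[last["player"]] -= gained
    return dict(totals)


example = [
    {"team": "A", "actions": [
        {"player": "goalkeeper", "type": "pass", "success": True, "value": 0.0},
        {"player": "right back", "type": "pass", "success": True, "value": 0.0},
        {"player": "defensive midfielder", "type": "dribble", "success": True, "value": 0.0},
        {"player": "defensive midfielder", "type": "pass", "success": False, "value": 0.0},
    ]},
    {"team": "B", "actions": [
        {"player": "opponent 1", "type": "defensive action", "success": True, "value": 0.000},
        # Hypothetical split of the 0.027 total from the article:
        {"player": "opponent 1", "type": "pass", "success": True, "value": 0.005},
        {"player": "opponent 2", "type": "dribble", "success": True, "value": 0.009},
        {"player": "opponent 2", "type": "pass", "success": True, "value": 0.013},
        {"player": "opponent 3", "type": "cross", "success": False, "value": 0.020},
    ]},
]
charges = mistake_xpg(example)
```

Only the defensive midfielder is charged, and the failed cross contributes nothing because unsuccessful actions carry no value.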

We now have four xPG variants that each provide us different information about possessions:

  • Positive xPG is the sum of the action values when overlaid on the Positive NSxG map for a particular sequence, regardless of whether the possession ends in a shot.
  • Successful xPG is the same as Positive xPG, but it is only used for possession sequences which end in a shot.
  • Negative xPG is the sum of the action values when overlaid on the Negative NSxG map for a particular sequence, regardless of whether the possession ends in a shot.
  • Mistake xPG is the Positive xPG of the opponent’s ensuing possession when a player turns the ball over.

Here are a couple of game clips to help further explain the concepts of the four xPG variants (get out your earbuds for the audio):

First, one on Positive xPG and Negative xPG with the Columbus Crew:

 

Successful xPG and Mistake xPG with the New York Red Bulls:

Now, let’s take a look at how each MLS team breaks down in each of the four categories. Values in green show the top 20% and values in red show the bottom 20%. Remember that high Negative xPG is not undesirable unless accompanied by a high Mistake xPG, so the higher values here are expressed with green rather than with red.

MLS 2018 xPG Leaders and Laggards

Team              Games  Avg Successful xPG  Avg Positive xPG  Avg Negative xPG  Avg Mistake xPG
Atlanta United    25     1.68                5.87              9.16              2.26
Chicago           26     1.17                4.77              8.99              2.22
Columbus          25     1.41                5.42              10.30             1.81
Colorado          25     1.16                4.44              8.38              2.32
DC United         22     1.26                5.08              9.06              2.53
FC Dallas         24     1.47                5.45              8.39              2.01
Houston           24     1.56                5.34              7.44              1.81
Los Angeles FC    25     1.48                6.05              9.01              1.87
L.A. Galaxy       26     1.52                5.65              8.36              1.97
Minnesota United  25     1.14                4.87              9.04              2.46
Montreal          26     0.94                4.43              7.83              2.19
New England       24     1.40                5.51              6.96              1.67
New York City FC  25     1.60                6.02              11.99             1.67
New York          24     1.50                5.58              6.55              1.71
Orlando City      24     1.38                5.58              8.76              2.01
Philadelphia      24     1.44                5.98              8.76              2.00
Portland          23     1.51                5.42              8.19              2.04
Salt Lake         26     1.18                4.88              10.17             1.84
Seattle           24     1.32                5.72              9.84              2.22
San Jose          24     1.33                5.11              8.79              2.49
Kansas City       24     1.72                6.61              8.78              1.49
Toronto           24     1.43                5.92              9.05              2.05
Vancouver         25     1.31                4.77              8.17              2.30
Average           24.5   1.39                5.41              8.78              2.04

The Successful, Positive, and Negative xPG values at a team level can be misleading in the same way a percent possession statistic can be, and they can be highly dependent on overall playing style. Teams with more attacking possession will probably have higher Positive xPG because they have more offensive actions in the offensive half of the field. Teams which play out of the back likely have higher Negative xPG because they will have more actions in the defensive half of the field. Regardless of style, Mistake xPG is bad for any team. Looking at how much of a team’s Positive xPG is converted into Successful xPG is useful in knowing if an attacking style is leading to shots. In the same way, teams which have a higher Mistake xPG relative to their Negative xPG are probably giving up a lot of chances through individual errors. Since combining these variants with each other and with other data increases the usefulness of Expected Possession Goals, here’s a scatterplot which incorporates all four xPG variants.

xpg mls.png

Offensive Success on the x-axis is defined as the ratio of Successful xPG to Positive xPG (the ability of a team to convert offensive possessions into shots). Tidiness on the y-axis is defined as the ratio of Mistake xPG to Negative xPG (the negative value of their turnovers). Note that we flipped the y-axis, so Tidiness increases as you go up.
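To make the two axes concrete, here is a small sketch that computes both ratios for a handful of teams from the table above. Taking the ratios directly from the per-game averages is an assumption (the plotted values may have been computed differently), and the function and key names are hypothetical.

```python
# Per-game averages copied from the MLS 2018 table in this article.
TEAM_AVERAGES = {
    "Columbus": {"successful": 1.41, "positive": 5.42, "negative": 10.30, "mistake": 1.81},
    "Montreal": {"successful": 0.94, "positive": 4.43, "negative": 7.83, "mistake": 2.19},
    "New York City FC": {"successful": 1.60, "positive": 6.02, "negative": 11.99, "mistake": 1.67},
    "New York": {"successful": 1.50, "positive": 5.58, "negative": 6.55, "mistake": 1.71},
    "Kansas City": {"successful": 1.72, "positive": 6.61, "negative": 8.78, "mistake": 1.49},
}


def offensive_success(team):
    """Share of Positive xPG that came from possessions ending in a shot."""
    row = TEAM_AVERAGES[team]
    return row["successful"] / row["positive"]


def mistake_ratio(team):
    """Mistake xPG per unit of Negative xPG; a lower ratio means a tidier team."""
    row = TEAM_AVERAGES[team]
    return row["mistake"] / row["negative"]


# Tidiest first, matching the flipped y-axis of the scatterplot.
tidiest_first = sorted(TEAM_AVERAGES, key=mistake_ratio)
for team in tidiest_first:
    print(f"{team:18s} {offensive_success(team):.3f} {mistake_ratio(team):.3f}")
```

On these five teams the ordering agrees with the discussion of the plot: NYCFC, Kansas City, and Columbus come out tidiest, while the Red Bulls and Montreal give up more Mistake xPG per unit of Negative xPG.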

As the Columbus Crew video showed, they accumulate a high amount of Negative xPG without making many costly mistakes. In short, they are “tidy” on the ball. They create space to get forward, but they often lack quality in the final ball resulting in average Offensive Success. For the most part, teams with poor underlying metrics this season, such as Montreal and Minnesota, are generally more “untidy” (low on y-axis). The most “tidy” teams generally play out of the back, like NYCFC and Sporting Kansas City, so they have high Negative xPG, but make few mistakes despite possessing so often in their own third like our Columbus example.

As the New York Red Bulls video showed, they sacrifice tidiness for chance creation. Their games can feel chaotic, but they thrive in that chaos, creating turnovers and Mistake xPG for their opponents while their direct play creates a high number of shots and Successful xPG for themselves. They tend to have both lower Positive xPG and lower Negative xPG because they don’t hold onto the ball for long and play the ball as little as possible in risky areas.

Teams in the middle of the graph have had a lot of ups and downs, more downs than ups, with many underachievers here. Real Salt Lake doesn’t make many mistakes but doesn’t do much in the attack either. While this is a view of the entire season, it is also informative to look at xPG metrics from recent games to observe teams’ performance, especially with the upcoming playoff stretch. For example, Matt Doyle wrote in a recent “Rivalry Week” guide, “The Quakes have become much tidier in central midfield since Luis Felipe won the No. 6 spot a month back.” With our Tidiness score, we can quantify this observation. For most of the season, San Jose was much more untidy with the ball than the league average (blue line with red error band). However, as the season has progressed, and with the addition of Felipe and Guram Kashia, they have improved to the point where they are not statistically different from the league average in Untidiness (see overlap in error bands).
 

sanjose.png

Conclusion

We hope to use xPG to measure every action of the game in an organized manner. These new xPG variants (Successful xPG, Negative xPG, and Mistake xPG) work well, along with the original Positive xPG, to measure game state and a possession’s outcome. However, for analyzing the performance of individual players, we still see many issues to address.

xPG takes the possession sequence concept to the extreme: regardless of which step in a chain of events a player participates in, the player gets the same credit as everyone else in the chain. In this sense, it is similar to xGChain, except xPG extends the possession chaining concept to every event in the match and includes possessions that do not end in shots:

Chicago Fire show the usefulness of xPG: https://vimeo.com/286457180

In this possession, Bastian Schweinsteiger made the most important pass, the one that created the space for his teammate to attack. But xG and its derivatives do not record this pass at all, as the possession did not end in a shot. Therefore, we need a system to reward good play like this, hence the equal distribution of xPG for everyone within each possession.

However, we also know that not all events in a possession should receive equal credit. Perhaps a central defender shouldn’t get as much credit as a deep-lying midfielder, as only the latter receives intense pressure from an opponent. The actions of a deep-lying midfielder in a possession are usually more difficult to make than the central defender’s and should be credited as such. Neither xPG nor xGChain/xB allows for that distribution of merit.

Stay tuned to American Soccer Analysis as we boldly go to explore these emerging metrics in their various forms in Expected Possession Goals: The Next Generation - Part 2 and beyond.

We’d like to acknowledge the other work being done on “non-shot” uses of Expected Goals (xG). Non-shot Expected Goals (NSxG) assigns xG values to actions which are not shots or goals, based on the spot where they occur on the pitch and the xG value a shot would receive from the same spot. Specifically, Mark Taylor of The Power of Goals blog recently used a non-shot xG model to evaluate player passing efficiency and other player actions in the English Premier League. But the xPG metric uses “non-shot xG” differently in that it groups the possession together as chains of organized events and not individual chaotic occurrences. We believe there is room for multiple NSxG models, and that each can analyze soccer from different perspectives. American Soccer Analysis already maintains weekly updates on Expected Goal Chains (xGChain) and Expected Buildup Goal Chains (xB or xBuildupGC), derivative xG metrics that evaluate the players who contribute to a shot or goal. The emerging xPG metric takes xG further by evaluating all possessions, regardless of whether the end result is a shot or goal.