The Replication Project: Is xG The Best Predictor of Future Results?

This is the first article in what we are calling The Replication Project, where we take an important soccer analytics finding from yesteryear and see if it still holds up with modern data. Sometimes this is a straightforward replication, but it can also lead down some rabbit holes, as you will find in this first installment, where we test whether the claim that xG is the best predictor of future performance still holds up.
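One common way to frame a test like this is to split a season in two, measure each candidate metric in the early part, and see how much of the later results it explains. The sketch below is only an illustration of that idea, not the method used in the article: the team numbers are made up, and the names `pearson_r` and `r_squared_by_metric` are hypothetical helpers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def r_squared_by_metric(early_metrics, later_outcome):
    """R^2 of each early-season metric against a later-season outcome."""
    return {
        name: pearson_r(values, later_outcome) ** 2
        for name, values in early_metrics.items()
    }


# Made-up per-game numbers for six imaginary teams (NOT real data).
early_metrics = {
    "goal_difference": [0.9, 0.4, 0.1, -0.2, -0.5, -0.7],
    "xg_difference": [0.6, 0.5, 0.0, 0.1, -0.4, -0.8],
}
later_goal_difference = [0.5, 0.6, -0.1, 0.2, -0.3, -0.9]

for name, r2 in r_squared_by_metric(early_metrics, later_goal_difference).items():
    print(f"{name}: R^2 = {r2:.2f}")
```

With real data, the metric with the higher R² against later results would be the better predictor under this particular framing; sample size and the choice of split point both matter.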

Read More

Where Goals Come From: Using past goals to create future goals


The outline for this article is going to be:

  • If you’ve heard about or looked at xG in the past but either 1) didn't see its utility or 2) didn't know how to make it useful, we want to help with these scenarios in this article and upcoming articles.

  • xG is always improving, so regardless of what you saw or read about a few years ago, it is much better now at evaluating individual shots thanks to more and better data.

  • Not all xG values from various sources are equal because there is not equal access to the data points and data volume, and because data providers, clubs, and analysts have varying ideas on how to value shots and optimize their models.

  • There are other stats and metrics that are not talked about as much as xG but can be very useful alongside it. Some may be better suited to your audience.

  • xG helps us answer the quality question about a shot, and we'll be talking about improving shot quality utilizing xG and other tools throughout this season. Without xG, shot quality becomes highly subjective and experiential.

Read More

Coaches Reward Goalscorers. But Should They?


On March 30, 2019, the 16-year-old midfielder Gianluca Busio came on for Sporting Kansas City in a rout of Montreal. He didn’t do a whole lot in his half hour on the pitch—seven of his eight completed passes went backwards—but in the 78th minute he poked the ball away from a center back and slotted home his team’s sixth goal. The next week Busio was rewarded with a full 90 minutes and he scored again. The week after that, another appearance, a third straight goal. Coach Peter Vermes was sticking with the red-hot kid and it was paying off.

Alas, not all breakthroughs go as smoothly as Busio’s. On July 17, a teenage striker named Theo Bair earned his second career start for Vancouver. He made a couple of promising runs where he held off a New England defender and found a shot from a low cross, but neither chance connected. The first hit the far post and ricocheted out. Two minutes later, Bair reached back for a bouncing pass at the top of the six-yard box but couldn’t quite corral it. The shot sailed over the crossbar from embarrassingly close range and Bair tumbled head over heels into the goal, where he slapped the grass in frustration. He was subbed off, and next game he only appeared for the last 14 minutes.

Read More

There's Something "a miss" in Wondo's Legacy

Christopher Wondolowski should be an American sports icon. He should be beloved and admired. If he is hated by anyone, it should be by MLS fans in the same way Indianapolis Colts fans “hate” Tom Brady. He is the underdog of underdogs – the working class man who beats the talented elite at their own game. At 36, he keeps breaking scoring records in MLS, including setting the all-time big one a few weeks ago with a four-goal match. He is on the precipice of being the first player to score 10+ goals in 10 straight MLS seasons. His time and opportunity with the US Men’s National Team should have been longer than it was – but for many fans, there would be no cry for Wondolowski’s return to the national team. No matter how many goals he scored or how often his league form was more impressive than the strikers getting the call, his national team legacy was cemented. Outside of a few San Jose Earthquakes fans and pundits, there are no calls for “Wondo” to be on the team by the American soccer public because of one infamous situation that occurred on July 1, 2014.

Read More

Expected Narratives: Have Some Ambition


Narrative: Ambition Rankings

If there is one day on the MLS calendar that I dread with a clarity and purity often seen only in very expensive diamonds (let’s call them “diamonds of ambition”), it’s Grant Wahl’s annual musings on which MLS teams have proven their ambition the most. For those unaware, every year our nation’s preeminent soccer scribe sends out a questionnaire to every MLS team asking them to flex their financial bona fides and then ranks them according to how expensive their DPs are, whether or not they get good crowds, and that “it” factor that you can’t explain but Grant knows it when he sees it. Unsurprisingly, Atlanta tops this year’s list and Colorado brings up the rear, but the middle is just gluttonously full of incisive takes. “We’ve invested 10 million dollars in our academy,” says one team. “Oh yeah? Well, WE expanded our stadium, so suck it,” says another. “Tell me more,” says Grant Wahl, and we’re left with a bunch of people squabbling over whether Jan Gregus or Pedro Santos is a more ambitious signing.

Read More

Reep Revisited


I recently created a decent set of MLS possession data while working on another project, and I was curious whether the patterns of the famous Reep analysis would hold for MLS. So I attempted to replicate his result and perhaps offer a couple of new perspectives on the data.

I was first introduced to the legacy of Charles Reep while reading The Numbers Game (by Chris Anderson & David Sally). Reep was an early advocate for applying statistics to soccer, and was famous for tracking game events by hand over many seasons. According to his data, most goals were scored from possessions with three passes or fewer. This was taken as empirical justification to play directly, minimizing touches and using longer passes in order to improve results.

Although Reep’s status as a pioneer in the sport is secure, many still debate the results and their interpretation. Some critiques assert that the underlying data was misinterpreted: highlighting a simple majority of goals may not be the best analysis when most possessions had three or fewer passes anyway. Others suggest the structure of the analysis confuses correlation with causation, leading to misapplication of the results. In short, one can’t tell whether the results were caused by the number of passes or whether other factors have causal roles. As I attempt to recreate the analysis, it’s worth stating that the same criticisms apply to this replication effort as well.
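The first critique is easier to see with numbers. The sketch below uses made-up possessions (not real MLS data, and not Reep's) and a hypothetical helper, `summarize_by_pass_count`, to contrast two figures: the share of all goals that come from short possessions, and the goal rate per possession within each group.

```python
def summarize_by_pass_count(possessions, cutoff=3):
    """Split possessions at `cutoff` passes and report, for each group,
    its share of all goals and its goals-per-possession rate."""
    counts = {"short": [0, 0], "long": [0, 0]}  # [possessions, goals]
    for passes, scored in possessions:
        key = "short" if passes <= cutoff else "long"
        counts[key][0] += 1
        counts[key][1] += int(scored)
    total_goals = counts["short"][1] + counts["long"][1]
    return {
        key: {
            "goal_share": goals / total_goals if total_goals else 0.0,
            "goal_rate": goals / n if n else 0.0,
        }
        for key, (n, goals) in counts.items()
    }


# Made-up data: (passes in the possession, did it end in a goal?).
# 900 short possessions with 9 goals, 100 long possessions with 3 goals.
toy_possessions = (
    [(2, False)] * 891 + [(2, True)] * 9
    + [(6, False)] * 97 + [(6, True)] * 3
)

for group, stats in summarize_by_pass_count(toy_possessions).items():
    print(f"{group}: {stats['goal_share']:.0%} of goals, "
          f"{stats['goal_rate']:.1%} of possessions end in a goal")
```

In this toy set, short possessions produce three quarters of the goals simply because there are nine times as many of them, while each long possession is three times as likely to end in a goal. Neither figure says anything about causation.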

Read More

Model Update: Coefficient Blending


With our most recent app update, you might notice that some numbers in the xGoals tables have changed for past years where it wouldn’t normally make sense to see changes. As an example, Josef Martinez had 29.2 xG in 2018, but the updated app shows 28.7 (-1.7%). No, this is not an Atlanta effect, though I can understand why you might support such an effect. Gyasi Zardes lost 0.5 xG as well (-2.4%), and no one dislikes Columbus.

We have updated our xGoal models with the 2018 season’s data, and that is the culprit of all the discrepancies since the last version of the app. I have already cited the largest two discrepancies by magnitude, so this isn’t some major overhaul of the model. In fact, only 2018’s xG values have been materially adjusted.* The new model estimated 35.6 fewer xGoals in 2018 than it did before, equivalent to a 2.8% drop.
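For anyone who wants to check the percentages, here is a quick back-of-the-envelope sketch. `pct_change` is a hypothetical helper, and the league-wide total is only implied by the rounded figures quoted here, so treat it as approximate.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Josef Martinez, 2018: 29.2 xG before the update, 28.7 after.
martinez_change = pct_change(29.2, 28.7)
print(f"Martinez: {martinez_change:.1f}%")

# A drop of 35.6 xGoals described as 2.8% implies a previous 2018 total
# of roughly 35.6 / 0.028. Both inputs are rounded, so this is approximate.
implied_previous_total = 35.6 / 0.028
print(f"Implied previous 2018 total: about {implied_previous_total:.0f} xGoals")
```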

Read More

A Tale of Two Central Defensive Midfielders

Michael Bradley and Wil Trapp share several obvious qualities. They are both captains for club and country. They are both smooth passing defensive midfielders, and they both possess excellent heads of hair. Another similarity is that they rarely shoot or score goals, each collecting only one goal over the last three seasons. Coincidentally, both of those goals are what we could enthusiastically describe as "wonder-goals." Bradley's was a long-distance chip for the US national team in a World Cup qualifier against Mexico at the Azteca (a goal not remembered as fondly as it deserves due to the rest of qualifying), and Trapp's won a match for the Crew in stoppage time against Orlando City this past summer. However, one difference between these two players was how each responded to the confidence boost that came after scoring a once-in-a-career goal.

Read More

You Down with t-SNE?


We all know that some teams play a certain style: Red Bulls play with high pressure and direct attacks, Vancouver crosses the ball, and Columbus possesses the ball from the back. Although we know these things intuitively, we can use analytical methods to group teams as well. Doing so seems unnecessary when we have all these descriptors like press-resistance, overload, trequartista-shadow striker hybrid, gegenthrowins, mobile regista, releasing, Colorado Countercounter gambits, etc. (we actually don’t know what some of these terms mean and may have made some up, but the real ones are popular so just google them yourself). Those terms are nice, but no qualitative descriptor can tell us how the styles of New York City and Columbus differ from each other. We need to measure, compare, and model two teams’ playing styles and efficiencies. If we are able to do these things, we may be in a position to answer what style really is.

Read More