Prediction versus Explanation

There is a subtle but important distinction between explanation and prediction in most sports, and Major League Soccer is no different. I don’t intend to make this long or particularly math-heavy, so hang on. Here’s a simple example of what I mean by explanation. In their first six games of the season, the Portland Timbers recorded 89 attempts and allowed just 57 to their opponents. Over that same stretch, Portland scored ten goals while allowing eight. I might explain that the Timbers’ +2 goal differential was due, at least in part, to earning more offensive opportunities than their opponents.

Here’s another example, this time with regard to prediction. In its first six games, the New England Revolution scored two goals while allowing six. During its next six games, New England scored eight goals while allowing just three. Using just New England as an example, it would seem that goal differential in the past (-4) poorly predicted goal differential in the future (+5).
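For anyone who wants to check the arithmetic, here is a minimal sketch of the differentials quoted above. The numbers come straight from the two examples; the `Stretch` class and the variable names are just made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stretch:
    """Totals for one team over a stretch of games (hypothetical helper)."""
    made: int     # goals or attempts for
    allowed: int  # goals or attempts against

    @property
    def differential(self) -> int:
        # Positive means the team out-produced its opponents.
        return self.made - self.allowed


# Figures quoted in the post (first six games unless noted).
timbers_attempts = Stretch(made=89, allowed=57)
timbers_goals = Stretch(made=10, allowed=8)
revs_first_six = Stretch(made=2, allowed=6)
revs_next_six = Stretch(made=8, allowed=3)

print(timbers_attempts.differential)  # 32
print(timbers_goals.differential)     # 2
print(revs_first_six.differential)    # -4
print(revs_next_six.differential)     # 5
```

Explanation compares two numbers from the same stretch (Portland's attempts and goals), while prediction compares one stretch against the next (New England's first six against its next six).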

Of course, the league has nineteen teams, not two, so I looked across all nineteen for patterns. Here is what I found.

A team’s goal differential during its first six games explained its total points over that same time period extremely well (R² was 77%). This is not surprising: teams that score more goals than their opponents also tend to win more games.
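If R² is unfamiliar, here is a rough sketch of how it can be computed for a simple one-variable fit, where it equals the squared Pearson correlation. The function names are hypothetical, and the sample numbers are invented purely to show the call; they are not the 2013 MLS data behind the 77% figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of variance in ys explained by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2


# Invented toy values, NOT real team data: goal differential and points
# for five imaginary teams over the same six games.
toy_goal_diff = [-4, -2, 0, 1, 3]
toy_points = [3, 5, 8, 9, 13]

print(f"R^2 = {r_squared(toy_goal_diff, toy_points):.2f}")
```

An R² of 77% means roughly three quarters of the team-to-team variation in points lines up with goal differential over the same games.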

However, a team’s goal differential in the first six games of the season provided virtually no help in predicting its total points over the next six games. Here’s the plot on that one:

[Figure: GD vs. Future Points - 6 weeks 2013]

There is virtually no relationship between a team’s goal differential in its first six games and the points it earned in its next six. In other words, goal differentials are not predictive over six games.

But if you’re convinced the lack of predictive ability is completely due to a small sample size of twelve total games, check this out. A team’s attempts differential in its first six games shows a statistically significant correlation with both its future goal differential and its future points earned:

[Figure: AD vs. GD and AD vs. Pts]

Because it’s sports, prediction is never going to be precise, and these aren’t perfect correlations at all. But I find it particularly impressive that over just twelve total games, the attempts data from a team’s first six games is a statistically significant predictor of the team’s results in the next six games.
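To give a feel for what "statistically significant" demands with only nineteen teams, here is a sketch of one common check, the t-test on a Pearson correlation. This is an assumption for illustration: the test and the 5% level are my choices here, not necessarily what produced the results above, and the function names are hypothetical.

```python
from math import sqrt

N_TEAMS = 19         # teams in the league, per the post
T_CRIT_DF17 = 2.110  # two-sided 5% critical t for 17 degrees of freedom,
                     # taken from a standard t-table (assumed level)


def correlation_t_stat(r, n):
    """t-statistic for testing whether a Pearson correlation r is zero."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r * r))


def min_significant_r(t_crit, n):
    """Smallest |r| whose t-statistic reaches t_crit with n observations."""
    df = n - 2
    return t_crit / sqrt(t_crit ** 2 + df)


threshold = min_significant_r(T_CRIT_DF17, N_TEAMS)
print(round(threshold, 3))  # about 0.456
```

Under those assumptions, a correlation across nineteen teams has to exceed roughly 0.46 in absolute value to clear the 5% bar, which is why a significant result from so few games stands out.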

If you’ve listened to the Game-of-the-Week section of our podcasts, you’ve heard us talk a lot about shot ratios. Hopefully this post clarifies why we do that: past shot ratios are better than past results at predicting future results.