February 24, 2017

Seattle Sounders 2017 Season Preview

February 24, 2017/ Benjamin Harrison

One is tempted - given the Seattle Sounders' dramatic recovery of a seemingly lost 2016 season to seize a playoff berth, and, ultimately, the MLS Cup - to take those last 14 games (plus the playoffs) as the best sign of what the team has to offer in the coming season. But with new acquisitions bolstering the bench, players developing in key positions, others returning from injury, and still others adjusting to the league, the team could easily see improvements over the championship campaign. Designated Player Clint Dempsey was available for only four games of Seattle's stretch run thanks to a heart condition, but is now cleared to play. Brad Evans struggled with injuries throughout the last half of the season. Young starters Jordan Morris and Cristian Roldan are a year older and more experienced. Left back Joevin Jones is entering his physical prime. Even if the Sounders have not put the dire days fully behind them, this is a team that should expect to make the playoffs and contend in the postseason.

March 02, 2016

2016 ASA PREVIEW: VANCOUVER WHITECAPS

March 02, 2016/ Benjamin Harrison

On September 19th, 2015 the Vancouver Whitecaps led the race for the MLS Supporters’ Shield. From then, the team fell victim to an almost-comical trend of league leaders performing like cellar dwellers, collecting five points from their last six games and backing into the playoffs (inasmuch as a second seed can back into anything). Vancouver bowed out of the playoffs on their own turf, losing 2-0 against Portland to follow up on a scoreless draw down south, landing only 5 of 22 shots on target over the two-leg series. At their best, the Whitecaps are a dangerous counterattacking team that overwhelms opposing defenses with an athletic attacking midfield and aggressive passing (note the high total shot ratio of 0.532). At their worst, the team looks much the same… but wastes the ball with poor shot selection and lost possession (note the possession ratio at 0.469, third worst in the league).

2015 in Review

Drew’s 2015 ASA preview called attention to a young and promising attack, but raised questions concerning Vancouver’s defensive strength with a new pair of centerbacks. Ultimately, the Whitecaps defense significantly improved from 2014, ranking second in goals allowed and first in xGA, on the strength of Matias Laba, Kendall Waston, and an outstanding year from goalkeeper David Ousted. Waston and Laba together account for roughly 34-35% of the team’s defensive actions (excluding recoveries and fouls), reflecting the former’s physical dominance (particularly in the air) and the latter’s exceptional activity rate in the defensive midfield. No individual attacker stepped up as a consistent scoring threat across the full season, with streaky production from forward Octavio Rivero and midfielders Kekuta Manneh, Pedro Morales, and Christian Techera.

More on the keepers and defense after the jump.

March 02, 2016

2016 ASA PREVIEW: MONTREAL IMPACT

March 02, 2016/ Benjamin Harrison

The 2016 Montreal Impact will be eager to discover whether they can sustain the late season form that propelled them into their second playoff appearance in MLS. There’s hope in the rumor-defying return of Didier Drogba, who carried the team to a 7-1-1 record in his nine starts (scoring 11 goals) to close 2015. Nevertheless, five of those wins came at home, and three came against Colorado and Chicago. Mauro Biello imposed relatively few changes to the roster in his first offseason as head coach, likely indicating some confidence that the changes made last fall are sustainable.

2015 in review

ASA’s 2015 season preview of the Impact projected a position roughly between the cellar and the last playoff seeds – a fair summation of the team’s performance before Biello took over at the end of August. A defensive overhaul cut 14 goals off 2014’s abysmal total of 58 – third worst in the league – with new arrivals taking charge of the defensive midfield and all four positions along the back line. Laurent Ciman (CB), Marco Donadel (DM), and Ambroise Oyongo (RB) arrived perhaps with the greatest fanfare. 23-year-old Argentinian centerback Victor Cabrera, on loan from River Plate, seized the permanent starting role alongside Ciman in late June, and the Impact allowed only 18 goals in his 16 starts from that point.

Before Drogba’s arrival, Montreal’s offense was susceptible to stagnation, overly reliant on the individual skill of Ignacio Piatti in the attacking midfield. Neither Dominic Oduro (poor in distribution) nor Jack McInerney (terrible at everything else) was able to present a consistent threat from striker. Oyongo produced little to show for his promise as an attacking fullback. Dilly Duka and Andres Romero provided only modest support from the attacking midfield. Despite the defensive improvement, the Impact remained at a negative GD before the late season surge.

A look at the goalkeeper and defense after the jump.

May 04, 2015

"Positions" are a lie.

May 04, 2015/ Benjamin Harrison

By Benjamin Harrison (@NimajnebKH)

The idea of a player “position” is too inflexible.

We know – as fans – that there are more than 11 different types of soccer players. We simply assign them titles which match a variety of on-field roles, and some of those labels fit better than others. A “defensive” midfielder may also be a holding midfielder, is likely a central midfielder, and could even be a deep-lying playmaker. We may use the more nuanced terminology in a basic narrative description of game play – but there is no standard definition for how those roles might translate into measurable events. Soccer analytics is often left with a set of basic positions to categorize play on the field. These are reflected fairly well in the most basic statistics measured by OPTA. Consider a set of 209 players receiving starts over the 2014 season:

The raw data here is collected from whoscored.com. Pass attempts per 90’ accordingly exclude crosses and set pieces. “Defensive actions” are all tackles (successful or not), interceptions, clearances, and blocks. Where deemed useful, I used the position selection option from whoscored (this is an extremely useful tool for reasons that will hopefully become evident over the course of this post) to restrict each player to a dataset that fit into an assembled 11-man lineup (only 11 starters, a potential lineup, were chosen from each team). Although positional differences are apparent in the basic biplot, the accumulation of passes and defensive actions also incorporates aspects of style – the pace of play – which vary considerably by team. To remove team context, I summed up the pass and defense rates by team and converted the axes to share of team actions for the 2014 dataset.
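The conversion from raw rates to share of team actions can be sketched in a few lines. This is a minimal illustration, not the spreadsheet actually used for the plots: the field names (`passes_p90`, `def_actions_p90`), the helper `team_shares`, and every number below are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-90 rates for illustration only; not real whoscored data.
players = [
    {"name": "CB A", "team": "X", "passes_p90": 40.0, "def_actions_p90": 12.0},
    {"name": "CM A", "team": "X", "passes_p90": 50.0, "def_actions_p90": 6.0},
    {"name": "FW A", "team": "X", "passes_p90": 10.0, "def_actions_p90": 2.0},
    {"name": "CB B", "team": "Y", "passes_p90": 20.0, "def_actions_p90": 15.0},
    {"name": "FW B", "team": "Y", "passes_p90": 20.0, "def_actions_p90": 5.0},
]

def team_shares(rows, stat):
    """Return each player's share of their own team's summed rate for `stat`."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["team"]] += r[stat]  # sum the rate across the team's lineup
    return {r["name"]: r[stat] / totals[r["team"]] for r in rows}

pass_share = team_shares(players, "passes_p90")
def_share = team_shares(players, "def_actions_p90")
```

Because each team's shares sum to 1, a high-tempo team and a slow one land on the same scale, which is the point of removing team context.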

We’ll be using the 2015 dataset (raw data collected from whoscored as of April 23rd) through the remainder of this post. These 232 data points have been assembled using a slightly different approach – collecting all player statistics with a cutoff of 270 minutes game time, and normalizing individual numbers to the team average. Players who change positions between games should be expected to blur some position-specific distinctions, but major changes in player role are infrequent enough to be overwhelmed by the general trends. Despite the modest differences in method, the two plots exhibit predictably comparable values – there are a finite number of actions teams can take in a game, and a limited number of general tactical formations used in MLS (and soccer, in general).
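For readers who want to reproduce the 2015 approach, here is a rough sketch of the minutes cutoff and the team-average normalization. It reads "normalizing to the team average" as dividing a player's rate by the mean rate of his qualifying teammates, and treats exactly 270 minutes as qualifying; both are assumptions, and the rows are made up.

```python
from statistics import mean

MIN_MINUTES = 270  # cutoff stated in the post

# Hypothetical rows: (name, team, minutes, passes per 90)
rows = [
    ("P1", "X", 900, 60.0),
    ("P2", "X", 450, 30.0),
    ("P3", "X", 90, 80.0),   # dropped: below the cutoff
    ("P4", "Y", 270, 20.0),
    ("P5", "Y", 800, 40.0),
]

def normalize_to_team_average(rows, min_minutes=MIN_MINUTES):
    """Keep players at or above the cutoff, then divide by their team's mean rate."""
    kept = [r for r in rows if r[2] >= min_minutes]
    by_team = {}
    for _, team, _, rate in kept:
        by_team.setdefault(team, []).append(rate)
    team_avg = {team: mean(rates) for team, rates in by_team.items()}
    return {name: rate / team_avg[team] for name, team, _, rate in kept}

relative = normalize_to_team_average(rows)
```

A value of 1.0 means the player passes at exactly his team's average rate; values above 1.0 mark players the team routes more of its play through.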

The modified plot clarifies how the team uses the particular player as a share of its overall play. When the plot is constrained to a team-specific lineup, it can be a useful tool for visualizing average tactical setup, changes between seasons/games, and tactical adjustments to game state (check out the three links for some handy case studies specific to Seattle Sounders play). Positional differences remain apparent, but considerable overlap persists between categories, and their range implies poorly-matched roles. So long as a “midfielder” can have the same share of team actions as both a striker and a central defender, it remains a poor label. Overly broad player categories force the statistical comparison of different player roles having vastly different circumstantial difficulty (see, for example, this study of players with similar attacking midfield roles to Lamar Neagle). Often, difficult behavior is associated with exactly those aspects of play that lead to team success:

“Chances” are defined here as the sum of all assists, key passes, and shots. Offensive “touches” are the sum of basic passes, cross attempts, and shots. Evaluating player performance with skill-dependent statistics depends upon a thorough assessment of player behavior. We need player typing to be as diverse as on-field roles, and as indifferent to nominal “position” as possible. The statistics used to characterize type should be characteristic of role and as far removed as possible from player quality/skill (e.g., shooting rate should discriminate attacking players, but the ability to generate shots is descriptive of quality, so it is not useful as a role-dependent statistic). Finally, we shouldn't use so many statistics in constructing a model of roles that the result becomes overfit to specific players or contains redundancy (e.g. including two different types of basic passing rates – say, short passes and long passes – would exaggerate role difference specific to distribution).
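The two definitions above translate directly into code. The sketch below also divides one by the other to get chances per touch; that ratio is my illustration of how the two might be compared, not a statistic defined in this post, and the sample counts are invented.

```python
def chances(assists, key_passes, shots):
    """Chances as defined in the post: assists + key passes + shots."""
    return assists + key_passes + shots

def offensive_touches(passes, crosses, shots):
    """Offensive touches as defined in the post: basic passes + cross attempts + shots."""
    return passes + crosses + shots

def chances_per_touch(p):
    """Assumed ratio for illustration; returns 0.0 when a player has no touches."""
    touches = offensive_touches(p["passes"], p["crosses"], p["shots"])
    if touches == 0:
        return 0.0
    return chances(p["assists"], p["key_passes"], p["shots"]) / touches

# Hypothetical season counts for one attacking midfielder.
sample = {"assists": 2, "key_passes": 10, "shots": 18, "passes": 250, "crosses": 32}
sample_rate = chances_per_touch(sample)
```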

For now, with the 2015 dataset, I assessed pass and defense share as described above. Goalkeepers have been excluded (it is interesting to include them in team analysis, but their position label is relatively effective). I also calculated and recorded dribbles/touch (measuring attacking style on the ball) and crosses per touch (wide vs. central play). I then relativized each of these four role indices to its 210-player maximum and performed a hierarchical cluster analysis on the resulting data matrix:
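As a rough sketch of the preparation step, the snippet below scales each of the four indices by its maximum across players and builds the pairwise distance matrix a hierarchical clustering would start from. Euclidean distance is an assumption (the post does not name a metric), and the three players and their values are hypothetical.

```python
import math

INDICES = ("pass_share", "def_share", "dribbles_per_touch", "crosses_per_touch")

# Hypothetical index values for three outfield players.
players = {
    "Center back": {"pass_share": 0.12, "def_share": 0.20,
                    "dribbles_per_touch": 0.005, "crosses_per_touch": 0.002},
    "Winger": {"pass_share": 0.06, "def_share": 0.05,
               "dribbles_per_touch": 0.050, "crosses_per_touch": 0.080},
    "Striker": {"pass_share": 0.04, "def_share": 0.02,
                "dribbles_per_touch": 0.030, "crosses_per_touch": 0.010},
}

def relativize(table):
    """Divide each index by its maximum over all players, so every index tops out at 1."""
    maxima = {k: max(row[k] for row in table.values()) for k in INDICES}
    return {name: [row[k] / maxima[k] for k in INDICES] for name, row in table.items()}

def distance_matrix(scaled):
    """Pairwise Euclidean distances (assumed metric) between scaled player profiles."""
    names = sorted(scaled)
    return names, [[math.dist(scaled[a], scaled[b]) for b in names] for a in names]

scaled = relativize(players)
names, dist = distance_matrix(scaled)
```

Scaling to the maximum keeps an index with large raw values (pass share) from swamping one with small raw values (crosses per touch) when distances are computed.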

I chose a position for pruning the tree (dashed line) that identifies 15 discrete player clusters, grouped by role similarity across the four indices (this step is arbitrary for now, but will be automated in the future). Alongside each, I’ve roughly characterized the differences picked up in the analysis on a scale of --- (well-below average) to 0 (average) to +++ (well-above). Notice, if we move the cutoff line to the left to define only 3 groups, these would be primary defenders at the top, wide players in the middle, and central attackers at the bottom. Running a principal components analysis on the same dataset, let’s take a look at the differences between nominal position and cluster identity on the first two axes of variation.
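Pruning the tree at a chosen number of groups amounts to merging the closest clusters until that many remain. The sketch below does this with average linkage, which is an assumption: the post does not say which linkage rule was used. The six four-index profiles are invented to mimic defenders, wide players, and central attackers.

```python
import math
from itertools import combinations

def average_linkage(a, b, points):
    """Mean Euclidean distance between every member of cluster a and cluster b."""
    return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until only k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters

# Hypothetical scaled profiles: pass share, defense share, dribbles/touch, crosses/touch.
profiles = [
    [1.0, 0.9, 0.1, 0.0],  # defender-like
    [0.9, 1.0, 0.1, 0.1],
    [0.4, 0.3, 0.9, 1.0],  # wide player
    [0.5, 0.3, 1.0, 0.9],
    [0.3, 0.1, 0.6, 0.1],  # central attacker
    [0.2, 0.1, 0.5, 0.2],
]

three_groups = agglomerate(profiles, 3)
two_groups = agglomerate(profiles, 2)
```

Asking for fewer groups is the same as moving the cutoff line toward the root of the tree. For a dataset of a few hundred players, a library routine would be a better choice than this brute-force loop.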

The overlap problem with position is considerably reduced (though not absent) with cluster identity. To be useful, the cluster identities must also exhibit superior discrimination of role difficulty. Short pass accuracy is a skill-dependent statistic, but highly variable depending on situation:

Here the short pass accuracy by position is compared to that by cluster (cluster 11 is excluded, since it is simply Fabian Castillo – the point guard man who never encountered a ball he didn't want to dribble past an opponent). Many clusters exhibit a substantially tighter range of values than their position counterparts – remember that these categories have not been defined by any values that explicitly measure skill or quality. Within clusters (or between closely related clusters) players should show similar statistical performance unless otherwise influenced by skill (as shown with the previously linked example concerning Neagle). No matter how well we characterize situational difficulty (e.g. how far from goal a shot is taken, or the direction, location and length of a pass), constraining the performance of peers provides a more complete characterization of expected result.
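One simple way to check whether a grouping is "tighter" is to compare the spread of short pass accuracy within each group under both labelings. The sketch below uses the max-minus-min range as the spread measure, which is my choice for illustration; the five players, their labels, and their accuracies are hypothetical.

```python
def spread_by_group(rows, key):
    """Range (max - min) of short pass accuracy within each group defined by `key`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r["short_pass_acc"])
    return {g: max(vals) - min(vals) for g, vals in groups.items()}

# Hypothetical players: four nominal midfielders split across two role clusters.
rows = [
    {"name": "A", "position": "M", "cluster": 3, "short_pass_acc": 0.90},
    {"name": "B", "position": "M", "cluster": 3, "short_pass_acc": 0.88},
    {"name": "C", "position": "M", "cluster": 9, "short_pass_acc": 0.75},
    {"name": "D", "position": "M", "cluster": 9, "short_pass_acc": 0.78},
    {"name": "E", "position": "F", "cluster": 9, "short_pass_acc": 0.74},
]

by_position = spread_by_group(rows, "position")
by_cluster = spread_by_group(rows, "cluster")
```

In this toy data the "M" label spans 15 points of accuracy, while neither cluster spans more than 4, which is the kind of tightening the comparison is looking for.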

Providing context for player evaluation is only part of the value of this approach. The performance of individual players is strongly controlled by myriad factors even beyond team and role context. Grouping similar players may allow us to address questions that would be otherwise complicated by sample size. Take, for example, the question of whether any player can be considered to overperform or underperform expected goals.

If a style-specific skill in finishing exists, the grouping of similar players – with the resulting increase in sample size – might allow its detection more readily than would be the case measuring goal records for an individual player subject to seasonal noise, team context, and age-related development trends. However, the modest differences between xG and G in the data above should probably be considered a vindication of the model, if anything. Attackers with substantially different on-field roles and shot selection still exhibit predicted finishing success. Still, this approach may warrant further testing in the future with more refined role discrimination and a larger dataset.
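Pooling goals and expected goals by cluster is straightforward once each shooter carries a cluster label. A minimal sketch follows, with invented cluster numbers, goal counts, and xG values; the xG model itself is taken as given.

```python
from collections import defaultdict

# Hypothetical shooters: (cluster id, goals, expected goals)
shooters = [
    (12, 3, 2.4),
    (12, 1, 1.9),
    (14, 5, 4.1),
    (14, 2, 2.6),
]

def finishing_by_cluster(rows):
    """Sum goals and xG per cluster and report the difference G - xG."""
    totals = defaultdict(lambda: [0, 0.0])
    for cluster, goals, xg in rows:
        totals[cluster][0] += goals
        totals[cluster][1] += xg
    return {
        c: {"goals": g, "xg": x, "g_minus_xg": g - x}
        for c, (g, x) in totals.items()
    }

summary = finishing_by_cluster(shooters)
```

A cluster-level difference near zero is what a well-calibrated xG model predicts; a persistent gap for one cluster would be the signal of a style-specific finishing skill.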

The four-index model above warrants more work. Some player groups are very effective, but others clearly could benefit from different weighting prior to clustering and/or additional indices. Take, for example, cluster 15 which mainly incorporates central attacking players with fairly average pass share. The cluster also picked up Vancouver CB Pa Modou Kah, who has exhibited abnormally low pass and defense shares for his role so far in 2015. The present dataset may also suffer from limited sample size (any set of a few games may lead to some very unusual game states and corresponding performance). Nevertheless, preliminary work suggests player typing may be a useful analytical tool.

American Soccer Analysis