Where Will We Find the American Messi?

By Chris Marciniak

“Maybe we can find someone kicking a ball around the streets. Maybe there’s a Messi hiding somewhere here in the States. Who knows?”

In this quote, given to FIFA.com in 2014, U.S. men’s national team coach and technical director Jurgen Klinsmann revealed three things about U.S. Soccer: the ambition to be the best and produce elite talent, the sense that we are overlooking top players in our midst, and the admission that we have no idea whether these players exist. He explained his intention to “look under every rock, in every dusty corner.” He added that there is “definitely talent in the U.S. that’s not being tapped” and that the federation is trying to get their “heads and hands around that.”

The following analysis presents a way to get our heads around the strengths and weaknesses of the U.S. Soccer development system. It was put together for the U.S. Soccer Hackathon with the help of Andrew Koper and Matt Mccluskey. We took the hometown of all youth national team selections and joined that with county-level population and income data from the Census Bureau. We then ran a regression to find the expected number of call-ups for each county given its population and income. Code and data are available here.
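The join step described above can be sketched roughly as follows. This is a minimal illustration, not the code from our repository: the record layout, the function name `join_call_ups_with_census`, and every value in the sample data are hypothetical placeholders, and keeping counties with zero call-ups in the joined table is an assumption about how the regression input might be built.

```python
from collections import Counter

# Hypothetical placeholder records. Names and numbers are illustrative only
# and do not come from the real roster or Census data.
call_ups = [
    {"player": "Player 1", "county": "County A"},
    {"player": "Player 2", "county": "County A"},
    {"player": "Player 3", "county": "County B"},
]

census = {
    "County A": {"population": 1_000_000, "median_income": 70_000},
    "County B": {"population": 250_000, "median_income": 55_000},
    "County C": {"population": 400_000, "median_income": 60_000},
}


def join_call_ups_with_census(call_ups, census):
    """Count call-ups per county and attach population and income.

    Counties with no call-ups get a count of 0 (an assumption) so that a
    regression on the result would also see counties that produced nobody.
    """
    counts = Counter(row["county"] for row in call_ups)
    return [
        {"county": county, "call_ups": counts.get(county, 0), **stats}
        for county, stats in sorted(census.items())
    ]


rows = join_call_ups_with_census(call_ups, census)
for row in rows:
    print(row)
```

Each resulting row holds one county's call-up count next to its population and median income, which is the shape a regression of call-ups on those two variables would need.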

The influence of money on opportunity in American youth soccer has been the subject of much hand-wringing. Rather than condemning the pay-for-play system, we should acknowledge that money helps the development of soccer players. As Soccernomics co-author Simon Kuper put it, one of the biggest misconceptions is that the best players “come from poverty, because that gives them ‘hunger.’ In fact the best players overwhelmingly come from Western Europe (the region with the least poverty).” Areas of relative poverty within rich countries that produce soccer players, such as the Paris suburbs, do so with public investment in fields and futsal courts as well as strong social safety nets.

Examining which regions are producing the bulk of talent for U.S. Soccer can yield lessons both for clubs and for the national federation. We will assume that a region producing more players than expected has better-than-average coaching and development systems. If a county produces fewer players than expected, that could point either to a lack of investment in soccer infrastructure or to the location of the next American soccer star.

Data Collection

We scraped U.S. Youth National Team roster announcements from https://ussoccer.com using the Scrapy framework and parsed them with regular expressions. Small variations in roster text can break our text extraction, so we do not have every roster, but it is reasonable to assume these errors occur at random. Our resulting sample consists of 48 roster announcements from 2015-2018 and contains 475 unique players from 144 counties.
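To give a sense of the parsing step, here is a minimal sketch of extracting players from one roster line with a regular expression. The entry shape "Name (Club; City, State)", the pattern `ENTRY_RE`, the function `parse_roster_line`, and the sample names are all assumptions made for illustration; the real announcements and our actual patterns may differ.

```python
import re

# Assumed entry shape: "Name (Club; City, State)". This is a guess at the
# roster format for illustration, not the pattern used in our scraper.
ENTRY_RE = re.compile(
    r"(?P<name>[^,()]+?)\s*"
    r"\((?P<club>[^;()]+);\s*"
    r"(?P<city>[^,()]+),\s*"
    r"(?P<state>[^()]+?)\)"
)

# Optional leading position label such as "GOALKEEPERS (2):".
PREFIX_RE = re.compile(r"^[A-Z ]+\(\d+\):\s*")


def parse_roster_line(line):
    """Return one dict per player entry found in a roster line."""
    body = PREFIX_RE.sub("", line.strip())
    return [
        {key: value.strip() for key, value in match.groupdict().items()}
        for match in ENTRY_RE.finditer(body)
    ]


# Made-up sample line with placeholder names and clubs.
sample = (
    "GOALKEEPERS (2): Jane Doe (Example FC; Seattle, Wash.), "
    "John Roe (Sample SC; San Diego, Calif.)"
)
players = parse_roster_line(sample)

# A line that departs from the assumed shape yields no players at all.
unparsed = parse_roster_line("Jane Doe - Example FC - Seattle")
```

The second call shows how a small formatting change (dashes instead of parentheses) silently drops an entry, which is the kind of extraction failure described above.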

Our regression shows that both income and population have an effect on the number of U.S. Youth National Team call-ups that a county provides. While these effects are statistically significant, they are small in magnitude. To put our coefficients in perspective, other things equal, a $100,000 increase in median county income increases the number of players expected to be called up by one. Similarly, we would expect a county to have another national team player for every 500,000 people within its borders. The small impact of these variables suggests that other soccer-specific factors have more influence on player development.
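As a back-of-the-envelope illustration of those magnitudes, the sketch below turns the two rounded effects into a marginal-change calculation and computes the "more or fewer than expected" gap used in the regional sections. The linear form, the function names, and the rounding of the coefficients are assumptions; the intercept is not reported here, so the code estimates only changes, not levels. The county figures in the examples are the ones quoted in this article.

```python
def expected_change(delta_income=0.0, delta_population=0.0):
    """Approximate change in expected call-ups, other things equal.

    Uses the rounded effects from the text: about one player per $100,000
    of median county income and one per 500,000 residents.
    """
    return delta_income / 100_000 + delta_population / 500_000


def gap(actual, expected):
    """Players above (positive) or below (negative) the model's expectation."""
    return actual - expected


# A county with $50,000 more median income and 1,000,000 more residents.
change = expected_change(delta_income=50_000, delta_population=1_000_000)

# King County: 18 call-ups, 12 more than expected, so the model expected 6.
king_expected = 18 - 12
king_gap = gap(18, king_expected)

# Cook County: 9 call-ups, 2.53 fewer than expected, so the model expected 11.53.
cook_expected = 9 + 2.53
cook_gap = round(gap(9, cook_expected), 2)
```

Read this way, even a very large income or population difference moves the expectation by only a player or two, which is why gaps as large as King County's point to factors outside the model.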

The Pacific Northwest

pnw.png

The Seattle Sounders show just how much a well-run club can influence the development of top talent. King County lent 18 players to youth national teams, 12 more than expected based on income and population alone. The Portland Timbers are in line with expectation.

California

cali.png

California has talent hotspots in Los Angeles and San Diego counties. San Diego, though without a professional club, has been able to take advantage of the development infrastructure of the Tijuana Xolos. The Los Angeles Galaxy have been ahead of the curve with respect to youth development and signing homegrown players. They were recently joined in the league by LAFC, who will increase competition for recruits. San Bernardino County may be the biggest untapped scouting opportunity for the two clubs, with our model suggesting there are three undiscovered top prospects.

Southwest

sw.png

Phoenix and Las Vegas are two underperforming major metropolitan areas. Both have second-division teams with MLS ambitions. The Phoenix area has been home to Real Salt Lake’s development academy in Casa Grande, and Barcelona has established an American academy in the region. Despite these external attempts to develop the region, Maricopa County has had only six prospects, five fewer than expected. The founding of Las Vegas Lights F.C. in the USL has renewed the market's potential for a first-division club. In terms of youth development, the area produced four fewer national team players than expected.

Texas

texas.png

There is a dearth of talent coming from the greater Houston area. Harris County produced 5.2 fewer players than expected. F.C. Dallas has long been considered the powerhouse of youth development in the U.S. Their talent base is concentrated in Dallas and Denton counties. We would expect more talent to come from Tarrant County, which contains Fort Worth, as well as from Collin County, home of F.C. Dallas’ stadium and development centers in Frisco, Texas. The club has done a great job of developing talent, and they have begun to recruit from other states like Alabama and Arkansas. This map suggests that there may still be undiscovered talent in their backyard.

The Midwest

mw.png

The Midwest is marked by underperforming metropolitan areas. Detroit lacks a professional club in the top two divisions. There has been talk of an expansion team led by Quicken Loans founder and Cavaliers owner Dan Gilbert. In the meantime, lack of investment has shown itself to be a problem for the region. It has produced four fewer prospects than expected.

Chicago’s Cook County has underperformed despite being home to the U.S. Soccer Federation and MLS’s Chicago Fire. Even though it provided nine players in our observation period, that was still 2.53 fewer than expected. While the low performance of Cook County could read as another indictment of the U.S. Soccer Federation, it also means that the federation has a local laboratory in which to experiment and rectify the situation.

The Southeast

se.png

Atlanta United Academy, formerly Georgia United Academy, can take credit for nurturing talent in the region. Fulton and Gwinnett counties combined yielded five more players than expected.

In contrast to other non-MLS markets, Wake County, which contains Raleigh-Durham, produced five more prospects than expected. This could be an area for further study on how to develop players outside of professional settings. It is also worth noting that several top prospects have been poached from the area to join MLS academies in other cities.

Florida

fl.png

Miami should benefit from David Beckham’s incoming Inter Miami Football Club. Co-founder Jorge Mas has said that he wants to “break the pay to play model” by investing in 250 youth academy players. Broward County already produces talent at an above-average rate. Mas and Beckham would do well to scout and develop talent in Miami-Dade. In a geographically similar pattern, Orlando City Football Club is drawing well from Seminole County but less so from Orange County.

The Northeast/Mid-Atlantic

ne.png

The Mid-Atlantic corridor boasts five MLS development academies: D.C. United, Philadelphia Union, New York Red Bulls, New York City Football Club, and the New England Revolution. When it comes to the senior national team, New Jersey and the D.C. area have produced a disproportionate number of players. We can see that the Red Bulls, Union, and United have been successful in establishing themselves in their respective regions. There are, however, regions surrounding these clubs that underperform, such as Long Island and Fairfax County, Virginia. Boston is a particularly weak region for talent development. The Revolution may want to review their development practices.

Conclusion

The challenges facing U.S. soccer are vast but not insurmountable when broken down into small pieces. MLS expansion in Phoenix, Detroit, and Las Vegas could dramatically improve talent development in those cities. Clubs in Boston, Chicago, and Houston may want to review their youth operations. If there is a Messi kicking a ball in the streets, he could be hiding in any one of the suburban counties, like San Bernardino, Fairfax, and Collin, that lie just outside the talent centers that MLS academies scout and cultivate. If the United States is to reach its full potential as a soccer nation, it needs to cut down the economic and social barriers preventing people from accessing the game. Doing so will take investment from professional clubs, non-profits, parents, and volunteers. While the federation has stumbled, the ambition and heart of the community remain intact, pushing forward.