Introducing itscalledsoccer
By Brian Greenwood and Tyler Richardett
For the latter half of 2021, American Soccer Analysis has been working on an R and Python library that makes it easier to interact with our data programmatically. Today, we are happy to announce the release of the library, which we've dubbed itscalledsoccer. If you want to get started right away, the library is available for download from CRAN and PyPI, and the source code is available on GitHub.
In this article, we'll talk a bit about how we built the library and walk through a basic example or two.
Writing A Library
At a high level, itscalledsoccer is a wrapper around the American Soccer Analysis API that powers our interactive tables. Many of the library's functions are simply API calls with sensible defaults. When initially designing the library, we considered writing the core functionality in C++ and having both versions call the shared C++ code directly, much as xgboost does. But because we don't have much C++ expertise, we decided to maintain separate codebases for the R and Python versions. In essence, the two versions are separate entities that we keep in sync manually. Both versions have caching enabled, so the results of repeated calls are stored locally, which speeds up subsequent calls and reduces load on the API. The API is rate limited, and the library handles this gracefully by executing large queries in batches.
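To make the caching and batching idea concrete, here is a minimal Python sketch. It is not the library's actual implementation: the class name CachedBatchClient, the fake_fetch function, and the batch size of 3 are all hypothetical stand-ins chosen for illustration, and the real library's internals may look quite different.

```python
from typing import Callable, Dict, Iterable, List


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CachedBatchClient:
    """Hypothetical wrapper: caches per-ID results and fetches misses in batches."""

    def __init__(self, fetch: Callable[[List[str]], Dict[str, dict]], batch_size: int = 3):
        self._fetch = fetch            # stand-in for a single API request
        self._batch_size = batch_size  # assumed cap on IDs per request
        self._cache: Dict[str, dict] = {}
        self.requests_made = 0         # counts simulated API requests

    def get(self, ids: List[str]) -> List[dict]:
        # Only IDs that are not cached yet need a round trip to the API.
        missing = [i for i in dict.fromkeys(ids) if i not in self._cache]
        for batch in chunked(missing, self._batch_size):
            self._cache.update(self._fetch(batch))
            self.requests_made += 1
        return [self._cache[i] for i in ids]


def fake_fetch(batch: List[str]) -> Dict[str, dict]:
    # Pretend API: returns one record per requested ID.
    return {i: {"id": i, "name": f"player-{i}"} for i in batch}


client = CachedBatchClient(fake_fetch, batch_size=3)
first = client.get(["a", "b", "c", "d", "e"])  # two requests: [a, b, c] and [d, e]
second = client.get(["a", "e", "f"])           # one request: only "f" is new
```

In this sketch, the second call triggers a single request because "a" and "e" are already cached, which is the same trade-off described above: fewer round trips for the caller and less load on the API.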
How To Use itscalledsoccer
# Python
pip install itscalledsoccer
# R
install.packages("itscalledsoccer")
Once the library is installed, you'll want to import it and create a client.
# Python
from itscalledsoccer.client import AmericanSoccerAnalysis
asa_client = AmericanSoccerAnalysis()
# R
library(itscalledsoccer)
asa_client <- AmericanSoccerAnalysis$new()
Once you have a client, querying data can be done by calling any of the get_* functions. For instance, if I wanted to get goals added for a particular Philadelphia Union goalkeeper, I would use the get_goalkeeper_goals_added function. All the library functions return a data frame.
# Python
df = asa_client.get_goalkeeper_goals_added(leagues="mls", player_names="Andre Blake")
# R
df <- asa_client$get_goalkeeper_goals_added(leagues = "mls", player_names = "Andre Blake") %>% tidyr::unnest(data)
team_id | minutes_played | action_type | goals_added_raw | goals_added_above_avg | count_actions | competition
---|---|---|---|---|---|---
9z5k7Yg5A3 | 17805 | Claiming | -0.2881 | -0.2653 | 279 | mls
9z5k7Yg5A3 | 17805 | Fielding | -1.1517 | -0.1366 | 2279 | mls
9z5k7Yg5A3 | 17805 | Handling | -0.8955 | -0.9808 | 566 | mls
9z5k7Yg5A3 | 17805 | Passing | 11.3168 | -0.205 | 5203 | mls
9z5k7Yg5A3 | 17805 | Shotstopping | -1.4373 | 3.0576 | 805 | mls
9z5k7Yg5A3 | 17805 | Sweeping | -0.4462 | -0.1893 | 373 | mls
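As a quick illustration of what you might do with a result like this, the sketch below totals goals added above average across action types and picks out the strongest one. To stay self-contained, it copies the values from the table into plain Python dictionaries rather than calling the API. With the pandas data frame returned by the Python client, the equivalent total would be something like df["goals_added_above_avg"].sum().

```python
# Values copied by hand from the goals_added_above_avg column of the table above.
rows = [
    {"action_type": "Claiming", "goals_added_above_avg": -0.2653},
    {"action_type": "Fielding", "goals_added_above_avg": -0.1366},
    {"action_type": "Handling", "goals_added_above_avg": -0.9808},
    {"action_type": "Passing", "goals_added_above_avg": -0.205},
    {"action_type": "Shotstopping", "goals_added_above_avg": 3.0576},
    {"action_type": "Sweeping", "goals_added_above_avg": -0.1893},
]

# Sum across all action types, rounded to the table's four decimal places.
total = round(sum(r["goals_added_above_avg"] for r in rows), 4)

# The action type with the largest contribution above average.
best = max(rows, key=lambda r: r["goals_added_above_avg"])["action_type"]

print(f"Total goals added above average: {total}")  # 1.2806
print(f"Strongest action type: {best}")             # Shotstopping
```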
If I wanted to get demographic data of all past and present NWSL managers, I would use the get_managers function.
# Python
df = asa_client.get_managers(leagues="nwsl")
# R
df <- asa_client$get_managers(leagues = "nwsl")
manager_name | nationality
---|---
Amy LePeilbet | USA
Becky Burleigh | USA
Christy Holly | Northern Ireland
Craig Harrington | England
David Hodgson |
Denise Reddy | USA
… | …
Looking Ahead
In the interest of openness (and of getting others to do our work for us), we’d like to highlight that all of the library’s source code is publicly available on GitHub. If you’d like to report a bug or request a new feature, please open an issue. If you’d like to contribute code, we’ve left instructions there as well. Our application and API run on modest compute resources, so we ask that you be mindful when using the library.
We plan to add new features over time, hopefully starting with a CLI. We hope that itscalledsoccer provides another useful way for folks to interact with our data. If you do make something with the library, we just ask that you give us proper credit. We would also love for you to share it with us!