eddwebster/football_analytics
A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster), with links to publicly available resources in the football analytics community.
repo name | eddwebster/football_analytics |
repo link | https://github.com/eddwebster/football_analytics |
homepage | |
language | Jupyter Notebook |
size (curr.) | 555151 kB |
stars (curr.) | 90 |
created | 2020-09-01 |
license | |
Edd Webster Football Analytics
This repository is a public space for the football analytics projects by Edd Webster and a list of publicly available resources published by the football analytics community.
I am currently rewriting this README to include links not only to my own work, but also to include a concise list of learning resources, data sources, libraries, papers, blogs, podcasts, etc., created by all those that have made contributions to the football analytics community. This is currently in progress and could still do with a bit of editing, but most of the content is now available below. If you can think of any resources that I’ve missed, feel free to create a pull request or send me a message. Credits to the Soccer Analytics Handbook by Devin Pleuler, the Awesome Soccer Analytics by Matias Mascioto, and Jan Van Haaren’s Soccer Analytics 2020 Review, which were all used to plug gaps in the list once it was published.
If you like the repo, please feel free to give it a :star: (top right). Cheers!
:wave: About This Repository and Author
Please note, all the work produced in this repository is mine and/or credited to the publicly produced code, data, and/or libraries used, and is in no way related to the work and analysis I produce for my employers.
For more information about this repository and the author, I’m available through all the following channels:
- eddwebster.com;
- edd.j.webster@gmail.com;
- @eddwebster;
- linkedin.com/in/eddwebster;
- github/eddwebster; and
- public.tableau.com/profile/edd.webster.
:clipboard: Contents:
:notebook_with_decorative_cover: Notebooks
For code, see the notebooks subfolder, in which the workflow is divided into the following:
- Webscraping;
- Data Parsing;
- Data Engineering;
- Machine Learning; and
- Data Analysis - projects include working with Tracking data, constructing VAEP models (as introduced by SciSports), building xG models using Logistic Regression, Decision Trees and XGBoost, and analysing player similarity using PCA and Factor Analysis.
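As a rough illustration of the logistic-regression xG approach mentioned above, the sketch below derives two common shot features (distance to goal and the angle subtended by the goalposts) and passes them through a logistic function. It is not the code from the notebooks: the pitch dimensions (105 x 68 metres), the function names `shot_features` and `xg`, and the coefficients are all assumptions chosen for illustration, and the coefficients are placeholders rather than fitted values.

```python
import math

# Assumed dimensions in metres: 105 x 68 pitch, 7.32 m wide goal.
PITCH_LENGTH = 105.0
PITCH_WIDTH = 68.0
GOAL_WIDTH = 7.32


def shot_features(x, y):
    """Return (distance, angle) for a shot at (x, y), attacking the goal at x = PITCH_LENGTH."""
    dx = PITCH_LENGTH - x          # distance to the goal line along the pitch
    dy = y - PITCH_WIDTH / 2       # lateral offset from the centre of the goal
    distance = math.hypot(dx, dy)
    # Angle between the lines from the shot location to each goalpost.
    to_far_post = math.atan2(dy + GOAL_WIDTH / 2, dx)
    to_near_post = math.atan2(dy - GOAL_WIDTH / 2, dx)
    angle = abs(to_far_post - to_near_post)
    return distance, angle


def xg(x, y, intercept=-1.0, b_distance=-0.1, b_angle=1.5):
    """Logistic-regression style xG. The coefficients are hypothetical placeholders."""
    distance, angle = shot_features(x, y)
    z = intercept + b_distance * distance + b_angle * angle
    return 1.0 / (1.0 + math.exp(-z))
```

In practice the coefficients would be estimated from labelled shot data (for example with scikit-learn, which is among the libraries used in this repository) rather than set by hand.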
:bar_chart: Data Visualisation and Tableau
For Tableau dashboards produced using the data engineered in the notebooks in this repository, please see my Tableau Public profile: public.tableau.com/profile/edd.webster.
- WSL dashboards and analysis [link];
- ‘Big 5’ European leagues dashboards and analysis [link];
- EFL dashboards and analysis [link];
- StrataBet Chance dashboards and analysis [link]; and
- Opta #mcfcanalytics dashboards and analysis [link].
:floppy_disk: Data Sources
The following data sources have been used in this repository. Due to GitHub’s 100 MB file size limit, all engineered datasets prepared in this repository have been exported and made publicly available to view and download in Google Drive. Please see the following [link]. However, all code in this repository should enable you to scrape, parse, and engineer the datasets to the format in which I have analysed and visualised the data in this repo.
Data sources featured in this repository include:
- DAVIES estimated player evaluation data by Sam Goldberg and Mike Imburgio for American Soccer Analysis;
- ELO club rankings. See their API [link];
- FIFA 15-21 player rating data scraped from SoFIFA by Stefano Leone;
- KPMG Football Benchmark player valuation data;
- Last Row Tracking-like data by Ricardo Tavares. See the Liverpool Analytics Challenge for which this data was used (winners discussed on #FoT [link]);
- Metrica Sports Sample Tracking and corresponding Event data. For code to work with this data, see the LaurieOnTracking GitHub repo by Laurie Shaw and the corresponding Friends of Tracking tutorials;
- Opta Sports match-by-match aggregated player performance data for the 11/12 season and F24 Event data for an 11/12 match of Manchester City vs. Bolton Wanderers [link] as part of the #mcfcanalytics initiative;
- Signality Tracking data. The password to download the data is not publicly available, but can be found in the Uppsala Mathematical Modelling of Football Slack group [link]. For access, contact Novosom Salvador via Twitter or at rsalvadords@gmail.com, or feel free to contact me. Note that the 2nd half of the Hammarby-Örebro match is incomplete;
- SkillCorner broadcast Tracking Open data;
- StatsBomb Open Event data;
- StatsBomb season-on-season aggregated player performance data scraped via FBref using Parth Athale’s Scrape-FBref-data scraper, which in turn was written using code from Christopher Martin’s repository;
- Stats Perform and Centre Circle Canadian Premier League Event data. See Google Drive [link];
- StrataData from StrataBet Chance shooting data;
- TransferMarkt player bio and fiscal data scraped using the Tyrone Mings Python TransferMarkt webscraper by FCrSTATS (I’ve currently submitted a pull request to fix issues with this library to scrape bio-status data; see my [TransferMarkt scraping notebook] for code with minor fixes to enable the code to run);
- Understat shooting and meta data including player xG values, scraped using the understatr R package. Data also made available by @NdyStats (see pinned tweet of his Twitter account for the latest version) using code created by both him and Mark Wilkins (see Tweet [link]);
- Wyscout Event data for the 17/18 season for the ‘Big 5’ European leagues, Euro 2016 Championship, and 2018 World Cup made available by Luca Pappalardo, Alessio Rossi, and Paolo Cintia. See their paper A public data set of spatio-temporal match events in soccer competitions and the GitHub repo of code made available through Friends of Tracking [link];
- Reference data:
- League-wide xT values from the 2017-18 Premier League season (12x8 grid) by Karun Singh [link]
- EPV grid by Laurie Shaw [link]
- Zones on a pitch for Tableau visualisation by Rob Carroll [link]
- Alphabetic country codes [link]
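To show how a 12x8 xT grid such as the one listed above can be used, here is a minimal sketch that maps a pitch location to a grid cell and computes the xT added by moving the ball between two locations. The function names, the 105 x 68 metre pitch, the row/column orientation (8 rows across the width, 12 columns along the length), and the grid values themselves are all assumptions for illustration; the placeholder grid is not Karun Singh's published data.

```python
def xt_cell(x, y, pitch_length=105.0, pitch_width=68.0, n_cols=12, n_rows=8):
    """Map a location in metres to (row, col) indices of an n_rows x n_cols grid."""
    col = int(x / pitch_length * n_cols)
    row = int(y / pitch_width * n_rows)
    # Clamp so that points on the far touchline or goal line fall in the last cell.
    col = max(0, min(col, n_cols - 1))
    row = max(0, min(row, n_rows - 1))
    return row, col


def xt_value(grid, x, y):
    """Look up the xT value of the cell containing (x, y)."""
    row, col = xt_cell(x, y)
    return grid[row][col]


def xt_added(grid, start, end):
    """xT gained by moving the ball from start to end, each given as an (x, y) tuple."""
    return xt_value(grid, *end) - xt_value(grid, *start)


# Placeholder grid: values simply increase towards the opponent's goal.
# Real values would be loaded from the published JSON file instead.
placeholder_grid = [[(col + 1) / 100 for col in range(12)] for _ in range(8)]
```

For example, `xt_added(placeholder_grid, (10, 34), (100, 34))` gives the difference between the placeholder values of the cells containing the two locations.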
:classical_building: Libraries
The Python libraries used in this repository include:
- NumPy;
- pandas;
- matplotlib;
- Plotly;
- record linkage;
- scikit-learn;
- SciPy; and
- XGBoost.
:bookmark_tabs: Resources
Getting Started with Football Analytics:
- Soccer Analytics 101 by Kevin Minkus;
- An Introduction to Soccer Analytics by John Muller - check out his Newsletter space space space;
- Sports Analytics 101 by Measurables (Brendan Kent);
- Getting into Sports Analytics and Getting into Sports Analytics 2.0 by Sam Gregory;
- Some of the useful resources in Football Analytics by @VenkyReddevil;
- Stat Glossary by Ashwin Raman;
- Football Analytics Glossary by Ashwin Raman and Mark Thompson;
- Languages and Tools to Learn for Sports Analytics by Measurables (Brendan Kent);
- Measurables (Brendan Kent)’s Twitter thread for resources for learning to code in the context of sports analytics [link]; and
- McKay Johns’s Twitter threads for the best resources in football analytics [link] and [link].
:student: Tutorials
Python
- Friends of Tracking YouTube channel [link] and Mathematical Modelling of Football course by Uppsala University [link]. The GitHub repo with all code featured can be found at the following [link]. Lectures of note include:
- Laurie Shaw’s Metrica Sports Tracking data series - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link];
- Lotte Bransen and Jan Van Haaren’s ‘Valuating Actions in Football’ series - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link];
- David Sumpter’s Expected Goals webinars - How to Build An Expected Goals Model 1: Data and Model, How to Build An Expected Goals Model 2: Statistical fitting, and The Ultimate Guide to Expected Goals. See the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit;
- Peter McKeever’s ‘Good practice in data visualisation’ webinar. See the following for code [link];
- Sergio Llana’s step-by-step guide for creating Passing Networks [link];
- Luca Pappalardo and Paolo Cintia’s step-by-step guide to exploring the Wyscout Event data - Video 1 and Video 2. See their paper A public data set of spatio-temporal match events in soccer competitions;
- Soccer Analytics Handbook by Devin Pleuler. See tutorial notebooks (also available in Google Colab): 1. Data Extraction & Transformation, 2. Linear Regression, 3. Logistic Regression, 4. Clustering, 5. Database Population & Querying, 7. Data Visualization, 8. Non-Negative Matrix, 9. Pitch Dominance, 10. Convolutional Neural Networks;
- FC Python tutorials [link];
- DataViz, Python, and matplotlib tutorials by Peter McKeever [link] - I think his website is currently in redevelopment, with many of the old tutorials not currently available (28/02/2021). Check out his revamped How to Draw a Football Pitch tutorial;
- McKay Johns YouTube channel;
- soccer_analytics GitHub repo by CleKraus - a Python project intended as a starting point for analytics projects in soccer;
- Python for Fantasy Football series by Fantasy Futopia (Thomas Whelan). This series covers the basics of working with data in Python, working with APIs and parsing StatsBomb JSON data, scraping data using Beautifulsoup and Selenium, and Machine Learning with scikit-learn and XGBoost. See GitHub repo for all code [link]; and
- Tech how-to: build your own Expected Goals model by Jan Van Haaren and SciSports. See the Bitbucket repository for all code [link].
R
- FCrSTATS tutorials [link];
- Sudarshan ‘Suds’ Gopaladesikan’s R series for Friends of Tracking - Getting Started with R + StatsBomb | Analyzing Squad Rotation & Clustering Passes and creating interactive shot maps - Part 1/3 and Part 2/3 (I believe there is no Part 3 currently). See the following for code [link]; and
- Creating a pass flow graph in R by Abhishek Mishra.
Tableau
For a YouTube playlist of Tableau-football videos and tutorials that I have collated from various sources including the Tableau Football User Group, Rob Carroll, and Tom Goodall, see the following [link].
- Tableau Football User Group [link] - featuring Eva Murray, Oscar Hall, James Smith, Rob Carroll, Tom Goodall, Ravi Mistry, Adam Cook, Hannah Roberts, Chris Baker, Rusty Parker, Ruud van Elk, Johannes Riegger, and Sebastien Coustou;
- Tableau for Sport by Rob Carroll - completely free tutorials for using football data in Tableau, including creating shot maps, pass maps, pass matrices, and xG race-chart timelines. See also his YouTube playlist [link];
- Tom Goodall’s Tactics, Training & Tableau: Football Tableau User Group. Check out his Football Tableau training courses [link]. Check out also, as an unrolled Twitter thread, how he uses Tableau to create an opposition report for Burton vs. Gillingham on 9th January 2021 [link];
- Visually Analysing Direct Set Pieces in Football using StatsBomb Data, R and Tableau by James Smith;
- CJ Mayes’s Tableau blog, with posts including how to make a Radial Tournament Bracket;
- Tableau Tunnel series by Ninad Barbadikar. Check out his Twitter thread [link] and his YouTube channel [link];
- Medium blog posts by Sagnik Das - Tableau Guide #1: Making Shot Maps, Tableau Guide #2: Making Pass Maps, Tableau Guide #3: Convex Hulls, Tableau Guide #4: Football Radars;
- Medium blog posts by Rahul Iyer - Guide to Creating Passing Networks in Tableau, Guide to Creating Pass Sonars in Tableau;
- Creating a Shot Map by James Vaughan;
- How to create Football Pitches/Goals as Backgrounds in Tableau;
- Exporting your pass flow map to Tableau by Abhishek Mishra.
- Tableau Public profiles of note (not exhaustive by any means):
- Ashwin Raman
- Brian Prestidge
- Carlon Carpenter
- CJ Mayes
- Eva Murray
- Foot en Stats
- James Smith
- James Vaughan - see his Twitter thread of projects [link]
- Mark Carey
- Matt Trevillion
- Ninad Barbadikar - see his Tableau Tunnel series
- Oscar Hall
- Paul Riley
- Peter McKeever
- Sathish Prasad V.T
- Sancho Quinn
- Rahul Iyer
- Ravi Mistry
- Rob Carroll
- Rob Suddaby
- Sushruta Nandy
Excel
- Marius Fischer’s Patreon [link]
PowerPoint
- @maramperninety’s Medium post - Yes, Powerpoint: xG Trend Line.
Other Sports
- Twitter thread by Measurables (Brendan Kent) [link]
:floppy_disk: Data Sources
All publicly available data sources and datasets relating to football, including Tracking data, Event data, aggregated player performance data, detailed match statistics, injury records, transfer values, and more.
- Awesome Football: A collection of awesome football (national teams, clubs, match schedules, players, stadiums, etc.) datasets;
- Club Elo - European club rankings;
- Data Hub Football data;
- DAVIES estimated player evaluation data by Sam Goldberg and Mike Imburgio for American Soccer Analysis;
- European Soccer Database - 25k+ matches, players & teams attributes for European Professional Football;
- engsoccerdata - English and European soccer results 1871-2017;
- FBref (data provided by StatsBomb);
- FIFA 15-21 player rating data scraped from SoFIFA by Stefano Leone;
- FiveThirtyEight Club Ranking - Global Club Soccer Rankings. How 637 international club teams compare by Soccer Power Index;
- FiveThirtyEight Soccer Predictions database - football prediction data;
- FootballData - “A hodgepodge of JSON and CSV Football data”;
- Football-Data.co.uk - free bets and football betting, historical football results and a betting odds archive, live scores, odds comparison, betting advice and betting articles;
- footballcsv - Historical soccer results in CSV format;
- football.db - A free and open public domain football database & schema for use in any (programming) language (e.g. uses plain datasets);
- Football Geek by Dinesh Vatvani (site now on hiatus);
- Football Lineups;
- Football xG;
- Guide to Football/Soccer data and APIs by Joe Kampschmidt;
- International football results from 1872 to 2020 by Mart Jürisoo;
- KPMG Football Benchmark player valuation data;
- Metrica Sports Tracking data;
- My Football Facts;
- Physio Room;
- PlusMinusData - play by play data from espn.com and sofifa.com;
- The Price of Football Master Spreadsheet - data from the finance/business aspect of football by Kieran Maguire;
- Rec.Sport.Soccer Statistics Foundation - Historical league tables and football results;
- RoboCup Soccer Simulator - RoboCup Soccer Simulator Data;
- SkillCorner broadcast Tracking data;
- SofaScore - live scores, lineups, standings and basic teams, coaches and players data;
- Soccer Video and Player Position Dataset - dataset of elite soccer player movements and corresponding videos. See the accompanying paper [link];
- Squawka;
- StatsBomb Open Data - Competitions and matches (with events);
- Stat Bunker;
- Stats Perform and Centre Circle Canadian Premier League Event data. See Google Drive [link];
- Transfer League;
- TransferMarkt;
- Twelve Football;
- wosostats - Data about women’s soccer from around the world;
- Understat shooting and meta data including player xG values. Data can be scraped using the understatr R package or from @NdyStats who makes this publicly available (see pinned tweet of his Twitter account for the latest version of this data);
- WhoScored? (data provided by Opta); and
- Wyscout Event data for the 17/18 season for the ‘Big 5’ European leagues, Euro 2016 Championship, and 2018 World Cup made available by Luca Pappalardo, Alessio Rossi, and Paolo Cintia. See their paper A public data set of spatio-temporal match events in soccer competitions.
Documentation
[TO ADD HERE]
Data Companies
Data Providers
- DataFactory
- InStat
- K-Sport
- Opta Sports
- smarterscout
- Sportlogiq
- Sportradar
- Stats Perform
- StatsBomb
- StrataBet (now defunct)
- TransferMarkt
- understat
- WhoScored? (data provided by Opta Sports)
- Wyscout
Tracking
- Catapult
- ChyronHego
- Metrica Sports
- Second Spectrum
- Signality
- SkillCorner
- STATS SportVU
- Kinexon
- Oliver
Video / Performance Analysis
- Analytics FC
- dataFootball
- ERIC Sports
- Futbolytics
- hudl
- LBi Dynasty
- LongoMatch
- MEDIACOACH
- nacsport
- Olocip
- SICO
- Wise
:classical_building: Libraries
Python
- codeball - data driven tactical and video analysis of soccer games;
- Football Packing - a Python package to calculate packing rate for a given pass in football by Samira Kumar. This is a variation of the metric created by Impect;
- kloppy - a Python package providing (de)serializers for soccer tracking- and event data, standardized data models, filters, and transformers designed to make working with different tracking- and event data like a breeze;
- matplotsoccer - a Python library for visualising soccer event data by Tom Decroos;
- mplsoccer - a Python library for drawing soccer/football pitches in Matplotlib and loading StatsBomb open-data by Andrew Rowlinson;
- nayra - an API that allows you to track soccer players from camera inputs, and evaluate them with an Expected Discounted Goal (EDG) Agent. See the Evaluating Soccer Player paper by Paul Garnier and Théophane Gregoir;
- northpitch - a Python football plotting library that sits on top of Matplotlib by Devin Pleuler;
- PCA_Player_Finder by Parth Athale;
- PySport, including PySport Soccer - collection of open-source sport packages including many of those mentioned in this section, by Koen Vossen;
- PyWaffle - an open source, MIT-licensed Python package for plotting waffle charts by Peter McKeever;
- Scrape-FBref-data - Python library to scrape StatsBomb data via FBref by Parth Athale, which in turn was updated from Christopher Martin’s repository;
- statsbombapi - a Python API wrapper and dataclasses for StatsBomb data;
- statsbombpy - a Python library written by Francisco Goitia to access StatsBomb data;
- statsbomb-parser - Python library to convert StatsBomb’s JSON data into easy-to-use CSV format;
- socceraction - a Python library for valuing the individual actions performed by soccer players. Includes an Expected Threat (xT) implementation by Tom Decroos et al.;
- soccermix - a soft clustering technique based on mixture models that decomposes event stream data into a number of prototypical actions of a specific type, location, and direction by Tom Decroos and ML-KULeuven;
- soccer_xg - a Python package for training and analyzing expected goals (xG) models in football;
- soccerplots - a Python package that can be used for making visualizations for football analytics by Slothfulwave;
- sync.soccer - a Python package to synchronise football datasets, so that an event in one dataset is matched to the corresponding event or snapshot in the other by Marek Kwiatkowski. This repository contains an implementation that aligns Opta’s (now STATS Perform’s) F24 feeds to ChyronHego’s Tracab files. More formats may be added in the future. See the following blog post for methodology [link];
- tmscrape - a Python TransferMarkt webscraper by danzn1; and
- Tyrone Mings - a Python TransferMarkt webscraper by FCrSTATS.
R
- ggsoccer - a soccer visualisation library in R from Ben Torvaney;
- worldfootballR - an R package to allow users to extract various world football results and player statistics data from FBref and valuations and transfer data from TransferMarkt.com by Jason Zivkovic; and
- understatr - an R package to scrape data from Understat.
GitHub Repositories
Python
- analytics-handbook by Devin Pleuler;
- Exploring spatio-temporal soccer events using public event data by Luca Pappalardo, Alessio Rossi, and Paolo Cintia. See the paper: A public data set of spatio-temporal match events in soccer competitions;
- expected_goals_deep_dive by Andrew Puopolo;
- Expected Goals Thesis by Andrew Rowlinson;
- Friends-of-Tracking-Data-FoTD;
- footballcsv - Historical soccer results in CSV format;
- football-crunching by Ricardo Tavares. Accompanying Medium posts [link];
- Google Research Football;
- LaurieOnTracking by Laurie Shaw - Python code for working with Metrica tracking data;
- Metrica-pitch-control by Will Thompson - a Python implementation of Javier Fernández and Luke Bornn’s Pitch Control model from their paper Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) and Will Spearman’s Pitch Control model from his paper Beyond Expected Goals (2018). The respective Google Colab notebooks are available [link] and [link];
- Pass-Flow - create animated flow velocity fields using passing data by Open Goal App;
- passing-networks-in-python - repository for building customizable passing networks with Matplotlib as part of the “Friends of Tracking” series. The code is prepared to use both eventing (StatsBomb) and tracking data (Metrica Sports);
- pitchly - Python Plotly wrapper for simple football plots by Vinay Warrier;
- SoccermaticsForPython - repo by David Sumpter dedicated to people getting started with Python using the concepts derived from the book Soccermatics;
- soccer_analytics by CleKraus - a Python project intended to facilitate, and be a starting point for, analytics projects in soccer, including EDA of Event data, goal kick analysis, passing analysis, xG modelling, and an introduction to Tracking data; and
- Valuing actions in football by Lotte Bransen and Jan Van Haaren of SciSports.
R
- FoundationsInR by Sudarshan Gopaladesikan - getting started with R using the StatsBomb dataset;
- soccerAnimate - an R package to create 2D animations of soccer tracking data; and
- soccermatics - an R package for the visualisation and analysis of soccer tracking and event data by Joe Gallagher.
Apps
- YouTubeCoder Event video tagging by FC Python;
- Statsbomb-Json-Parse by Rob Carroll. A small app that lets you input a StatsBomb JSON file and get a CSV file back (you need to create a free account to run it). For a video explainer, see the following [link];
- ALPHONSO 2.0 by Sam Goldberg and Mike Imburgio for American Soccer Analysis; and
- Soccer Analytics Library by Lars Maurath.
Video analysis
- Over 150 video analysis videos by Carlon Carpenter - see Google Drive [link].
:page_with_curl: Papers
The following Shiny App from Lars Maurath is a great tool for looking up publications [link].
2021
- Making Offensive Play Predictable Using a GCN to Understand Defensive Performance in Soccer by Paul Power, Michael Stöckl, and Thomas Seidel for Opta Pro Forum 2021. See the accompanying talk on Vimeo [link];
- Leaving Goals on the Pitch: Evaluating Decision Making in Soccer by Maaike Van Roy, Pieter Robberechts, Wen-Chi Yang, Luc De Raedt, and Jesse Davis. See the accompanying blog post [link] and research poster [link];
- Evaluating Soccer Player: from Live Camera to Deep Reinforcement Learning (2021) by Paul Garnier and Théophane Gregoir. See the nayra library for code.
2020
- Automatic Pass Annotation from Soccer Video Streams based on Object Detection and LSTM (2020) by Danilo Sorano, Fabio Carrara, Paolo Cintia, Fabrizio Falchi and Luca Pappalardo;
- A Framework for the Fine-Grained Evaluation of the Instantaneous Expected Value of Soccer Possessions (2020) by Javier Fernández, Luke Bornn and Daniel Cervone;
- A new look into Off-ball Scoring Opportunity: taking into account the continuous nature of the game (2020) by Hugo M. R. Rios-Neto, Wagner Meira Jr., Pedro O. S. Vaz-de-Melo;
- Cracking the Black Box: Distilling Deep Sports Analytics (2020) by Xiangyu Sun, Jack Davis, Oliver Schulte and Guiliang Liu;
- Deep Soccer Analytics: Learning an Action-Value Function for Evaluating Soccer Players (2020) by Guiliang Liu, Yudong Luo, Oliver Schulte and Tarak Kharrat;
- Game Plan: What AI can do for Football, and What Football can do for AI (2020) by Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, Will Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, Ali Eslami, Mark Rowland, Andrew Jaegle, Remi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, and Demis Hassabis;
- Google Research Football: A Novel Reinforcement Learning Environment (2020) by Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly. See the GitHub repo [link];
- Group Activity Detection From Trajectory and Video Data in Soccer (2020) by Ryan Sanford, Siavash Gorji, Luiz Hafemann, Bahareh Pourbabaee and Mehrsan Javan;
- Interpretable Prediction of Goals in Soccer (2020) by Tom Decroos and Jesse Davis;
- Inverse Reinforcement Learning for Team Sports: Valuing Actions and Players (2020) by Yudong Luo, Oliver Schulte and Pascal Poupart. See the code [link];
- Learning the Value of Teamwork to Form Efficient Teams (2020) by Ryan Beal, Narayan Changder, Timothy Norman, Sarvapali Ramchurn;
- Player Chemistry: Striving for a Perfectly Balanced Soccer Team (2020) by Lotte Bransen. See the accompanying Friends of Tracking video tutorials [link] and chapter 4 of the Barca Innovation Hub Football Analytics 2021 publication, titled: ‘How does context affect player performance in football?’ by Lotte Bransen, Pieter Robberechts, Jesse Davis, Tom Decroos, and Jan Van Haaren [link];
- Ready Player Run: Off-ball run identification and classification (2020) by Sam Gregory;
- The Right Place at the Right Time: Advanced Off-Ball Metrics for Exploiting an Opponent’s Spatial Weaknesses in Soccer (2020) by Sergio Llana, Pau Madrero and Javier Fernández;
- Optimising Game Tactics for Football (2020) by Ryan Beal, Georgios Chalkiadakis, Timothy Norman and Sarvapali Ramchurn;
- Routine Inspection: A Playbook for Corner Kicks (2020) by Laurie Shaw and Sudarshan ‘Suds’ Gopaladesikan. Accompanying talk - 2020 Harvard Sports Analytics Lab;
- Seeing in to the future: using self-propelled particle models to aid player decision-making in soccer (2020) by Fran Peralta, Pablo Piñones Arce, David Sumpter and Javier Fernández;
- SoccerMap: A Deep Learning Architecture for Visually-Interpretable Analysis in Soccer (2020) by Javier Fernández and Luke Bornn;
- SoccerMix: Representing Soccer Actions with Mixture Models (2020) by Tom Decroos, Maaike Van Roy and Jesse Davis;
- Soccer Analytics Meets Artificial Intelligence: Learning Value and Style from Soccer Event Stream Data (2020) by Tom Decroos;
- The Tactics of Successful Attacks in Professional Association Football: Large-Scale Spatiotemporal Analysis of Dynamic Subgroups Using Position Tracking Data (2020) by Floris Goes, Michel Brink, Marije Elferink-Gemser, Matthias Kempe and Koen Lemmink;
- Using Player’s Body-Orientation to Model Pass Feasibility in Soccer (2020) by Adrià Arbués-Sangüesa, Adrián Martín, Javier Fernández, Coloma Ballester and Gloria Haro;
- Valuing On-the-Ball Actions in Soccer: A Critical Comparison of xT and VAEP (2020) by Maaike Van Roy, Pieter Robberechts, Tom Decroos and Jesse Davis;
2019
- Actions Speak Louder Than Goals: Valuing Player Actions in Soccer (2019) by Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis. See the accompanying presentation at SIGKDD 2019 by Tom Decroos [link];
- Decomposing the Immeasurable Sport: A deep learning expected possession value framework for soccer (2019) by Javier Fernández, Luke Bornn, and Dan Cervone. Accompanying talks - SSAC19, StatsBomb conference;
- Dynamic Analysis of Team Strategy in Professional Football (2019) by Laurie Shaw and Mark Glickman. Accompanying talks - NESSIS 2019, 2020 Google Sports Analytics Meetup;
- Measuring soccer players’ contributions to chance creation by valuing their passes (2019) by Lotte Bransen, Jan Van Haaren, and Michel van de Velden;
- Modelling the Collective Movement of Football Players (2019) by Fran Peralta; and
- Player Vectors: Characterizing Soccer Players’ Playing Style from Match Event Streams (2019) by Tom Decroos and Jesse Davis.
2018
- Beyond Expected Goals (2018) by Will Spearman;
- Chance involvement in goal scoring in football (2018) by Martin Lames;
- Predicting football results using machine learning techniques (2018) by Corentin Herbinet;
- Replaying the NBA (2018) by Luke Bornn;
- Wide Open Spaces: A statistical technique for measuring space creation in professional soccer (2018) by Javier Fernandez and Luke Bornn;
- Spatial analysis of shots in MLS: A model for expected goals and fractal dimensionality (2018) by Alexander Fairchild, Konstantinos Pelechrinis, and Marios Kokkodis; and
- High-resolution shot capture reveals systematic biases and an improved method for shooter evaluation (2018) by Rachel Marty.
2017
- Physics-Based Modeling of Pass Probabilities in Soccer (2017) by Will Spearman, Austin Basye, Greg Dick, Ryan Hotovy, and Paul Pop;
- Data-Driven Ghosting using Deep Imitation Learning (2017) by Hoang M. Le, Peter Carr, Yisong Yue, and Patrick Lucey;
- Valuing passes in football using ball event data (2017) by Lotte Bransen;
- “The Leicester City Fairytale?”: Utilizing New Soccer Analytics Tools to Compare Performance in the 15/16 & 16/17 EPL Seasons (2017) by Hector Ruiz, Paul Power, Xinyu Wei, and Patrick Lucey;
- Not all passes are created equal: objectively measuring the risk and reward of passes in soccer from tracking data (2017) by Paul Power, Hector Ruiz, Xinyu Wei, and Patrick Lucey. See Paul Power’s talk [link] (downloadable MP4), and the webpage [link];
- Plus-Minus Player Ratings for Soccer (2017) by Tarak Kharrat, Javier Pena, and Ian McHale;
- An examination of expected goals and shot efficiency in soccer (2017) by Alex Rathke; and
- Predicting goal probabilities for possessions in football (2017) by Nils Mackay.
2016
- Spatio-Temporal Analysis of Team Sports – A Survey (2016) by Joachim Gudmundsson and Michael Horton;
- Valuing Individual Player Involvements in Norwegian Association Football (2016) by Olav Nørstebø, Vegard Rødseth Bjertnes, and Eirik Vabo; and
- Expected Goals in Soccer (2016) by Harm Eggels.
2015
- “Quality vs Quantity”: Improved Shot Prediction in Soccer using Strategic Features from Spatiotemporal Data (2015) by Patrick Lucey, Alina Bialkowski, Mathew Monfort, Peter Carr, and Iain Matthews;
- Quantifying Shot Quality in the NBA by ; and
- Soccer video and player position dataset (2015) by S. A. Pettersen, D. Johansen, H. Johansen, V. Berg-Johansen, V. R. Gaddam, A. Mortensen, R. Langseth, C. Griwodz, H. K. Stensland, and P. Halvorsen. See the accompanying webpage [link].
2014
- Large-Scale Analysis of Soccer Matches using Spatiotemporal Tracking Data (2014) by Alina Bialkowski, Patrick Lucey, Peter Carr, Yisong Yue, Sridha Sridharan, and Iain Matthews.
2011
- A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains (2011) by Sarah Rudd. Accompanying NESSIS talk on Metacafe [link]; and
- An Extension of the Pythagorean Expectation for Association Football (2011) by Howard Hamilton.
1997
- Modelling Association Football Scores and Inefficiencies in the Football Betting Market (1997) by Mark Dixon and Stuart Coles.
:books: Written Pieces
Highly Rated and Recommended Pieces
Many of these blog posts are recommended in Sam Gregory’s Best Football Analytics Pieces piece and Tom Worville’s “What’s the best Football Analytics piece you’ve ever read?”.
- Assessing The Performance of Premier League Goalscorers by Sam Green;
- Counting Across Borders by Ben Torvaney;
- Is Soccer Wrong About Long Shots? by John Muller;
- Defending Your Patch by Thom Lawrence;
- The DePO Models: Bringing Moneyball to Professional Soccer by Sam Goldberg and Mike Imburgio;
- Using Data to Analyse Team Formations by Laurie Shaw;
- Structure in football: putting formations into context by Laurie Shaw;
- Inside Arsenal’s Attack: In-Depth Analysis Of Arteta’s Problems & Possible Solutions by Ashwin Raman;
- Premier League Projections and New Expected Goals by Michael Caley;
- Introducing Passing Combinations by Piotr Wawrzynów;
- Pass Footedness in the Premier League by James Yorke;
- Messi Walks Better Than Most Players Run by Bobby Gardiner;
- Soccer Analytics 101 by Kevin Minkus;
- An Introduction to Soccer Analytics by John Muller;
- Valuing On-the-Ball Actions in Soccer: A Critical Comparison of xT and VAEP by Jesse Davis, Tom Decroos, Pieter Robberechts, and Maaike Van Roy;
- Game of Throw-Ins by Eliot McKinley;
- Expected Threat by Karun Singh. See also Karun’s Twitter thread (also available as an unrolled thread [link]) for the many resources on this topic, including: Episode 19 of The Football Fanalytics Podcast, Karun’s StatsBomb conference presentation [link] and slides [link], Rob Hickman’s StatsBomb conference presentation, in which he extended xT to take defensive risk into account [link], the blog post by Last Row View (Ricardo Tavares) on evaluating off-the-ball player movements by combining xT and tracking data, and Karun’s xT values as a 12x8 grid, downloadable as a JSON file [link];
- Lionel Messi’s ten stages of greatness by Michael Cox and Tom Worville;
- Passing Out at the Back by Will Gürpinar-Morgan;
- The 10 Commandments of Football Analytics by Tom Worville;
- Borussia Dortmund - What’s gone wrong? by Colin Trainor for StatsBomb;
- Breaking Down Set Pieces: Picks, Packs, Stacks and More by Euan Dewar;
- Data Based Coaching: How to Incorporate Data-Driven Decision into Your Coaching Workflow by Kieran Doyle; and
- Coaches Reward Goalscorers. But Should They? by Eliot McKinley and John Muller.
:pencil2: Blogs and Data Analytics Websites
- 11tegen11 by 11tegen (Sander IJtsma);
- 21st Club - blog posts available in hard-copy form in their Changing the Conversation series;
- 2+2=11 by Will Gürpinar-Morgan;
- 5 Added Minutes by Omar Chaudhuri (last updated 03/09/2016);
- 8 Yards 8 Feet by Simon Lock;
- All Things Football;
- Absolute Unit;
- American Soccer Analysis;
- Analyse Football by Ravi Ramineni (last updated 06/04/2015);
- Analytics FC;
- Attacking Center-back by JP Quinn;
- Barça Innovation Hub;
- Brendan Kent. Check out his Sports Analytics 101 series;
- Carey Analytics by Mark Carey;
- DeepxG by Thom Lawrence (last updated 29/11/2017);
- Differentgame by Paul Riley;
- DTAI Sports Analytics Lab by KU Leuven;
- The Economics of Sport;
- EFL Numbers by EFL Numbers;
- EightyFivePoints by Laurie Shaw;
- Experimental 361 by Ben Mayhew;
- FC Python by FC Python;
- FiveThirtyEight Sports;
- Football Crunching by Ricardo Tavares;
- Football Data Science by Dr. Garry Gelade;
- Football Philosophy by Joost van der Leij;
- Football Science by Michael C. Rumpf;
- Football Whispers;
- The Futebolist by Ashwin Raman;
- Get Goalside!;
- The Harvard Sports Analysis Collective;
- Hockey Graphs;
- Hudl;
- James W Grayson by James W Grayson;
- Jan Van Haaren by Jan Van Haaren;
- jogall.github.io by Joe Gallagher;
- Karun Singh by Karun Singh;
- kubamichalczyk.github.io by Kuba Michalczyk;
- kwiatkowski.io by Marek Kwiatkowski;
- lufcdata by @LUFCDATA;
- LukeBornn.com by Luke Bornn;
- Mackay Analytics by Nils Mackay;
- Mackinaw Stats by Mackinaw Stats;
- Mark’s Notebook by Mark Thompson;
- MRKT Insights with Tim Keech, Ram Srinivas, Matt Lawrence, Kevin Elphick, and Andy McGregor. Formerly Jay Socik;
- Ninad Barbadikar Medium blog by Ninad Barbadikar;
- North Yard Analytics by Dan Altman;
- openGoal by Charles William;
- Opta Pro - old blogs removed but can be found using Wayback Machine;
- patricklucey.com by Patrick Lucey;
- Pena.lt/y by Martin Eastwood;
- Piotr Wawrzynów – Football Analysis by Piotr Wawrzynów;
- Proform AFC by Proform Analytics (Mladen Sormaz and Dan Nichol);
- Ravi Mistry’s Medium blog;
- robert-hickman.eu;
- SaddlersStats;
- Sam Gregory Medium blog;
- SciSports;
- Soccermatics Medium blog by David Sumpter;
- soccerNurds;
- space space space;
- StatDNA (last updated 01/06/2011 before Arsenal bought the company);
- StatsBomb;
- Stats Perform;
- Stats and snakeoil by Ben Torvaney;
- The Analyst by Stats Perform;
- The Last Man Analytics by The Last Man Analytics (Ciaran Grant);
- The Power of Goals;
- Training Ground Guru. Check out their accompanying podcast [link];
- Tom Worville Medium blog by Tom Worville (last updated 14/08/2017). Tom now writes for The Athletic [link];
- winningwithanalytics.com by Bill Gerrard;
- Wooly Jumpers for Goal Posts by The Woolster;
- Wyscout;
- xG per Shot by Parthe Athale; and
- Zonal Marking by Michael Cox. Michael now writes for The Athletic [link].
Newsletters
- 21st Club;
- Absolute Unit;
- Get Goalside!;
- geom_mark;
- Grace on Football by Grace Robertson;
- Looks Good on Paper by Felix Pate;
- Measurables by Brendan Kent;
- No Grass in the Clouds;
- Soccer Analytics Newsletter;
- space space space by John Muller; and
- Stats Perform.
:newspaper: News Articles
- Kevin De Bruyne uses data analysts to broker £83m Man City contract without agent (08/04/2021) by David McDonnell for The Mirror;
- La extraña renovación de De Bruyne: sin agente y usando el ‘big data’ para calcular su salario (07/04/2021) for Marca;
- From scouting players on sidelines to sofas – Meet the WyScout generation transforming football analytics (07/04/2021) by Pete Hall for iNews;
- Meet Ram Srinivas, The Biggest Wes Hoolahan Fanatic In India (27/03/2021) by Fiachra Gallagher for Balls.ie;
- Soccer-From blogging to the dressing room - the rise of the new analysts (25/03/2021) by Simon Evans for Reuters;
- Premier League club Manchester City hire astrophysicists (24/03/2021) by Alfredo Relaño for AS;
- Manchester City will have astrophysicists in their ranks for Marca;
- It IS rocket science! Manchester City hire astrophysicists to their data analysis team in bid to move Premier League leaders further ahead of their rivals (22/03/2021) by Jack Gaughan for The Daily Mail;
- Liverpool sign up for StatsBomb 360: Ted Knutson explains why this stats revolution will change the game (18/03/2021) by Adam Bate for Sky Sports News;
- Data experts are becoming football’s best signings (05/03/2021) by Justin Harper for BBC News;
- How a Celtic blogger nurtured by Brendan Rodgers is now lifting Leicester City (27/02/2021) by Tom Roddy for The Times;
- 17-Year-Old Man Lands Dream Job Of Getting Paid To Watch Football All Day by Adnan Riaz for Sport Bible;
- Aged 17 and getting paid to watch football all day (04/02/2021) by Manish Pandey for BBC News;
- Man City’s Big Winter Signing Is a Former Hedge Fund Brain (31/01/2021) by David Dellier and Adam Blenford for Bloomberg;
- How data is pushing Twitter scouts and bloggers into football’s big time (27/02/2021) by Paul MacInnes for The Guardian;
- Revealed: expected goals being used in football’s war against match-fixing (13/02/2021) by Sean Ingle for The Guardian;
- ‘What we do isn’t rocket science’: how Midtjylland started football’s data revolution (25/10/2020) by Sean Ingle for The Guardian;
- How a teenager from Bangalore became a performance analyst for Dundee United (23/12/2020) by Tim Wigmore for The Telegraph;
- How the volunteers of data website Transfermarkt became influential players at European top football clubs (18/12/2020) by Pepihn Keppel and Tom Claessens;
- Colin Trainor: from bigging up Klopp to the little details of the GAA (17/10/2020) by Kenny Archer for The Irish Times;
- REVEALED: The data scientist, astrophysicist, chess champion, and doctor in theoretical physics who are behind Liverpool’s title-winning success… they may look a ‘little nerdy’ but this Fab Four prove it is rocket science! (27/06/2020) by Rob Draper and Adam Shafiq for The Daily Mail;
- How analysts have used lockdown to unearth football’s next hidden gems (17/07/2020) by Dan Clark in The Times;
- Behind the Badge: The physicist who leads Liverpool’s data department (15/06/2020) by Sam Williams for LiverpoolFC.com;
- How Soccer Scouting Has Changed, And Why It’s Never Going Back (15/05/2020) by Robert Kidd for Forbes;
- ‘Expected threat’, ‘width per sequence’ – the statistical metrics you haven’t heard of (13/02/2020) by Dan Clark for The Times;
- How Brentford flipped the script and staged a data revolution to become England’s smartest club (24/01/2020) by Sean Ingle for Talksport;
- ‘It’s the boffins what won it!’: Data experts plus Jurgen Klopp’s charisma turn Liverpool into the kings of Europe (02/06/2019) by Joe Bernstein for The Mail on Sunday;
- How Data (and Some Breathtaking Soccer) Brought Liverpool to the Cusp of Glory (22/05/2019) by Bruce Schoenfeld for The New York Times;
- Brexit Could Drastically Change English Soccer (11/12/2018) by Laurie Shaw for FiveThirtyEight;
- Soccer’s Moneyball Moment: How Enhanced Analytics Are Changing The Game (19/11/2018) by Robert Kidd for Forbes;
- 2018 World Cup: Prediction Time; Up Against The Machine (13/06/2018) by Bobby McMahon for Forbes;
- Home advantage, unconscious bias and the boisterous crowds who influence referees (23/04/2018) by Tim Wigmore for iNews;
- The Premier League is losing its competitive balance – that should be cause for concern (02/02/2018) by Tim Wigmore for iNews;
- Expected goals and Big Football Data: the statistics revolution that is here to stay (03/03/2017) by Paul MacInnes in The Guardian;
- How computer analysts took over at Britain’s top football clubs (09/03/2014) by Tim Lewis for The Observer;
- How data analysis helps football clubs make better signings (01/11/2018) by John Burn-Murdoch for The FT;
- A football revolution (17/07/2011) in The FT [pay wall]; and
- A working life: The quantitative analyst (11/06/2011) by Graham Snowdon for The Guardian.
:vhs: Videos
I have collated over 800 Sports Analytics / Data Science videos into a single YouTube playlist, originally for my own viewing, but it may be useful to you; see [link]. For Football-specific Data Science lectures and seminars, see [link]. For a Tableau Football-specific playlist, see [link].
:man_teacher: Webinars and Lectures
- Laurie Shaw’s Metrica Sports Tracking data series for #FoT - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link];
- Lotte Bransen and Jan Van Haaren’s ‘Valuing Actions in Football’ series for #FoT - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link];
- David Sumpter’s Expected Goals webinars for #FoT - How to Build An Expected Goals Model 1: Data and Model, How to Build An Expected Goals Model 2: Statistical fitting, and The Ultimate Guide to Expected Goals. See the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit;
- Peter McKeever’s ‘Good practice in data visualisation’ webinar for #FoT. See the following for code [link];
- StatsPerform AI in Sport series - Overview, AI in Basketball, AI In Soccer, and AI in Tennis;
- Making Offensive Play Predictable by Paul Power, Michael Stöckl, and Thomas Seidel for Opta Pro Forum 2021;
- Google Research Football by Piotr Stanczyk;
- Will Spearman’s masterclass in Pitch Control for Friends of Tracking;
- How Tracking Data is Used in Football and What are the Future Challenges with Javier Fernández, Sudarshan ‘Suds’ Gopaladesikan, Laurie Shaw, Will Spearman and David Sumpter for Friends of Tracking;
- Why Do Clubs Need to Embrace Analytics to Stay Competitive? with Vosse de Boode, David Sumpter, Adrien Tarascon and Javier Fernández for Barca Innovation Hub;
- Valuing Actions in Football: Introduction with Lotte Bransen and Jan Van Haaren for Friends of Tracking;
- Routine Inspection: Measuring Playbooks for Corner Kicks by Laurie Shaw and Sudarshan ‘Suds’ Gopaladesikan;
- Tactical Insight Through Team Personas by David Perdomo Meza and Daniel Girela. See accompanying blog post [link];
- Training Ground Guru webinars;
- Christmas Lectures 2019: How to Get Lucky with Hannah Fry. Small segment with Tim Waskett @ 27mins;
- I’m in a Wide Open Space: Creating Opportunities at Set Pieces by Dan Barnett;
- Long or Short? How the New Short Goal Kick Rule Is Impacting Football by Tom Worville;
- Learning to Watch Football: Self-Supervised Representations for Tracking Data by Karun Singh. See accompanying blog post [link];
- Identifying and Evaluating Strategies to Break down a Low Block Defence (https://vimeo.com/404694721/21fa93ada1) by Vignesh Jayanth. See accompanying blog post [link];
- Seeing in to the Future: Modelling Football Player Movements by David Sumpter;
- Learning Value and Style from Soccer Event Stream Data by Tom Decroos;
- Marcelo Bielsa’s infamous ‘Spygate’ PowerPoint presentation of Derby County [link];
- Tom Goodall’s Tactics, Training & Tableau: Football Tableau User Group. Check out his Football Tableau training courses [link];
- Data Robot Opening Remarks & Keynote: Making Better Decisions, Faster with Brian Prestidge;
- A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains by Sarah Rudd. Accompanying slides [link];
- Demystifying Tracking data Sportlogiq webinar by Sam Gregory and Devin Pleuler;
- Data Analytics in Soccer by Dan Fradley;
- How Hammarby create the mathematically perfect pressing game by David Sumpter;
- Hudl Presents: Performance Analysis in 2020;
- Self-Supervised Representations for Tracking Data by Karun Singh;
- An American Analyst in London at SSAC 2019 with StatsBomb CEO Ted Knutson and Houston Rockets GM Daryl Morey;
- Beyond the Baseline by Marek Kwiatkowski;
- Some Things Aren’t Shots by Thom Lawrence;
- Beyond Save Percentage by Derrick Yam;
- Expected goals demonstration by Sander IJtsma;
- Goals change games by Garry Gelade; and
- Expected goals by Dan Altman.
TED Talks
- What Football Analytics can Teach Successful Organisation by Rasmus Ankersen;
- Soccermatics: how maths explains football by David Sumpter; and
- Changing the soccer transfer market with big data by Giels Brouwer.
Documentaries
- The Numbers Game: How Data Is Changing Football - FourFourTwo Documentary;
- How Stats Won Football: From Moneyball to FC Midtjylland – COPA90 Stories Documentary.
Match Highlights
- Footballia - historical matches and highlights
Others
:tv: YouTube Channels
- Friends of Tracking with David Sumpter, Javier Fernández, Laurie Shaw, Sudarshan ‘Suds’ Gopaladesikan, Pascal Bauer, and Fran Peralta;
- McKay Johns;
- Barça Innovation Hub (English and Spanish);
- Mark Glickman – for NESSIS talks, uploaded to his personal channel. Old talks are available on his Metacafe channel. See the official website [link];
- 42 Analytics – for SSAC conferences;
- CMU Statistics;
- StatsBomb;
- Opta - including Opta Pro Forum talks;
- STATS Insights;
- Tifo Football;
- Football Whispers;
- Football Player Ratings by Lars Magnus Hvattum;
- The Coaches’ Voice; and
- Ninad Barbadikar’s YouTube channel [link].
:books: Books
- Moneyball: The Art of Winning an Unfair Game by Michael Lewis;
- The Numbers Game by Chris Anderson and David Sally;
- Football Hackers by Christoph Biermann;
- Soccermatics by David Sumpter;
- Soccernomics by Simon Kuper and Stefan Szymanski;
- Money and Football: A Soccernomics Guide by Simon Kuper and Stefan Szymanski;
- Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football by Wayne Winston;
- Data Analytics in Football by Daniel Memmert and Dominik Raabe;
- Changing the Conversation series by 21st Club;
- Football Decoded: Using Match Analysis & Context to Interpret the Demands by Paul Bradley;
- Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers by Ben Alamar;
- Outside the Box by Duncan Alexander;
- [Opta World Football Infographics: The Beautiful Game in Brilliant Detail](https://www.amazon.co.uk/World-Football-Infographics-Adrian-Besley/dp/1780977727) by Adrian Besley;
- Zonal Marking: The Making of Modern European Football by Michael Cox;
- The Mixer: The Story of Premier League Tactics, from Route One to False Nines by Michael Cox;
- Inverting the Pyramid by Jonathan Wilson;
- Sprawlball: A Visual Tour of the New Era of the NBA by Kirk Goldsberry; and
- Numbers Don’t Lie: New Adventures in Counting and What Counts in Basketball Analytics by Yago Colás.
Magazines:
:loud_sound: Podcasts
Spotify and YouTube links used where available.
Football Analytics Podcasts
- All Stats Aren’t We with Jon Mackenzie and Josh Hobbs (Leeds United Podcast);
- American Soccer Analysis;
- Analytics FC Podcast;
- Big Data Sports (Spanish) by Marcelo Gantman and Agustin Mario Gimenez;
- The Dan & Omar Show with Daniel Geey and Omar Chaudhuri;
- Double Pivot Podcast;
- Differentgame - The Football Analytics Podcast by Paul Riley and Richard Shephard;
- Expected Value;
- Fanalytics with Mike Lewis;
- First Time Finish Podcast with Tom Underhill, Bence Bocsak, and Ninad Barbadikar;
- The Football Fanalytics Podcast;
- Football Today;
- Laptop Gurus;
- Looks Good on Paper podcast by Felix Pate;
- MRKT Insights with Tim Keech, Ram Srinivas, Matt Lawrence, Kevin Elphick, and Andy McGregor. Formerly Jay Socik;
- Measurables Podcast by Brendan Kent;
- Open Source Sports with Ron Yurko;
- The Scouted Football Podcast;
- smarterscout: The Why in Analytics by Dan Altman;
- Squawka Talker Football Podcast;
- StatsBomb;
- The SV Podcast;
- Target Scouting by Luke Griffin;
- Tifo Podcast;
- Training Ground Guru;
- Three At The Back by Opta Pro; and
- Zonal Marking with Michael Cox, Tom Worville and Ali Maxwell.
Notable Episodes (including non-football-data-specific podcasts)
- All Stats Aren’t We:
- Analytics FC Podcast:
- Big Data Sports (Spanish) by Marcelo Gantman and Agustin Mario Gimenez:
- 87: No es Moneyball: es Brentford
- 66: Tres Libros Sobre Sports Analytics Más Allá De Moneyball
- 65: Métrica Sports: La máquina de entender el juego with Bruno Dagnino
- 56: STATS PERFORM: Cómo es el nuevo gigante de los datos del fútbol
- 47: Wyscout: 550 Mil Futbolistas “concentrados” En Un Software
- 35: Analistas: Los nuevos “cracks” del fútbol
- 33: Google + IA = Fútbol en Real Time
- Challengers Podcast:
- Expected goals (2016)
- The Conor J Show:
- The Derby County BlogCast
- January window preview with Ram Srinivas (MRKT Insights)
- Expected Value
- Explore Explain with Andy Kirk:
- Fanalytics with Mike Lewis:
- Getting Your Foot in the Door with Sean Steffen
- Freakonomics by Stephen J. Dubner:
- Can Britain Get Its “Great” Back? (Ep. 393) featuring Dr. Ian Graham @ 41m25s;
- Football CFB Podcast:
- The Football Collective Podcast:
- The Football Pod:
- Football Today
- Manchester City Enters Data Arms Race With Liverpool
- How Safe is Football’s Data
- Who Owns Football’s Data
- Can Data Save Manchester United? Featuring Tom Worville
- Tony Bloom: The Betting Guru Running Brighton
- Modern Soccer Coach Podcast with Gary Curneen:
- Not The Top 20 Podcast:
- The Nutmegged Arena by The Nutmeg Assist:
- Open Source Sports with Ron Yurko:
- Player Chemistry in Soccer with Lotte Bransen
- The Ornstein & Chapman Podcast with David Ornstein and Mark Chapman:
- Pacey Performance Podcast with Robert Pacey:
- #340 What is data science (and what isn’t), data informed decision making with Sudarshan Gopaladesikan - Spotify and YouTube;
- The PinkUn Norwich City Podcast:
- Pinnacle Podcast:
- The Process with James Allcott:
- The Scouted Football Podcast:
- Soccer Player Development Podcast:
- Episode 12 with Rasmus Ankersen - YouTube
- Squawka Talker Football Podcast:
- Tifo Podcast:
- The Transfer Market & 21st Club with Omar Chaudhuri - Spotify and YouTube
- How Memphis Depay Used Data to Find His Next Club with Giels Brouwer - Spotify and YouTube
- How Do Football Clubs Actually Use Statistics? - YouTube
- JJ Bull: Tactical Analysis & Coaching Badges - Spotify and YouTube
- A Day in the Life Of: A Football Recruitment Analyst - Spotify and YouTube
- Liverpool: Pressing, xG Concerns, and Klopp’s Future - Spotify and YouTube
- Understanding Stats in Football with Nikos Overheul - Spotify and YouTube
- Steve Morison: Tactical Insight & Football Psychology - Spotify and YouTube
- Football Tactics with Michael Cox (Zonal Marking) - Spotify and YouTube
- Football, Tactics & History with Jonathan Wilson - Spotify and YouTube
- The Future of Stats: xG, xA - Spotify and YouTube
- The Totally Football Show with James Richardson
- 03/07/2019: Football Hackers with Christoph Biermann
- Total Soccer Show:
- Soccer stats and analytics with Ted Knutson (@mixedknuts) (in which Ted explains Expected Goals to Daryl) - YouTube
- Mike L. Goodman (@TheM_L_G) talks USMNT tactical options, EPL trends, Expected Goals - YouTube
- Everton Premier League preview: Mike L. Goodman (@TheM_L_G) talks Silva’s style, Moise Kean, and replacing - YouTube
- Trademate Sports:
- UCN/USF Sport Management - Sports Business Podcast:
- Where Others Won’t by Cody Royle:
:man_technologist: Notable Figures / Twitter Accounts
- 2020 Analytics Twitter Top 1,000 Power Rankings, calculated by Will Thomson. See the Twitter list created by Luton Town Analytics [link];
- Sports Analytics Twitter list by Jan Van Haaren;
- Soccer People by John Muller;
- Football Analysts Twitter list by Colin Trainor;
- Opta Staff Twitter list by Opta;
- Football Analyst Community Rankings dashboard by Neil Charles; and
- Football data Analysts spreadsheet by Dan Altman (few years old now but lists the OGs of football analytics).
Career Advice
- Getting into Sports Analytics by Sam Gregory;
- Getting into Sports Analytics 2.0 by Sam Gregory;
- Best Football Analytics Pieces by Sam Gregory;
- How to become a football data scientist – Friends of Tracking with Pascal Bauer, Javier Fernández, Suds Gopaladesikan, Fran Peralta, and David Sumpter;
- HANIC Panel “How to get into Sports Analytics & Media + Analytics” with Alison Lukan, Sarah Bailey, Harman Dayal, Asmae Toumi, and Mike Johnson;
- You Want to be a Performance Analyst? by The Video Analyst;
- What do you need to learn to work in football analytics? by David Sumpter for Barca Innovation Hub;
- Careers in Sports Analytics;
- Fanalytics podcast with Mike Lewis - Getting Your Foot in the Door with Sean Steffen;
- Tom Worville Twitter thread; and
- Will Spearman’s Twitter thread.
:spiral_calendar: Events and Conferences
- OptaPro Analytics Forum;
- StatsBomb Conference;
- Barça Sports Tomorrow, Sports Analytics Summit, and Sports Technology Symposium;
- MIT Sloan Sports Analytics Conference;
- New England Symposium on Statistics in Sports (NESSIS);
- Carnegie Mellon Sports Analytics Conference;
- CASSIS;
- Tactical Insights 2020 Conference at King Power Stadium;
- Workshop on Artificial Intelligence in Team Sports (AITS);
- Workshop on Machine Learning and Data Mining for Sports Analytics;
- International Workshop on Computer Vision in Sports;
- Google Sports Analytics Meetup;
- DFB Hackathon;
- PSG Sports Analytics Challenge;
- Football Data International Forum;
- Global Training Camp;
- Great Lakes Analytics Conference;
- MathSport International;
- Sports Analytics World Series; and
- Sportdata & Performance Forum.
Competitions
Includes non-football competitions.
- NFL Big Data Bowl (American Football) - 2021 - annual;
- Big Data Cup (Hockey) - annual;
- Google Research Football with Manchester City F.C. - October 2020; and
- Liverpool Analytics Challenge (Football) - May 2020. Challenge used Last Row Tracking-like data kindly provided by Ricardo Tavares. For a full list of entries, see David Sumpter’s Medium post [link], featuring the three eventual winners - Surya Kocherlakota, Theophane Gregoir and Paul Garnier, and Gabin Rolland (discussed on #FoT [link]).
Courses
- Mathematical Modelling of Football by Uppsala University;
- StatsBomb Academy;
- Sport Analytics and Technologies MSc at Loughborough University, taught by Donald Barron;
- Football Analytics short course by StatsPerform with Birkbeck University; and
- Barça Innovation Hub.
:briefcase: Jobs
- The Video Analyst - Rob Carroll posts many of the jobs going in football on his own website. Make sure to also follow him on Twitter (@thevideoanalyst);
- City Football Insights;
- Opta;
- Stats Perform Job Opportunities and link;
- StatsBomb;
- Wyscout and careers@wyscout.com;
- Hudl;
- Metrica Sports;
- Second Spectrum;
- SciSports;
- Football Radar;
- Genius Sports and link;
- Gracenote and link;
- Global Sports;
- Smart Odds; and
- FutbolJobs.
:key: Key Concepts
References to resources organised by topic.
Expected Goals (xG) Modeling
Videos
For a playlist of Expected Goals related videos available on YouTube, see the following playlist I have created [link].
- What is xG? by Tifo Football;
- Opta Expected Goals by The Analyst (formerly Opta);
- What are Expected Goals? by David Sumpter and Axel Pershagen;
- Anatomy of a Goal by Numberphile (Brady Haran);
- How Did These Goals Go In? - We Explain How Goal Probability Works by the Bundesliga;
- Soccer Analytics: Expected Goals by Dan Altman; and
- Anatomy of an Expected Goal by 11tegen (Sander IJtsma).
Webinars and Lectures
- David Sumpter’s Expected Goals webinars for #FoT (see the following for code 3xGModel, 4LinearRegression, 5xGModelFit.py, and 6MeasuresOfFit);
- “Is Our Model Learning What We Think It Is?” Estimating the xG Impact of Actions in Football by Tom Decroos from the 2019 StatsBomb Innovation in Football Conference;
- StatsBomb Data Launch - Beyond Naive xG by Ted Knutson.
Tutorials
- Tech how-to: build your own Expected Goals model by Jan Van Haaren and SciSports;
- Fitting your own football xG model by Dato Fútbol (Ismael Gómez Schmidt). See GitHub repo [link];
- Python for Fantasy Football series by Fantasy Futopia (Thomas Whelan). See the following posts:
- Building an Expected Goals Model in Python by Peter McKeever (using WayBackMachine);
- An xG Model for Everyone in 20 minutes (ish) by Football Fact Man (Paul Riley).
Notable Models
Written pieces
For a list of Expected Goals literature collated by Keith Lyons, see the following [link].
- xG explained by FBref;
- What are expected Goals? by American Soccer Analysis;
- David Sumpter’s Expected Goals pieces:
- Michael Caley’s Expected Goals pieces:
- Jesse Davis and Pieter Robberechts’ Expected Goals pieces for KU Leuven;
- Unexpected goals by Will Gürpinar-Morgan;
- Great Expectations by Will Gürpinar-Morgan;
- On single match expected goal totals by 2+2=11 (Will Gürpinar-Morgan);
- Martin Eastwood (Pena.lt/y)’s Expected Goals pieces [link]:
- Expected Goals For All;
- Actual Goals Versus Expected Goals;
- Expected Goals Updated;
- Expected Goals: The Y Axis;
- Expected Goals And Exponential Decay;
- Expected Goals: Foot Shots Versus Headers;
- Expected Goals And Support Vector Machines;
- Expected Goals and Uncertainty; and
- Sharing xG Using Multi-touch Attribution Modelling.
- Garry Gelade’s Expected Goals pieces:
- Expected Goals and Unexpected Goals (using WayBackMachine);
- Assessing Expected Goals Models. Part 1: Shots (using WayBackMachine);
- Assessing Expected Goals Models. Part 2: Anatomy of a Big Chance (using WayBackMachine);
- How StatsBomb Data Helps Measure Counter-Pressing by Will Gürpinar-Morgan;
- Introducing xGChain and xGBuildup by Thom Lawrence;
- Quantifying finishing skill by Marek Kwiatkowski;
- The Dual Life of Expected Goals (Part 1) by Mike L. Goodman;
- A close look at my new Expected Goals Model by 11tegen (Sander IJtsma) (using WayBackMachine);
- An analysis of different expected goals models by Benjamin Cronin;
- Expected Goals 3.0 Methodology by Matthias Kullowatz;
- Explaining and Training Shot Quality by Ted Knutson;
- A simple Expected Goals model by Cricket Savant;
- How we calculate Expected Goals (xG) by Fantasy Football Fix; and
- Una mirada al Soccer Analytics usando R — Parte III by Dato Fútbol (Ismael Gómez Schmidt).
Libraries
- soccer-xg by Jesse Davis and Pieter Robberechts.
GitHub repos
- Expected Goals Thesis by Andrew Rowlinson. See both his thesis [link] and the accompanying notebooks;
- expected_goals_deep_dive by Andrew Puopolo. See the accompanying notebooks;
- soccer_analytics by Kraus Clemens. See the accompanying notebooks; and
- xg-model by Dato Fútbol (Ismael Gómez Schmidt).
Podcasts
- The Future of Stats: xG, xA - Spotify and YouTube by Tifo Podcast;
- #56: Dominic Calvert-Lewin & Explaining Expected Goals - Spotify and YouTube by The Scouted Football Podcast; and
- #1: What Did You Expect? - Spotify by The Football Fanalytics Podcast.
Tweets
- The benefits of including fake data in an Expected Goals model [link].
Tracking data
- Laurie Shaw’s Metrica Sports Tracking data series for #FoT - Introduction, Measuring Physical Performance, Pitch Control modelling, and Valuing Actions. See the following for code [link];
Possession Value (PV) Frameworks
Expected Threat (xT)
[TO ADD]
Valuing Actions by Estimating Probabilities (VAEP)
- Lotte Bransen and Jan Van Haaren’s ‘Valuing Actions in Football’ series for #FoT - Valuing Actions in Football: Introduction, Valuing Actions in Football 1: From Wyscout Data to Rating Players, Valuing Actions in Football 2: Generating Features, Valuing Actions in Football 3: Training Machine Learning Models, and Valuing Actions in Football 4: Analyzing Models and Results. See the following for code [link].
Goals Added (g+)
Player Similarity Analysis
[TO ADD]
Player Comparison and Similarity Analysis
[TO ADD]
Reinforcement Learning for Football Simulation
- Google Research Football: A Novel Reinforcement Learning Environment (2020) by Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly;
- Google Research Football GitHub repo;
- Google Research Football with Manchester City F.C. Kaggle Competition (ended October 2020);
- Karol Kurach - Google Research Football;
- Karol Kurach (Google Brain) “Google Research Football: Learning to Play Football with Deep RL”;
- Google Research Football by Piotr Stanczyk; and
- Google’s AI Plays Football…For Science! by Two Minute Papers.
:grey_question: Miscellaneous
- Club crests available to download, put together by Ninad Barbadikar;
- Pitch templates, put together by Tony Bambrick (see tweet [link]);
- Association of Sports Analytics Professionals;
- A list of Expected Goals literature collated by Keith Lyons;
- Expected Goal literature;
- FIFA EPTS (Electronic Performance and Tracking Systems);
- opensport (Google Group); and
- Technical Report - 2018 FIFA World Cup.
Credits
Credits to the Soccer Analytics Handbook by Devin Pleuler, the Awesome Soccer Analytics by Matias Mascioto, and Jan Van Haaren’s Soccer Analytics 2020 Review, which were all used to plug gaps in the list once it was published.