Big data for great Football

Photo by Zhu Hongzhi on Unsplash

During the COVID-19 pandemic, my favorite sport went through a crisis, so I decided to dedicate this article to Football, a.k.a. Soccer for some of us.

I try to follow live games and replays as much as possible, and I am always mesmerized by some of the reports I see on sports TV channels.

So here I am, sharing with you my passion for Football and Technology.

Big needs

As I mentioned earlier, the polished reporting on sports channels is one of the biggest consumers of data.

To add value and hold the viewers’ attention, the media use statistics in commentary and during shows. You can also see impressive graphics on complex tactical questions, displayed and analyzed live.

Another need is at club level. Many big teams exploit their own data to stay as competitive as possible. For instance, they monitor and analyze players’ performance in training and during games using big data techniques (which we will discuss further).

Here is a report on one of my favorite players in the Moroccan national team, Amine Harit, taken from the FIFA World Cup 2018 focus made by Opta Sports, one of the pioneers in sports data.

Finally, I must not forget everything related to simulation games, a.k.a. fantasy football. Companies need to build scoring systems, and gamers have to study many kinds of information to be good managers.

Big needs need big feeds

I know, too many rhymes in the title. Sorry about that, it was too tempting. But seriously, the data sources are tremendously rich and innovative. Let’s check some of the existing systems that feed us with tasty football data.

First comes EPTS or, to put it in a more impressive way, “Electronic Performance and Tracking Systems”.

“EPTS primarily track player (and ball) positions but can also be used in combination with microelectromechanical devices (accelerometers, gyroscopes, etc.) and heart-rate monitors as well as other devices to measure load or physiological parameters.” (FIFA)

More concretely, there are three types of systems that capture physical data:

  • Optical-based camera systems
  • Local positioning systems (LPS)
  • GPS/GNSS systems

I could also add Adidas’ miCoach, which is often credited as a big factor in the German team’s success at the 2014 World Cup in Brazil. It is worn by the athletes and gives data including heart rate, distance, speed, acceleration and power to the coaches and sports scientists for analysis.

GPSports and Catapult Sports are also big players in the business. They make sophisticated wearable tracking devices that were used by Chelsea, Real Madrid and the Brazilian national team, for instance.
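To make the idea of tracking data more concrete, here is a minimal sketch of how distance and speed could be derived from raw position samples. The sample layout (time, x, y) and the function name are assumptions made for illustration; real EPTS feeds have their own formats and apply filtering that is not shown here.

```python
import math

# Hypothetical tracking samples: (time in seconds, x in metres, y in metres).
# This layout is an assumption, not the format of any real EPTS feed.
samples = [
    (0.0, 0.0, 0.0),
    (1.0, 3.0, 4.0),
    (2.0, 6.0, 8.0),
    (3.0, 6.0, 8.0),  # the player stands still during the last second
]


def distance_and_speeds(samples):
    """Return the total distance (m) and the per-interval speeds (m/s)."""
    total = 0.0
    speeds = []
    # Walk through consecutive pairs of samples.
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)  # straight-line distance
        total += step
        speeds.append(step / (t1 - t0))
    return total, speeds


total, speeds = distance_and_speeds(samples)
print(f"distance covered: {total:.1f} m, top speed: {max(speeds):.1f} m/s")
```

Acceleration could be obtained the same way, by taking the difference between consecutive speeds and dividing by the time step.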

Analytics in action

There are many facets of data analytics that data consumers can put to good use.

Advanced metrics

Advanced metrics, for example, are one way to either forecast or measure performance over a football game or a season, for a player or a team. They help coaches, for instance, assess the performance their team or players should have, or should have had.

Goal and assist probabilities (often called expected goals and expected assists) estimate how likely a shot or a pass is to end in a goal, depending on factors such as the shot angle, the pass type, the distance to goal and the length of the pass.

For TV channels, as an example, there are data representations in the form of polygonal areas of defensive coverage on the field. They are computed by combining and analyzing the coordinates of a defender or a defensive bloc.
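As a rough illustration of a goal probability, here is a minimal sketch that maps the distance to goal and the shot angle to a probability with a logistic function. The coefficients are invented for the example and are not fitted on real match data; actual models are trained on large shot datasets and use many more features.

```python
import math

GOAL_WIDTH = 7.32  # metres, the standard width of a goal

# Illustrative coefficients only: they are NOT fitted on real match data.
INTERCEPT = 0.5
W_DISTANCE = -0.15  # per metre between the shot and the goal centre
W_ANGLE = 1.2       # per radian of goal mouth visible from the shot


def goal_angle(x, y):
    """Angle (radians) covered by the goal mouth, seen from the shot location.

    x is the distance to the goal line and y the lateral offset from the
    goal centre, both in metres (x must be positive).
    """
    left = math.atan2(y + GOAL_WIDTH / 2, x)
    right = math.atan2(y - GOAL_WIDTH / 2, x)
    return left - right


def shot_probability(x, y):
    """Hypothetical goal probability for a shot taken from (x, y)."""
    distance = math.hypot(x, y)
    z = INTERCEPT + W_DISTANCE * distance + W_ANGLE * goal_angle(x, y)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function, result in (0, 1)


print(f"penalty spot: {shot_probability(11.0, 0.0):.2f}")
print(f"30 m out:     {shot_probability(30.0, 0.0):.2f}")
```

With these made-up weights, a shot from the penalty spot comes out as more likely to score than one from 30 metres, which is the kind of ordering such a metric is meant to capture.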
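One plausible way to build such a polygon is the convex hull of the defenders’ positions, whose area then gives a rough measure of how much space the bloc covers. The sketch below is an assumption about how this could be done, not a description of what broadcasters actually run; the coordinates are made up.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull (Andrew's monotone chain), in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise (or straight) turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Area of a simple polygon, computed with the shoelace formula."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2


# Hypothetical defender positions in metres (x along the pitch, y across it).
defenders = [(10, 10), (10, 50), (30, 50), (30, 10), (20, 30)]
hull = convex_hull(defenders)
print(hull, polygon_area(hull))
```

The defender standing inside the bloc, at (20, 30), does not appear in the hull: only the players on the edge define the covered area.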

Augmented data

Here are some examples of augmented data, shown in the following visualizations:

infographic combining events with tracking data to give insight on an action
infographic for analyzing player movements and decision-making

Predictive analytics

Beyond metrics and data representation, the predictive side of analytics is quite important as well. Betting companies, simulation games and social media managers use the results of complex machine learning algorithms to forecast tournament rankings, engage their communities, and so on.

Two types of machine learning algorithms are used:

  • regression-type supervised learning is used to estimate winning probabilities, for instance.
  • classification-type supervised learning is used to predict match outcomes.

These supervised models are trained on detailed data from previous seasons. Opta’s predictive model for the World Cup is one example you can check.
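To give a feel for the two kinds of output, here is a minimal sketch that fits a tiny logistic model with plain gradient descent. It returns a win probability (the regression-style output) and then thresholds it into a label (the classification-style output). The training data, the single "rating difference" feature and the function names are all invented for illustration; this is not Opta’s model, and real models rely on far richer data.

```python
import math

# Toy training set, invented for illustration:
# (home rating minus away rating, 1 if the home team won else 0).
matches = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0),
           (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, lr=0.1, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # prediction error on this match
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


w, b = fit_logistic(matches)


def win_probability(diff):
    """Regression-style output: estimated probability of a home win."""
    return sigmoid(w * diff + b)


def predicted_outcome(diff, threshold=0.5):
    """Classification-style output: a label derived from the probability."""
    return "home win" if win_probability(diff) >= threshold else "no home win"


print(f"{win_probability(1.0):.2f}", predicted_outcome(1.0))
print(f"{win_probability(-1.0):.2f}", predicted_outcome(-1.0))
```

A real outcome model would also have to handle draws, which calls for three classes instead of two.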

Fantasy Football

Fantasy football is one of the fields that uses big data the most. It is so popular that the user base, as well as the number and types of requests, is exploding.

Each user has individual views and personal settings and can follow one or several teams at the same time, and there are huge activity peaks in certain time windows.

Operators have to partner with data firms to handle the traffic and all the other big data challenges.

One of the key features in fantasy football is the scoring system. Oulala Games Ltd, which was Leicester City FC’s partner, has built an award-winning platform thanks to a highly efficient algorithm.

Their algorithm computes an overall score from key aspects of an athlete’s performance. The system includes a total of 70 different criteria that depend on a player’s position (keeper, defender, midfielder or striker), resulting in a total of 275 ways to gain or lose points. The aggregated points of these actions, made by the players included in virtual teams, determine the overall winners of the different daily leagues.
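The general principle of a position-dependent scoring system can be sketched as a lookup table of points per action and per position. The actions and values below are invented for the example and are not Oulala’s actual criteria; the real system has 70 criteria where this one has four.

```python
# Hypothetical points table: the actions and values are invented for
# illustration and are NOT Oulala's actual criteria.
POINTS = {
    "goal":        {"keeper": 40, "defender": 30, "midfielder": 25, "striker": 20},
    "assist":      {"keeper": 20, "defender": 15, "midfielder": 12, "striker": 10},
    "clean_sheet": {"keeper": 20, "defender": 15, "midfielder": 5, "striker": 0},
    "yellow_card": {"keeper": -5, "defender": -5, "midfielder": -5, "striker": -5},
}


def player_score(position, actions):
    """Sum the points of a player's actions, weighted by position."""
    return sum(POINTS[action][position] * count
               for action, count in actions.items())


def team_score(team):
    """Aggregate the scores of all the players in a virtual team."""
    return sum(player_score(p["position"], p["actions"]) for p in team)


# A two-player virtual team, again purely illustrative.
team = [
    {"position": "striker", "actions": {"goal": 2, "yellow_card": 1}},
    {"position": "defender", "actions": {"clean_sheet": 1, "assist": 1}},
]
print(team_score(team))
```

In this sketch the same goal is worth more for a defender than for a striker, which is how a table like this can reward actions that are rarer for a given position.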


Time’s up! This is the end of this Football special.

There is still a lot to say, but I hope this overview at least made you want to learn more about Big data in Football.


