AWS Machine Learning Blog
Bundesliga Match Fact Set Piece Threat: Evaluating team performance in set pieces on AWS
The importance of set pieces in football (or soccer in the US) has been on the rise in recent years: now more than one quarter of all goals are scored via set pieces. Free kicks and corners generally create the most promising situations, and some professional teams have even hired specific coaches for those parts of the game.
In this post, we share how the Bundesliga Match Fact Set Piece Threat helps evaluate performance in set pieces. As teams look to capitalize more and more on these dead ball situations, Set Piece Threat will help the viewer understand how well teams are leveraging these situations. In addition, it will explain the reader how AWS services can be used to compute statistics in real-time.
Bundesliga’s Union Berlin is a great example for the relevance of set pieces. The team managed to rise from Bundesliga 2 to qualification for a European competition in just 2 years. They finished third in Bundesliga 2 during the 18/19 season, earning themselves a slot in the relegation playoffs to the Bundesliga. In that season, they scored 28 goals from open play, ranking just ninth in the league. However, they ranked second for goals scored through set pieces (16 goals).
Tellingly, in the first relegation playoff match against VfB Stuttgart, Union secured a 2:2 draw, scoring a header after a corner. And in the return match, Stuttgart was disallowed a free kick goal due to a passive offside, allowing Union to enter the Bundesliga with a 0:0 draw.
The relevance of set pieces for Union’s success doesn’t end there. Union finished their first two Bundesliga seasons with strong eleventh and seventh, ranking third and first in number of set piece goals (scoring 15 goals from set pieces in both seasons). For comparison, FC Bayern München—the league champion—only managed to score 10 goals from set pieces in both seasons. The success that Union Berlin has had with their set pieces allowed them to secure the seventh place in the 20/21 Bundesliga season, which meant qualification for the UEFA Europa Conference League, going from Bundesliga 2 to Europe just 2 years after having earned promotion. Unsurprisingly, in the deciding match, they scored one of their two goals after a corner. At the time of this writing, Union Berlin ranks fourth in the Bundesliga (matchday 20) and first in corner performance, a statistic we explain later.
Union Berlin’s path to Europe clearly demonstrates the influential role of offensive and defensive performance during set pieces. Until now however, it was difficult for fans and broadcasters to properly quantify this performance, unless they wanted to dissect massive tables on analytics websites. Bundesliga and AWS have worked together to illustrate the threat that a team produces and the threat that is produced by set pieces against the team, and came up with the new Bundesliga Match Fact: Set Piece Threat.
How does Set Piece Threat work?
To determine the threat a team poses with their set pieces, we take into account different facets of their set piece performance. It’s important to note that we only consider corners and free kicks as set pieces, and compute the threat for each category independently.
Facet 1: Outcome of a set piece: Goals, shots, or nothing
First, we consider the outcome of a set piece. That is, we observe if it results in a goal. However, the outcome is generally influenced by fine margins, such as a great save by the goal keeper or if a shot brushes the post instead of going in, so we also categorize the quality of a shot that results from the set piece. Shots are categorized into several categories.
Category | Explanation |
Goal | A successful shot that lead to a goal |
Outstanding | Shots that almost led to a goal, such as a shot at the post |
Decent | Other noteworthy goals scenes |
Average | The rest of the chances that would be included in a chances ratio with relevant threat of a goal |
None | No real goal threat, should not be considered a real chance, such as a header that barely touched the ball or a blocked shot |
No shot | No shots taken at all |
The above video shows examples of shot outcome categories in the following order: outstanding, decent, average, none.
Facet 2: Potential of a shot
Second, our algorithm considers the potential of a shot. This incorporates how likely it should have resulted in a goal, taking the actual performance of the shot-taker out of the equation. In other words, we quantify the goal potential of the situation in which the shot was taken. This is captured by the expected goal (xGoals) value of the shot. We remove not only the occurrence of luck or lack thereof, but also the quality of the strike or header.
Facet 3: Quantity of set pieces
Next, we consider the aspect of pure quantity of set pieces that a team gets. Our definition of Set Piece Threat measures the threat on a per-set-piece-basis. Instead of summing up all outcomes and xGoal values of a team over the course of a season, the values are aggregated such that they represent the average threat per set piece. That way, the corner threat, for example, represents the team’s danger for each corner and doesn’t consider a team more dangerous simply because they have more corners than other teams (and therefore potentially more shots or goals).
Facet 4: Development over time
The last aspect to consider is the development of a team’s threat over time. Consider for example a team that scored three goals from corners in the first three matchdays but fails to deliver any considerable threat over the next 15 matchdays. This team should not be considered to pose a significant threat from corners on matchday 19, despite it already having scored three times, which may still be a good return. We account for this (positive or negative) development of a team’s set piece quality by assigning a discount to each set piece, depending on how long ago it occurred. In other words, a free kick that was taken 10 matchdays ago has less influence on the computed threat than one that was taken during the last or even current game.
Score: Per set piece aggregation
All four facets we’ve described are aggregated into two values for each team, one for corners and one for free kicks, which describe the danger that a corresponding set piece by that team would currently pose. The value is defined as the weighted average of the scores of each set piece, where the score of a set piece is defined as (0.7 * shot-outcome + 0.3 * xG-value)
if the set piece resulted in a shot and 0 otherwise. The shot-outcome
is 1 if the team scored and lower for other outcomes, such as a shot that went wide, depending on its quality. The weight for each set piece is determined by how long ago it was taken, as described earlier. Overall, the values are defined between 0–1, where 1 is the perfect score.
Set piece threat
Next, the values for each team are compared to the league average. The exact formula is score(team)/avg_score(league) - 1
. This value is what we call the Set Piece Threat value. A team has a threat value of 0 if it’s exactly as good as the league average. A value of -1 (or -100%) describes a team that poses no threat at all, and a value of +1 (+100%) describes a team that is twice as dangerous as the league average. With those values, we compute a ranking that orders the teams from 1–18 according to their offensive threat of corners and free kicks, respectively.
We use the same data and similar calculations to also compute a defensive threat that measures the defensive performance of a team with regard to how they defend set pieces. Now, instead of computing a score per own set piece, the algorithm computes a score per opponent set piece. Just like for the offensive threat, the score is compared to the league average, but the value is reversed: -score(team)/avg_score(league) + 1
. This way, a threat of +1 (+100%) is achieved if team allows opponents no shots at all, whereas a team with defensive threat of -1 (-100%) is twice as susceptible to opponents’ set pieces as the league average. Again, a team with a threat of 0 is as good as the league average.
Set Piece Threat findings
An important aspect of Set Piece Threat is that we focus on an estimation of threat instead of goals scored and conceded via set pieces. If we take SC Freiburg and Union Berlin at matchday 21 as an example, over the course of this season Freiburg has scored seven goals via corners in comparison to four from Union Berlin. Our threat ranking still ranks both of the teams fairly equal. In fact, we predict a corner by Freiburg (Rank 3) to even be 7% less threatening than a corner by Union Berlin (Rank 1). The main reason for this is that Union Berlin created a similar number of great chances out of their corners, but failed to convert these chances into goals. Freiburg on the other hand was vastly more efficient with their chances. Such a discrepancy between chance quality and actual goals can happen in a high-variance sport like football.
The following graph shows Union Berlin’s set piece offensive corner ranking (blue) and score (red) from matchdays 6–21. At matchday 12, Union scored a goal from a corner and additionally had a great chance from a second corner that didn’t result in a goal but was perceived as a high threat by our algorithm. In addition, Union had a shot on target in five of seven corner kicks on matchday 12. Union immediately jumped in the ranking from twelfth to fifth place as a result of this, and the score value for Union increased as well as the league average. As Union saw more and more high threat chances in the later matchdays from corners, they step by step claimed first place of the corner threat ranking. The score is always relative to the current league average, meaning that Union’s threat at matchday 21 is 50% higher from corners than the average threat coming from all teams in the league.
Implementation and architecture
Bundesliga Match Facts are independently running AWS Fargate containers inside Amazon Elastic Container Service (Amazon ECS). Previous Bundesliga Match Facts consume raw event and positional data to calculate advanced statistics. This changes with the release of Set Piece Threat, which analyzes data produced by an existing Bundesliga Match Fact (xGoals) to calculate its rankings. Therefore, we created an architecture to exchange messages between different Bundesliga Match Facts during live matches in real time.
To guarantee the latest data is reflected in the set piece threat calculations, we use Amazon Managed Streaming for Apache Kafka (Amazon MSK). This message broker service allows different Bundesliga Match Facts to send and receive the newest events and updates in real time. By consuming a match and Bundesliga Match Fact-specific topic from Kafka, we can receive the most up-to-date data from all systems involved while retaining the ability to replay and reprocess messages sent earlier.
The following diagram illustrates the solution architecture:
We introduced Amazon MSK to this project to generally replace all internal message passing for the Bundesliga Match Facts platform. It handles the injection of positional and event data, which can aggregate to over 3.6 million data points per match. With Amazon MSK, we can use the underlying persistent storage of messages, which allows us to replay games from any point in time. However, for Set Piece Threat, the focus lies on the specific use case of passing events produced by Bundesliga Match Facts to other Bundesliga Match Facts that are running in parallel.
To facilitate this, we distinguish between two types of Kafka topics: global and match-specific. First, each Bundesliga Match Fact has an own specific global topic, which handles all messages created by the Bundesliga Match Fact. Additionally, there is an additional match-specific topic for each Bundesliga Match Fact for each match that is handling all messages created by a Bundesliga Match Fact for a specific match. When multiple live matches run in parallel, each message is first produced and sent to this Bundesliga Match Fact-specific global topic.
A dispatcher AWS Lambda function is subscribed to every Bundesliga Match Fact-specific global topic and has two tasks:
- Write the incoming data to a database provisioned through Amazon Relational Database Service (Amazon RDS).
- Redistribute the messages that can be consumed by other Bundesliga Match Facts to a Bundesliga Match Fact-specific topic.
The left side of the architecture diagram shows the different Bundesliga Match Facts running independently from each other for every match and producing messages to the global topic. The new Set Piece Threat Bundesliga Match Fact now can consume the latest xGoal values for each shot for a specific match (right side of the diagram) to immediately compute the threat produced by the set piece that resulted in one or more shots.
Summary
We’re excited about the launch of Set Piece Threat and the patterns commentators and fans will uncover using this brand-new insight. As teams look to capitalize more and more on these dead ball situations, Set Piece Threat will help the viewer understand which team is doing this successfully and which team still has some ground to cover, which adds additional suspense before each of these set piece situations. The new Bundesliga Match Fact is available to Bundesliga’s broadcasters to uncover new perspectives and stories of a match, and team rankings can be viewed at any time in the Bundesliga app.
We’re excited to learn what patterns you will uncover. Share your insights with us: @AWScloud on Twitter, with the hashtag #BundesligaMatchFacts.
About the Authors
Simon Rolfes played 288 Bundesliga games as a central midfielder, scored 41 goals and won 26 caps for Germany. Currently Rolfes serves as Sporting Director at Bayer 04 Leverkusen where he oversees and develops the pro player roster, the scouting department and the club’s youth development. Simon also writes weekly columns on Bundesliga.com about the latest Bundesliga Match Facts powered by AWS
Luuk Figdor is a Senior Sports Technology Specialist in the AWS Professional Services team. He works with players, clubs, leagues, and media companies such as Bundesliga and Formula 1 to help them tell stories with data using machine learning. In his spare time, he likes to learn all about the mind and the intersection between psychology, economics, and AI.
Jan Bauer is a Cloud Application Architect at AWS Professional Services. His interests are serverless computing, machine learning, and everything that involves cloud computing. He works with clients across industries to help them be successful on their cloud journey.
Pascal Kühner is a Cloud Application Developer in the AWS Professional Services Team. He works with customers across industries to help them achieve their business outcomes via application development, DevOps, and infrastructure. He loves ball sports and in his spare time likes to play basketball and football.
Uwe Dick is a Data Scientist at Sportec Solutions AG. He works to enable Bundesliga clubs and media to optimize their performance using advanced stats and data—before, after, and during matches. In his spare time, he settles for less and just tries to last the full 90 minutes for his recreational football team.
Javier Poveda-Panter is a Data Scientist for EMEA sports customers within the AWS Professional Services team. He enables customers in the area of spectator sports to innovate and capitalize on their data, delivering high quality user and fan experiences through machine learning and data science. He follows his passion for a broad range of sports, music and AI in his spare time.