An xGoal (short for Expected Goals) is the calculated probability of a player scoring a goal when shooting from any position on the pitch. With the Bundesliga Match Fact xGoals, the DFL can assess that probability for every shot. As suggested by the vertical axis on the right side of the plot, a red data point indicates a higher value of the feature, and a blue data point indicates a lower value. Speed Alert followed as a further statistic. An elegant feature of the SHAP framework is that it’s both model agnostic and highly scalable, working on both simple linear models and deep, complex neural networks with hundreds of layers. We use the open-source SHAP library to plot the SHAP values that are computed inside our processing job. Three new Match Facts will debut during Matchday 21 on February 12th, featuring RB Leipzig vs. FC Augsburg: Most Pressed Player, which highlights how often a player in possession experiences a significant pressure situation throughout a match; Attacking Zones, which shows fans where their favourite team is attacking and which side of the pitch they view as most likely to score from; and Average Positions – Trends, which shows how changes to a team’s tactical formation can impact a match’s outcome. As has always been the case in football, a shot can be so perfect in every aspect that no human, let alone an advanced ML model, could predict its outcome. Fans will see these insights as graphics during broadcasts and in the official Bundesliga app throughout the 2020/2021 season and beyond. 
All qualitative descriptions (such as small, low, and large) are in relation to the average values across the dataset for each respective feature. Crucially, this can be seen as a direct move away from the underlying models being closed boxes for which we can observe the inputs and outputs, but not the internal workings. Unsurprisingly, few headers were scored with an angle less than 25 degrees. Although a full explanation of this method is beyond the scope of this post, at its core SHAP builds out model explanations by posing the following question: “How does a prediction change when a certain feature is removed from our model?” The SHAP values are the answer to this question: they directly compute the contribution of a feature to a prediction in terms of both magnitude and direction. Our analysis so far has focused solely on explainability results for the entire dataset (global explanations), so we now explore some particularly interesting matches and their goal events, looking at what is referred to as a local explanation. With real-time ball and player positions, a bespoke model can determine an array of additional features such as the angle to the goal, the distance of a player to the goal, a player’s speed, the number of defenders in the line of shot, and goalkeeper coverage, to name just a few. 
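To make the positional features mentioned above more concrete, here is a minimal sketch of how an angle to the goal and a distance to the goal could be derived from a shooter's coordinates. The source does not say how the production features are computed, so everything here is an assumption: the angle is taken to be the opening angle between the two goalposts as seen from the shooter, the coordinate system (origin at the centre of the goal line) is hypothetical, and the function names are illustrative only.

```python
import math

GOAL_WIDTH_M = 7.32  # standard distance between the goalposts, in meters


def distance_to_goal(x: float, y: float) -> float:
    """Distance in meters from the shooter at (x, y) to the centre of the goal line.

    Assumed coordinates: x is the distance from the goal line (x > 0),
    y is the lateral offset from the centre of the goal.
    """
    return math.hypot(x, y)


def angle_to_goal(x: float, y: float) -> float:
    """Opening angle in degrees between the two posts as seen from (x, y)."""
    half_width = GOAL_WIDTH_M / 2
    to_left_post = math.atan2(half_width - y, x)
    to_right_post = math.atan2(-half_width - y, x)
    return math.degrees(to_left_post - to_right_post)


# A central shot from close range sees a wide opening angle ...
close_angle = angle_to_goal(6.63, 0.0)
# ... while the angle shrinks with distance and with lateral offset.
far_angle = angle_to_goal(30.0, 0.0)
far_wide_angle = angle_to_goal(30.0, 10.0)
```

Under this assumed definition, the angle falls as the shooter moves away from the goal or out toward the sideline, which matches the qualitative behaviour the article describes.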
Bundesliga Match Facts powered by AWS are being expanded by three elements: Attacking Zones, Most Pressed Player, and Average Positions – Trends. Whether this solution allows fantasy football players an edge in their local league, provides managers with an objective assessment of a player’s current (and predicted) future performance, or serves as a conversation starter for notable football pundits in identifying offensive and defensive trends for particular players and teams, you can already appreciate the tangible value created across all areas of the football ecosystem by applying Clarify to Bundesliga Match Facts. Better interpretability leads to better adoption. The following plot shows that relationship for our most important features: When we take a closer look at two of our (less influential) categorical variables, we see that, all other things being equal, a header invariably decreases the likelihood of a goal, whereas a free kick increases it. More interestingly, however, when comparing the effects that a header or FootShot has on the likelihood of a goal being scored, we see that for any given angle in the range 25–75 degrees, a header reduces it. Outside of work he loves to spend time travelling, trying new cuisines and reading about science and technology. That has all changed with the announcement of Clarify, which offers you the ability to detect bias and implement model explainability in a repeatable and scalable manner. We can see a negative relationship between the DistanceToGoal and the target variable, with the likelihood of a goal increasing as we get closer to the goal. This advanced statistic will also compare the number of pressing situations a player faces while in possession of the ball with the average number of pressure situations faced by their teammates, helping determine which players are under the most pressure. 
These statistics are delivered to viewers via national and international broadcasters, as well as DFL’s platforms, channels, and apps. When we look at a few of the rows of the original training dataset, we get an idea of the types of features we’re dealing with: a mix of binary, categorical, and continuous values across a large dataset of attempted shots at goal. As part of this partnership, they have stepped up efforts involving the live processing of data regarding … This new Match Fact divides the last third of the pitch into four equally sized Attacking Zones. Bundesliga Match Facts: real-time analytics thanks to AWS technology. Attacking Zones: As teams look to exploit defensive weaknesses, approach their opponent’s goal, and ultimately score, Attacking Zones allows fans to see where the teams focus their offense to create those scoring opportunities. In the preceding case, none of the features are capable of counteracting the high AngleToGoal (56.37), low AmountOfDefenders (1.0), and low DistanceToGoal (6.63) for this shot at goal. Moritz Mücke explains what this is about: “Fundamentally, our goal with the Bundesliga Match Facts is that performances by players … Looking back closely at our initial global summary plot, we can see some uncertainty (represented by the dense clustering around the zero SHAP value mark) for the features PressureSum and PressureMax. The base value that we see, which is the average xGoals value across every attempted shot in the Bundesliga in the past three seasons, sits at 0.0871. Amazon Web Services is powering three new advanced statistics to appear on-screen during Bundesliga broadcasts this season. Information on all of these statistics can be found on aws.amazon.com/sports/bundesliga. We can dive even deeper and look at the SHAP feature dependence plots, arguably the simplest global interpretation. 
Unsurprisingly, a strong inverse relationship exists between DistanceToGoal and PressureSum for those match events with a high goal prediction; as the former decreases, the latter rises. At the other extreme, there are certain goals that our XGBoost model can’t predict and the SHAP values can’t explain. One of the most exciting AWS re:Invent 2020 announcements was a new Amazon SageMaker feature, purpose built to help detect bias in machine learning (ML) models and explain model predictions: Amazon SageMaker Clarify. The same is true across all the individual clubs in the Bundesliga competition, with only a handful of clubs deviating from the norm. We simply select a feature and then plot the feature value on the x-axis and the corresponding SHAP value on the y-axis. The following plot is an example of a global explanation, which allows us to understand the model and its feature combinations in aggregate over multiple data points. Running the following code sets the processing job in motion: After we run our Clarify explainability analysis over the entirety of our xGoals training set, we can quickly and easily view the global SHAP values and their distribution for each feature, thereby allowing us to map how positive or negative changes in the value of a given feature affect the final prediction. 
The experimental results in this post demonstrate that we have: In real-world scenarios as complex as a football game, conventional or logic-specific rule-based systems start to break down upon application, failing to offer any sort of match event prediction, let alone an in-depth explanation of how it was made. One particularly interesting use case for Clarify is from the Deutsche Fußball Liga (DFL) on Bundesliga Match Facts powered by AWS, with the goal of uncovering interesting insights into the xGoals model predictions. Average Positions – Trends: This new statistic helps fans, coaches, and commentators identify team strategies by showing how the average positions of players on the pitch change during any desired time frame in the game. These advanced statistics help audiences better understand areas like decision-making on the pitch and the probability of a goal for each shot. It started small. We use the Shapley interaction index from game theory to compute the SHAP interaction values for all features, which yields one matrix per instance with dimensions F × F, where F is the number of features. The data is then provided back to broadcast viewers around the world in real time as statistics. This can be simplified as follows: if your favorite player has the ball at their feet while at a wide angle to the goal, they’re more likely to score than if the ball is soaring through the air! The only factor working to increase his chances of scoring was that he had very little pressure on him at the time, with only two players in the local vicinity capable of closing him down. 
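The F × F interaction matrix mentioned above can be illustrated on a toy problem. The sketch below computes exact Shapley interaction values by enumerating feature subsets, using the pairwise definition in which the interaction effect is split evenly between the two off-diagonal entries and the diagonal holds whatever remains of each feature's Shapley value. It is not the article's implementation: the value function, its numbers, and the feature names are hypothetical, and real libraries use much faster tree-specific algorithms instead of brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of items as a frozenset."""
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def shapley_value(i, features, value):
    """Exact Shapley value of feature i for the given value function."""
    m = len(features)
    others = [f for f in features if f != i]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(m - len(s) - 1) / factorial(m)
        total += weight * (value(s | {i}) - value(s))
    return total


def interaction_matrix(features, value):
    """Return a nested dict phi[i][j] holding the F x F interaction values."""
    m = len(features)
    phi = {i: {j: 0.0 for j in features} for i in features}
    for i, j in combinations(features, 2):
        rest = [f for f in features if f not in (i, j)]
        total = 0.0
        for s in subsets(rest):
            weight = factorial(len(s)) * factorial(m - len(s) - 2) / (2 * factorial(m - 1))
            total += weight * (value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s))
        phi[i][j] = phi[j][i] = total
    for i in features:
        off_diagonal = sum(phi[i][j] for j in features if j != i)
        phi[i][i] = shapley_value(i, features, value) - off_diagonal
    return phi


def toy_value(present):
    """Hypothetical goal-probability uplift for a set of active shot properties."""
    score = 0.0
    if "short_distance" in present:
        score += 0.25
    if "high_pressure" in present:
        score -= 0.05
    if {"short_distance", "high_pressure"} <= present:
        score -= 0.08  # made-up joint effect of pressure close to goal
    if "header" in present:
        score -= 0.03
    return score


FEATURES = ["short_distance", "high_pressure", "header"]
phi = interaction_matrix(FEATURES, toy_value)
```

Each row of the matrix sums to that feature's ordinary Shapley value, so the interaction view refines the per-feature attribution without changing it.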
This additional match information kicks off with "Average Positions" and "Expected Goals" (xGoals): Average Positions tracks players' average location on the pitch in … Keeping in mind that, based on the preceding results, a high angle to goal increases the likelihood of scoring a goal, we can look at the SHAP value of the number of defenders and determine that this is only the case when only one or two defenders are near the attacker. “Amazon SageMaker Clarify brings the power of state-of-the-art explainable AI algorithms to the fingertips of our developers in a matter of minutes and seamlessly integrates with the rest of the Bundesliga Match Facts digital platform, a key part of our long-term strategy of standardizing our ML workflows on Amazon SageMaker,” reports Gabriel Anzer, Data Scientist at Sportec Solutions (STS), a key partner organization of Bundesliga Match Facts powered by AWS. Most Pressed Player shows how often a player in possession of the ball experiences a significant pressure situation by measuring the number of opposing players involved, their distance to the player, as well as the direction of every player’s movement. Theoretical approaches for overcoming this lack of model explainability have undeniably matured in recent years, with one standout framework becoming a crucial tool in the world of explainable AI: SHAP (SHapley Additive Explanations). “Every Bundesliga match generates data that can improve play and help fans better understand team strategies, and we are making tremendous strides in leveraging the vast amount of data in our archives and from our league’s current games to develop and roll out new Match Facts. © 2021, Amazon Web Services, Inc. or its affiliates. Bundesliga Match Facts are generated by gathering and analyzing data from live game video feeds as they’re streamed into AWS. 
As the complexity, depth, and richness of the Bundesliga Match Facts dataset continues to grow, the team is continuously exploring new and exciting ideas for additional match facts and how to tweak our best in-production models in light of insightful explainability results. He works with clients across industries to help them tell stories with data using machine learning. Most Pressed Player: Football teams are using pressure as a technique, both offensively and defensively, to disrupt a player’s rhythm. With the Bundesliga Match Facts, the first statistics arrived in May (goal probability, “xGoals”, and “Average Positions”). Real-time statistics with AWS. Moritz Mücke, Head of Digital Innovation at the DFL. Nick McCarthy is a Data Scientist in the AWS Professional Services team. As part of the collaboration, the live processing of data into innovative statistics was intensified; the result is the “Bundesliga Match Facts powered by AWS”, which are made available to media partners, fans, the public, and the clubs during matches, both in broadcasts and online. Voted the best goal of the 2019–2020 season by 22% of Bundesliga viewers, Emre Can’s jaw-dropping strike was given a near-zero (3%) chance of going in; taking into account his great distance from the goal (approximately 30 meters) and the flat angle (11.55 degrees), we can see why. “In just one year since the launch of Match Facts, AWS and Bundesliga have created statistics that are giving fans around the world a completely new way to experience the game. Bundesliga’s new Match Fact, Attacking Zones, from AWS. Bundesliga Match Facts powered by AWS provides advanced real-time statistics and in-depth insights, generated live from official match data, for Bundesliga matches. 
It builds on an existing Match Fact, Average Positions (which has been available since the 2019-2020 season), by offering the flexibility to analyse any portion of the game, rather than just at the half or the game’s end. Insights are generated by gathering and analysing data from live game video feeds as they’re streamed into AWS. The features AngleToGoal, DistanceToGoal, and DistanceToGoalClosest play the most important roles in predicting our target variable, namely whether a goal is scored or not. From this you can logically infer, for example, that an increase in the angle to goal leads to higher log odds for the prediction (that is, the model leans further toward predicting that a goal is scored). Until now, it was not possible to quantify the pressure put on an individual player. When we apply Clarify, we can both enhance goal prediction models and contextualize football match events on a per-play basis. As you move further away from the goal, the angle reduces. Nearly all goals that are scored close to the goal are hit with an angle greater than 45 degrees. This, in tandem with inevitable and ongoing Clarify updates and improvements, opens up a wealth of exciting avenues going forward for both xGoals and Bundesliga Match Facts. 
First, let’s define the Clarify processing job, along with the SageMaker session, AWS Identity and Access Management (IAM) execution role, and Amazon Simple Storage Service (Amazon S3) bucket with the following code: We can save the CSV training file to Amazon S3, and then specify the training data and results path for the Clarify job as follows: Now that we have instantiated the Clarify processor and defined our explainability training dataset, we can start to specify our problem-specific experimental configuration: The following are important input parameters to note, as seen in the preceding relevant code snippet: We directly pass the important parameters into our clarify.ModelConfig, clarify.SHAPConfig, and clarify.DataConfig instances. Through this, over 500 million Bundesliga fans around the world gain more advanced insights into players, teams, and the league, and are delivered a more personalized experience and the next generation of statistics. Nick’s background is in Astrophysics and Machine Learning and, despite occasionally following the Bundesliga, he has been a Manchester United fan from an early age! Gabriel’s background is in Mathematics and Machine Learning, but he is additionally pursuing his PhD in Sports Analytics at the University of Tübingen and working on his football coaching license. Like most ML tools, it was missing a way of diving deeper and explaining the results of said models, or investigating training datasets for potential bias. We used the area under the ROC curve (AUC) as the objective metric for our training job, and trained the xGoals model on over 40,000 historical shots at goals in the Bundesliga since 2017, using the Amazon SageMaker XGBoost algorithm. With Match Facts, AWS also gives insight during the match into the probability that a shot will lead to a goal, and you get to see how fast players move across the pitch. 
Amazon Web Services (AWS), an Amazon.com company, and the German Bundesliga, Germany’s top national football league, have announced three new Bundesliga Match Facts powered by AWS to give fans deeper insights into action on the pitch. It’s worth noting that for regions that have an increased vertical dispersion of results, we simply have a higher concentration of data points that are overlapping, which gives us a sense of the distribution of the Shapley values per feature. Bundesliga Match Facts powered by AWS provides a more engaging fan experience during soccer matches for Bundesliga fans around the world. The primary implications for Bundesliga Match Facts powered by AWS going forward are twofold. The advanced statistics that we’re creating with AWS give fans an even deeper appreciation for how the game is played,” said Andreas Heyden, Executive Vice President of Digital Innovations for DFL Deutsche Fußball Liga GmbH. At the beginning of last year, DFL and AWS began working together on several areas, including data analysis. With its roots in coalition game theory, SHAP characterizes the feature values of a data instance as players in a coalition, and subsequently tells us how to fairly distribute the payout (the prediction) among the various features. AWS powers more Bundesliga Match Facts. We need to have consistency between the two feature sets for model training and Clarify processing. We can start to see the value in using SHAP values to analyze seasons’ worth of data, because we have quickly identified a universal trend in the data. In 2006, Amazon launched two fairly simple services: computers you could rent by … 
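The coalition-game view of SHAP described above can be made concrete with a small worked example. The following sketch computes exact Shapley values by averaging each feature's marginal contribution over every possible ordering of the features. It is a toy illustration only: the value function, its numbers, and the feature names are hypothetical, and this brute-force approach is feasible only for a handful of features, unlike the approximations used in practice.

```python
from itertools import permutations


def shapley_values(features, value):
    """Average each feature's marginal contribution over all join orders."""
    contributions = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = frozenset()
        for f in order:
            with_f = present | {f}
            contributions[f] += value(with_f) - value(present)
            present = with_f
    return {f: total / len(orderings) for f, total in contributions.items()}


def toy_payout(coalition):
    """Hypothetical uplift in goal probability for a set of shot properties."""
    score = 0.0
    if "close_range" in coalition:
        score += 0.20
    if "wide_angle" in coalition:
        score += 0.10
    if {"close_range", "wide_angle"} <= coalition:
        score += 0.06  # made-up bonus when both properties are present
    if "header" in coalition:
        score -= 0.04
    return score


PLAYERS = ["close_range", "wide_angle", "header"]
payout_shares = shapley_values(PLAYERS, toy_payout)
```

The shares add up to the full payout of the complete coalition, which is the "fair distribution" property: the joint bonus of 0.06 is split evenly between the two features that create it.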
These three new Match Facts join Speed Alert, Average Positions, and xGoals to bring the total number of insights available for Bundesliga fans to six. The XGBoost model starts its prediction at this baseline, with positive and negative forces that either increase or decrease the prediction. Media partners and commentators can now choose which time spans to analyse and then compare those sections of the match, making it easier to identify tactical trends such as whether a team visibly reacts or begins a period of increased pressure after a significant event such as a goal, red card, or substitution. We can use interaction plots to deep dive into these values and try to unpack and identify what is causing this. It gives viewers information on the difficulty of a shot, the performance of their favorite players, and can illustrate the offensive and defensive trends of their team. But this was clearly not enough to stop Can. For example, suppose we want to know how the variables DistanceToGoal and PressureSum interact, and the effect they have on the SHAP value for DistanceToGoal. 
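The idea of a prediction that starts at a baseline and is pushed up or down by per-feature forces can be sketched numerically. The example below assumes the SHAP values are expressed in log odds and, as a simplification, converts the 0.0871 baseline quoted in the article from a probability to log odds before adding the forces; the article does not confirm either detail. The individual force values are hypothetical and do not come from the actual model.

```python
import math

BASE_PROBABILITY = 0.0871  # average xGoals value quoted in the article


def logit(p: float) -> float:
    """Convert a probability to log odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(forces: dict) -> float:
    """Add per-feature SHAP forces (assumed log odds) to the baseline."""
    return sigmoid(logit(BASE_PROBABILITY) + sum(forces.values()))


# Hypothetical forces for a close-range shot with few defenders nearby.
hypothetical_forces = {
    "AngleToGoal": 1.2,
    "DistanceToGoal": 0.9,
    "AmountOfDefenders": 0.4,
    "PressureSum": -0.3,
}

baseline_only = predicted_probability({})
with_forces = predicted_probability(hypothetical_forces)
```

With no forces the function returns the baseline itself, and a net positive sum of forces moves the probability above it, mirroring how a force plot is read.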
The positive and negative impact on the goal prediction value is shown on the x-axis, derived from our SHAP values. It’s reassuring to have our feature interaction plots confirm our preconceived ideas of the game, as well as quantify the various powers at play. When we look back at one of the most interesting games of the 2019–2020 season, where Bayer 04 Leverkusen beat Borussia Dortmund in a 4–3 thriller on February 8, 2020, we can look at the varying effects each feature has on the xGoals values (the model output value we see on the horizontal axis). Although none of our match events were penalties (all having a feature value of 1), the penalty feature must still be included in the Clarify processing job because it was also included in the original XGBoost model training. The pace of innovation we’ve achieved in rolling out these advanced stats will excite even the most rabid fans, help teams shape their strategies, and introduce a whole new generation to the intricacies of football.” With Clarify, the DFL can now interactively explain which key underlying features led the ML model to predict a certain xGoals value. All the arguments in this processor are generic and are related only to your current production environment and the AWS resources at your disposal. This is just a selection of all the match insights that Luuk has developed for the Bundesliga. 
“Together with AWS, we’re delivering a new perspective on what happens on the field and offering a new and engaging way for fans to follow their favourite teams.” “Expanding our work with Bundesliga means more fans will gain an appreciation for the incredible talent on the field and the decisions made by teams, at the same time as the league differentiates itself through the use of advanced analytics to improve the quality of play,” added Klaus Buerg, General Manager for AWS Germany, Austria, and Switzerland, Amazon Web Services EMEA SARL. (German-language video) MPEG-4 Video February 11, 2021. When we compare this plot across the three seasons (2017–2018, 2018–2019, and 2019–2020), we see little to no change in either the feature importance or the associated SHAP value distribution. With this interaction index, we can then color the SHAP feature dependence plot with the strongest interaction. This makes sense; how often do you see someone score a goal from the sideline when 40 meters out? The dashed lines are those match events in which a goal occurred. In today’s world where predictions are made by ML algorithms at scale, it’s increasingly important for large tech organizations to be able to explain to their customers why they made a certain decision based on an ML model’s prediction. The features are ordered according to their importance, from top to bottom. Gabriel Anzer is the lead data scientist at Sportec Solutions AG, a subsidiary of the DFL. Luuk Figdor is a data scientist in the AWS Professional Services team. Deutsche Fußball Liga (DFL) has begun using three new Match Facts powered by Amazon Web Services (AWS), adding to four released last year. 
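The ordering of features by importance mentioned above is commonly obtained by ranking features on their mean absolute SHAP value across all rows. The article does not state which aggregation was used, so treat this as an assumption; the SHAP values below are hypothetical placeholders, and only the feature names are taken from the text.

```python
def rank_features(shap_rows):
    """Return (feature, mean |SHAP|) pairs, most important first.

    shap_rows is a list of dicts mapping feature name to its SHAP value
    for one shot; every row is assumed to contain the same features.
    """
    totals = {}
    for row in shap_rows:
        for name, shap_value in row.items():
            totals[name] = totals.get(name, 0.0) + abs(shap_value)
    count = len(shap_rows)
    means = ((name, total / count) for name, total in totals.items())
    return sorted(means, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-shot SHAP values for three shots.
hypothetical_shap_rows = [
    {"AngleToGoal": 0.9, "DistanceToGoal": -0.6, "PressureSum": 0.05},
    {"AngleToGoal": -0.7, "DistanceToGoal": 0.8, "PressureSum": -0.02},
    {"AngleToGoal": 1.1, "DistanceToGoal": -0.4, "PressureSum": 0.01},
]

ranking = rank_features(hypothetical_shap_rows)
ordered_names = [name for name, _ in ranking]
```

Taking absolute values before averaging matters: a feature that pushes predictions strongly in both directions would otherwise cancel itself out and look unimportant.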
We see how, starting from the bottom and working our way up, the features start to have an ever-increasing impact on the final prediction, with some extreme cases showcasing how AngleToGoal, DistanceToGoalClosest, and DistanceToGoal really have the final say in our XGBoost model’s probability prediction. This not only opens up avenues of further analysis, so as to iterate and further improve on model configurations, but also provides previously unseen levels of model prediction analysis to customers.