Passing networks with expected threat (xT) layer. Walking through popular templates. Explaining the details.
In this article, we’ll explore the passing network map, also known as a pass map, which is a visual representation of successful passes between teammates on a football field. Commonly used to analyze a single match, it can also track passing trends over a longer period. We’ll learn how to interpret these infographics accurately, understand their complexities, and even create our own pass map template using Opta data for future post-match analyses. Let’s dive in.
Below is a basic pass map design.
This infographic’s key characteristics include:
1. Layout of the field: In this example, we’ve used a horizontal orientation with the attacking team positioned on the left. Another common layout is vertical, where the attacking team appears at the bottom.
2. Nodes represent players: Every player is shown as a point, or node, on the field. The node’s location reflects the player’s “average” position from where they made passes during a certain period of the match. It’s important to note a few subtleties in this representation.
a) aggregated statistic
The coordinates shown are a summary statistic that might not always give a true representation of the situation. For example, consider a player who only made passes from his own penalty area and the opponent’s penalty area. In this case, his average position would be shown as the midfield, even though he might never have actually been in that area or passed the ball from there. While this is a hypothetical scenario, it highlights a significant limitation of this visualization method.
b) average and outliers
Another point to consider is how outliers can affect average estimates. Let’s say a player mostly made passes in his own third of the field but briefly entered the final third a few times, making just a couple of passes. This could shift his average position to the midfield, inaccurately representing his usual position in the match. To counteract this, many code examples use the median coordinate instead of the average, as it’s less influenced by such outliers.
c) aggregation period
To create a pass map, we typically use data from the 11-player lineup that spent the most time on the field during a match. This is usually from the start of the game until the first substitution, which often happens in the second half (as in the example of Manchester City in their Premier League match against Brighton shown below). The time before a team receives its first red card might also be considered.
In cases where a substitution or red card occurs early in the match, we look at the next longest interval without these events (like the pass map for Brighton in the example below).
However, you can also create pass maps for different intervals during the match if desired. This can be useful because a team’s playing pattern might change even without substitutions or red cards.
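The node-placement choices described above (which interval to aggregate over, and mean versus median) can be sketched in a few lines. This is only an illustration under assumptions: the function names, the dictionary layout of a pass, and the sample coordinates on a 0–100 pitch are all made up for the example and are not taken from any provider’s schema.

```python
from statistics import mean, median


def longest_stable_window(disruption_minutes, match_end=90):
    """Return (start, end) of the longest spell without a substitution or red card."""
    inside = sorted(m for m in disruption_minutes if 0 < m < match_end)
    cuts = [0] + inside + [match_end]
    # Compare consecutive cut points and keep the widest gap.
    return max(zip(cuts, cuts[1:]), key=lambda w: w[1] - w[0])


def node_positions(passes, window, stat=median):
    """Aggregate each player's pass origins inside the window with `stat`."""
    start, end = window
    xs, ys = {}, {}
    for p in passes:
        if start <= p["minute"] < end:
            xs.setdefault(p["player"], []).append(p["x"])
            ys.setdefault(p["player"], []).append(p["y"])
    return {name: (stat(xs[name]), stat(ys[name])) for name in xs}


# Made-up passes for one centre-back on a 0-100 pitch (illustrative only).
passes = [
    {"minute": 3, "player": "CB", "x": 20, "y": 40},
    {"minute": 10, "player": "CB", "x": 22, "y": 42},
    {"minute": 25, "player": "CB", "x": 24, "y": 38},
    {"minute": 40, "player": "CB", "x": 21, "y": 41},
    {"minute": 55, "player": "CB", "x": 90, "y": 50},  # outlier far up the pitch
    {"minute": 70, "player": "CB", "x": 23, "y": 40},  # after the first substitution
]

# Hypothetical substitutions at 62', 75' and 80': the opening spell is longest.
window = longest_stable_window([62, 75, 80])
median_pos = node_positions(passes, window)["CB"]
mean_pos = node_positions(passes, window, stat=mean)["CB"]

# With an early change at 12', the next spell (12' to 70') is used instead.
early_sub_window = longest_stable_window([12, 70])
```

In this toy sample the single pass from deep in the opposition half drags the mean x-coordinate well forward, while the median stays with the cluster of passes in the player’s own third.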
3. Edges represent passes: In these maps, nodes (representing players) are connected by edges, shown as lines or arrows. Each edge signifies successful passes between two players. The line’s thickness usually indicates the number of passes shared between the players. In simpler versions, the direction of the passes isn’t always considered, making it unclear who initiated more passes in a pair without extra stats.
Adding arrows makes the infographic more informative by showing the direction of passes, but it still doesn’t show how evenly the passes were split within each pair of nodes. For instance, it’s possible to see whether passes between a pair went one way or both ways, but it’s hard to visually gauge whether they were balanced or dominated by one of the two players.
Also, you might notice missing edges between some nodes, like in the example above. This absence could mean either that no passes occurred between those two players or that a filter was applied. The threshold is often set somewhere between 5 and 15 passes. If the number of passes between two players falls below it, the edge is not shown, to avoid cluttering the map.
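As a rough sketch of how edges might be built, the snippet below counts passes per ordered pair, optionally merges the two directions, and applies a minimum-pass filter. The player labels, counts and function names are invented for illustration.

```python
from collections import Counter


def directed_edges(passes):
    """Count successful passes for every ordered (passer, receiver) pair."""
    return Counter((p["passer"], p["receiver"]) for p in passes)


def undirected_edges(directed):
    """Merge A->B and B->A into one count per pair (direction is lost)."""
    merged = Counter()
    for (a, b), n in directed.items():
        merged[tuple(sorted((a, b)))] += n
    return merged


def filter_edges(edges, min_passes=5):
    """Drop pairs below the threshold so the map stays readable."""
    return {pair: n for pair, n in edges.items() if n >= min_passes}


# Made-up successful passes between three positions (illustrative only).
passes = (
    [{"passer": "CB", "receiver": "CM"}] * 9
    + [{"passer": "CM", "receiver": "CB"}] * 3
    + [{"passer": "CB", "receiver": "LB"}] * 6
    + [{"passer": "LB", "receiver": "CB"}] * 7
    + [{"passer": "CM", "receiver": "LB"}] * 2
)

directed = directed_edges(passes)
shown_directed = filter_edges(directed)                       # arrows per direction
shown_undirected = filter_edges(undirected_edges(directed))   # one line per pair
```

Note that filtering per direction and filtering per pair are not equivalent: here the CM to CB direction (3 passes) disappears from the directed map even though the merged CB–CM pair (12 passes) clears the threshold comfortably.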
The image below presents an alternative method for visualizing edges. In this approach, we use two lines or arrows for each player in a pair. This technique provides more detailed information compared to the earlier-discussed methods.
The map featured below highlights the Van Dijk-Keita pair, where the majority of passes originated from the Dutch player. In the Van Dijk-Robertson pair, the Scottish player has a marginal edge. However, this method sacrifices some readability for detail.
Importantly, this latest infographic introduces a new aspect compared to the previous ones. Here, the size of each node corresponds to the total number of passes made by that player, which makes it easy to compare players’ passing volume during the game.
First outputs:
In the basic configuration of the pass map, only the sizes of the nodes and the thickness of the edges are adjusted. This method of visualization certainly contains useful information, but it leaves room for improvement.
Adding more information to the map
When analyzing player passes in football, it’s crucial to look beyond just the total number and accuracy percentage. The effectiveness of these passes is a key aspect. Metrics like xGChain and xGBuildup were among the first to assess a player’s contribution to creating scoring opportunities against the opposition.
xGChain is a metric that was created to be able to evaluate not only the effectiveness of the final shot (xG) and assists (xA) but also the contribution of other players involved in the play. It considers all possession chains in which a player participates (i.e., takes action with the ball) leading up to a shot on goal. In each such chain every player involved in moving the ball is assigned an xG value equal to that of the final shot in the possession.
The xGChain for each player is calculated by summing up the xG values from every possession chain they participated in. In other words, the xG of each attack’s final shot is credited in full to every player involved in the play, rather than being split among them. The cumulative xGChain score for each player represents their total contribution across all attacks.
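The calculation described above can be sketched as follows. The possession structure (a list of involved players plus the xG of the final shot, or `None` when the possession ended without a shot) is an assumption made for this example, as are the names and values.

```python
from collections import defaultdict


def xg_chain(possessions):
    """Sum, per player, the final-shot xG of every possession they took part in."""
    totals = defaultdict(float)
    for chain in possessions:
        shot_xg = chain["shot_xg"]
        if shot_xg is None:
            continue  # possession ended without a shot: no credit for anyone
        # set() credits a player once per chain, however many touches they had.
        for player in set(chain["players"]):
            totals[player] += shot_xg
    return dict(totals)


# Made-up possessions (illustrative only).
possessions = [
    {"players": ["CB", "CM", "CM", "ST"], "shot_xg": 0.25},
    {"players": ["CB", "LB"], "shot_xg": None},
    {"players": ["CM", "ST"], "shot_xg": 0.5},
]

chain_totals = xg_chain(possessions)
```

In this toy data the centre-back receives the full 0.25 from the first chain, exactly as much as the striker who took the shot, which is the property criticised later in the article.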
Notably, xGChain was the first metric to be added into pass maps using a color scale. This visual representation colors the nodes and edges according to the xGChain values achieved during the match, providing a clear and insightful depiction of each player’s impact.
You can see an example of possession chains with xGChain estimations in the linked article.
Example of a map with xGChain metric
Consider the passing map template provided by StatsBomb (featured in the article).
The infographic above displays pass maps for two Manchester City matches in the 2020–2021 season. The size of the nodes indicates key players in the respective games, based on their number of successful passes. The thickness of the lines reveals which player pairs made the most passes and in which direction. Additionally, the infographic allows us to assess:
- The overall contribution of each player’s passes to the potential threat posed at the opponent’s goal (indicated by the node color).
- The player pairs through which the most dangerous possession chains were constructed (indicated by the line color).
Another key aspect and best practice is the normalization of the color scale. StatsBomb does this by assigning the coldest color (dark blue) to represent the 5th percentile of metric values over a certain historical period (likely, similar to radars, using the metric distribution from the last five years in the top five European leagues). The warmest color (bright red) represents the 95th percentile of metric values (the highest scores).
This normalization is applied separately for nodes and edges. For nodes, the distribution of the xGChain metric per 90 minutes among players is considered. For edges, the metric values between each pair of players, calculated per 90 minutes, are used (as mentioned in the original StatsBomb article).
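A percentile-based color scale of this kind can be approximated as below. This is a sketch of the general technique, not StatsBomb’s implementation: the helper names are hypothetical, the reference distributions are synthetic, and the linear-interpolation percentile is one common convention among several.

```python
def percentile(values, q):
    """Percentile with linear interpolation between neighbours, q in [0, 100]."""
    data = sorted(values)
    if not data:
        raise ValueError("percentile() needs at least one value")
    pos = (len(data) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def make_normalizer(reference, low_q=5, high_q=95):
    """Build a function mapping a metric value to [0, 1] on a fixed historical scale."""
    lo, hi = percentile(reference, low_q), percentile(reference, high_q)
    if hi <= lo:
        raise ValueError("reference distribution is too narrow")

    def normalize(value):
        # Values beyond the 5th/95th percentile saturate at the ends of the scale.
        return min(1.0, max(0.0, (value - lo) / (hi - lo)))

    return normalize


# Synthetic reference samples (illustrative only); in practice these would be
# historical per-90 values for players and for player pairs respectively.
player_reference = [i / 100 for i in range(101)]
pair_reference = [i / 1000 for i in range(101)]

normalize_node = make_normalizer(player_reference)   # one scale for nodes
normalize_edge = make_normalizer(pair_reference)     # a separate scale for edges

node_colour_position = normalize_node(0.5)   # roughly the middle of the scale
edge_colour_position = normalize_edge(0.5)   # far above the edge scale: saturates
```

The resulting number between 0 and 1 can then be passed to whatever colormap the plotting library offers. Keeping two normalizers reflects the point above that player totals and pair totals live on different scales.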
In the left map, illustrating the match between West Ham and Manchester City, you can observe the absence of red-colored edges and nodes. The most prominent color (yellow fading to orange) is associated with Eric Garcia, indicating that his passes achieved the highest xGChain value in the match. This value, as suggested by the color, corresponds to an “average” level for this metric.
Analyzing the color scheme of the passes allows us to identify effective combinations such as Dias-Garcia, Garcia-Gundogan, and Cancelo-Gundogan, leading to the conclusion that the most efficient xG attacks often started with delivering the ball to Ilkay.
Admittedly, the xGChain metric is crude. Within the same possession, a square pass between central defenders in their own half receives exactly the same value as a pre-assist played close to the opponent’s goal, which is clearly not a fair assessment.
Nevertheless, even incorporating such a metric as an additional layer to the pass map provides a more comprehensive picture.
Today’s more informative pass maps utilize additional layers of metrics based on Markov chains like xT, or metrics derived from machine learning such as OBV, VAEP, or other possession value models.
Example of a map with xT metric
You can read about what the xT metric entails in my previous article. In the pass map below, a brighter (lighter) node color corresponds to a higher aggregate value according to the xT metric. The map shows that among all the players, Ben White created the most danger through his passes.
However, the edge colors here do not encode any metric. Both the color and the thickness of the edges depend on the number of passes between the respective players. The more passes there are, the thicker the arrow and the brighter the green color.
Example of a map with the OBV metric (StatsBomb model)
Below is a visualization of StatsBomb’s updated passing map. This new version replaces the xGChain metric with their flagship OBV metric, which is calculated using a machine learning model.
The fundamental approach to reading the map remains unchanged. However, there’s a notable addition: threshold values for the metric at the nodes (5th and 95th percentiles) have been incorporated to facilitate easier interpretation of the map. One point of curiosity remains around how the color scale for passes between players is standardized.
It’s also important to note that when working with this map, it’s quite challenging to identify pairs of players with a high number of passes. This difficulty arises from the way the thickness of the edges is normalized. Across a relatively large sample of matches (5 seasons, top 5 leagues), the 99th percentile of successful passes between a pair of players is only 16. Judging by this, the upper limit on this map appears to be set excessively high. (For my own map, I used the 99th percentile as the benchmark.)
Example of a map with the PV metric (Stats Perform model)
Below is a passing map based on a template from The Athletic. In this map, the PV (Possession Value) metric, developed by Stats Perform in 2019, is used as an additional color layer.
This metric is a direct competitor to OBV and VAEP and is also based on a machine learning model. Essentially, it evaluates how each action on the football field changes the probability of scoring and conceding a goal in the next 10 seconds of play. The difference in these probabilities is the final value of the metric (for instance, if a pass increases the probability of scoring by 5% and the probability of conceding by 1%, then PV = +0.04).
The color scale in this infographic is normalized for both nodes and edges. Instead of arrows, a gradient of transparency is used — the brighter part of the line is near the passing player and the darker part near the receiving player.
The map in question analyzes Barcelona’s passes in the latest El Clásico. Lopez and Cancelo stand out for the danger they create according to the PV metric. Additionally, in terms of passing effectiveness, the pairings of Lopez-Gavi and Lopez-Cancelo are notable (Lopez’s passes in both directions are colored green).
Another example of using a pass map is to analyze team formations and the efficiency of players’ passes in different positions over an extended period. The image below compares pass maps for all Premier League clubs in the 2022–2023 season, focusing on the most frequently used formation throughout the season. The result is quite an “averaged” picture, but it can still be useful for high-level analysis.
I haven’t delved into the details of constructing this visualization and haven’t verified the results, but the map for Chelsea looks intriguing. According to the author, the most common formation and lineup were observed for only 251 minutes of gameplay during the season (considering minutes until the first substitution).
Considering that in the 2022–23 season each team played an average of 3578 clean playing minutes (94 minutes per match), and the first substitution was typically made around the 62nd minute, about 66% of those 3578 minutes, or approximately 2362 minutes, would be available for a pass map based on the starting formation. This means that Chelsea’s most stable lineup and formation accounted for only about 10% of that time (a rough estimate). The team underwent significant rotation and structural changes throughout the season, which is unsurprising given that three different coaches led Chelsea that season.
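For readers who want to retrace the estimate, here is the same back-of-the-envelope arithmetic in code. All inputs are the figures quoted above; the variable names are just labels for this sketch, and the results differ slightly from the rounded numbers in the text because no intermediate rounding is applied.

```python
SEASON_MINUTES = 3578          # average playing minutes per team, 2022-23
MINUTES_PER_MATCH = 94         # average match length quoted above
FIRST_SUB_MINUTE = 62          # typical minute of the first substitution
CHELSEA_LINEUP_MINUTES = 251   # minutes for Chelsea's most common lineup

# Share of a match played before the first substitution (about two thirds).
share_before_first_sub = FIRST_SUB_MINUTE / MINUTES_PER_MATCH

# Season minutes available for a starting-formation pass map (roughly 2360).
available_minutes = SEASON_MINUTES * share_before_first_sub

# Chelsea's most stable lineup as a share of that time (a little over 10%).
chelsea_share = CHELSEA_LINEUP_MINUTES / available_minutes
```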
On the other hand, Fulham, Newcastle, and Arsenal stand out as bastions of stability.
Building your own passing map
I decided to use a template from The Athletic as a basis to create a similar passing map for the Barcelona vs. Real Madrid match. For data sourcing, I used Opta data.
Below is the resulting map for Barcelona’s passes, compared with the original template.
A few comments:
Instead of a color gradient, I decided to use arrows. In my opinion, this method is more representative.
As the minimum value for the number of passes between players, I used a threshold of 5. This lower limit is necessary to avoid cluttering the graph, as mentioned earlier. For the maximum threshold, I used 16 passes. This value corresponds to the 99th percentile for the distribution of all successful passes over the last five years among the top 5 European leagues.
To normalize the upper limit of the number of successful passes made by a single player, I also used the 99th percentile, which corresponds to a value of 88 passes.
Additionally, I used a wider range of node sizes. On the original map from The Athletic, one might get the impression that Ferran Torres and Gavi made approximately the same number of passes, but in fact, Gavi had twice as many.
For the additional color layer, I used the open-play xT metric. Individual aggregate scores for each player are normalized by the 5th and 95th percentiles (the metric values themselves were obtained based on the transition matrix from the article).
Pairwise assessments (the color of the arrows), reflecting the total OP xT for passes from player A to player B, are capped by the 95th percentile, equal to 0.09.
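The thresholds listed above translate into a simple clip-and-rescale step. The sketch below uses the article’s values (5, 16, 88 and 0.09) as constants; everything else is an assumption for illustration, including the output ranges for line width and node size and the choice of 0 as the lower bound for pair xT and for a player’s pass count.

```python
# Thresholds quoted in the article.
MIN_PAIR_PASSES = 5      # edges below this are hidden
MAX_PAIR_PASSES = 16     # 99th percentile of passes between a pair
MAX_PLAYER_PASSES = 88   # 99th percentile of passes by one player
MAX_PAIR_XT = 0.09       # 95th percentile of OP xT for a directed pair


def scale(value, lower, upper, out_min, out_max):
    """Clip value to [lower, upper], then map it linearly to [out_min, out_max]."""
    clipped = min(upper, max(lower, value))
    share = (clipped - lower) / (upper - lower)
    return out_min + share * (out_max - out_min)


def edge_style(n_passes, pair_xt):
    """Return (line width, colour position in [0, 1]), or None if filtered out."""
    if n_passes < MIN_PAIR_PASSES:
        return None
    # Width range 1-6 is an arbitrary choice for this sketch.
    width = scale(n_passes, MIN_PAIR_PASSES, MAX_PAIR_PASSES, 1.0, 6.0)
    # Lower bound of 0 for pair xT is an assumption, not stated in the article.
    colour = scale(pair_xt, 0.0, MAX_PAIR_XT, 0.0, 1.0)
    return width, colour


def node_size(n_passes):
    """Marker size for a player; the 100-1000 range is an arbitrary choice."""
    return scale(n_passes, 0, MAX_PLAYER_PASSES, 100.0, 1000.0)


hidden = edge_style(3, 0.05)        # below the 5-pass threshold
thickest = edge_style(20, 0.2)      # both values saturate at their caps
half_size = node_size(44)           # half of the 88-pass cap
```

Capping at a high percentile rather than at the match maximum keeps maps from different matches visually comparable, at the cost of a few extreme values looking identical.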
Another important addition: my map presents the scheme obtained from median values of coordinates corresponding to passes from the 1st to the 60th minute (until the first substitution). However, when calculating the total number of passes and the total created danger from open play (OP xT), the entire match is considered.
Visualization authors usually don’t explain this point. If several maps are considered for one match, then it’s obvious that actual statistics should be used for each time interval. But if only one map is used, intended to describe the main pattern of play during the match, the approach I took is a reasonable option.
Differences in metric coloration are expected. xT rewards players who make many successful passes from “non-dangerous zones” into areas from which a goal is likely within a few subsequent actions.
On the other hand, PV evaluates players regardless of their position on the football field, relying on the game context, which for this model is derived from the last 10 actions performed before the current play episode.
Below is a similar map, but for both teams.
For comparison, I also included a similar map from the Twitter channel markstats. Overall, the maps are quite similar in terms of player positioning, but there are some differences. For instance, if you look at Jude Bellingham, you can see a higher position on the lower graph. It’s likely that the author uses average coordinate values.
A pass map is another useful tool in football analytics. This visualization can be used in the context of pre-match analysis of opponents.
By analyzing the pass maps from the last few matches, teams can identify players who require close marking or increased pressure during pressing. It also helps in pinpointing player combinations that create the most danger, with the aim of disrupting these connections.
Additionally, such a map can be valuable for assessing changes in a team’s formation depending on the availability of certain key players.
While this visualization certainly cannot replace watching matches, it can assist video analysts in narrowing down their search for matches or specific game segments in preparation for an opponent.
And of course, such a map can also be interesting to general football fans looking to gain a deeper understanding of how a particular team’s play is structured.
P.S.
Detailed example with code:
Jupyter notebook: https://github.com/hadjdeh/football-data-analysis/tree/main/Pass_map