Extracting Freeze Frames

In my first post, I provided a tutorial on importing StatsBomb data into R using the StatsBombR package. We took that imported data and created a few summary tables for the FA Women’s Super League. Today we are going to take this a step further and extract the shot freeze frame data that StatsBomb provides.

You will have noticed that last time we removed the shot.freeze_frame column from our dataset so that we could write the CSV summary tables. This was an important step because the shot freeze frame is provided as a nested dataframe: each cell of that column holds a dataframe of its own.
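To make the idea of a nested dataframe concrete, here is a minimal sketch using the tibble package. The two shots, the player names and the values are invented for illustration and are not taken from the StatsBomb data; only the column names mirror the ones used in this post.

```r
library(tibble)

# Hypothetical two-shot example; all values are made up for illustration.
shots <- tibble(
  minute = c(12, 57),
  shot.outcome.name = c("Saved", "Goal"),
  shot.freeze_frame = list(
    tibble(player.name = c("Player A", "Player B"), teammate = c(TRUE, FALSE)),
    tibble(player.name = "Player C", teammate = FALSE)
  )
)

# The column itself is a list, with one element per shot
is.list(shots$shot.freeze_frame)

# Each element is a dataframe describing the players around that shot
shots$shot.freeze_frame[[1]]
nrow(shots$shot.freeze_frame[[1]])
```

Because every cell contains a whole dataframe rather than a single value, a column like this cannot be written straight to a flat CSV file, which is why it was dropped last time.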

Here is an example of the data nested within the shot.freeze_frame column for a single shot.

| location | teammate | player.id | player.name | position.id | position.name |
|---|---|---|---|---|---|
| c(99, 55) | TRUE | 15559 | Simone Magill | 22 | Right Center Forward |
| c(93, 55) | FALSE | 19501 | Hayley Ladd | 11 | Left Defensive Midfield |
| c(80, 55) | TRUE | 15573 | Abbey-Leigh Stringer | 9 | Right Defensive Midfield |
| c(101, 55) | FALSE | 15569 | Kerys Harrop | 5 | Left Center Back |
| c(101, 60) | FALSE | 15567 | Paige Williams | 6 | Left Back |
| c(99, 47) | TRUE | 15579 | Inessa Kaagman | 19 | Center Attacking Midfield |
| c(103, 53) | FALSE | 19503 | Aoife Mannion | 3 | Right Center Back |
| c(102, 44) | FALSE | 19502 | Meaghan Sargeant | 2 | Right Back |
| c(120, 41) | FALSE | 22032 | Hannah Hampton | 1 | Goalkeeper |
| c(101, 59) | TRUE | 15556 | Chantelle Boye-Hlorkah | 24 | Left Center Forward |
| c(82, 46) | TRUE | 15577 | Angharad James | 11 | Left Defensive Midfield |
| c(90, 37) | FALSE | 10193 | Chloe Arthur | 9 | Right Defensive Midfield |

Table 1. A summary of data extracted from the freeze frame column

As we can see, the freeze frame provides valuable information on player locations at the time of the shot. From this we could work out how many players are in front of or behind the ball, whether the shooter has a clear sight of goal, and so on. For a single shot this is a simple extraction using tidyr::unnest. However, we can run into problems if there is a null value within the column, which happens when no freeze frame positional data is provided for a shot. We need to filter those rows out before we can unnest the data. We would do that as follows:

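### I have read in all data previously using StatsBombFreeEvents
library(dplyr)   # filter, select, %>%
library(purrr)   # map_lgl
library(tidyr)   # unnest

### Keep only shots and the columns needed alongside the freeze frame
FreezeFrameData <- Data %>%
  filter(type.name == "Shot") %>%
  select(minute, second, shot.outcome.name, shot.freeze_frame)

### Drop shots with no freeze frame, then unnest to one row per player
FreezeFrame <- FreezeFrameData %>%
  filter(!map_lgl(shot.freeze_frame, is.null)) %>%
  unnest(shot.freeze_frame)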

Using map_lgl from purrr, I can filter out the null values from the shot freeze frame column. From there I can unnest all the data into separate rows, one per player in each freeze frame. Using this filtered and unnested data you can now plot or calculate player density at the time of the shot.
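As one example of what the unnested data allows, the sketch below counts the opponents positioned closer to the goal line than the shooter. It runs on a small invented dataset rather than the real `Data` object: `shot_id` and `shot_x` are hypothetical columns added for illustration (the selection above does not include the shooter's location), and the coordinates are made up. It assumes StatsBomb's 120 x 80 pitch with the attacking goal at x = 120.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(tibble)

# Toy stand-in for the shot data; every value here is invented.
# shot_id and shot_x (the shooter's x coordinate) are hypothetical columns.
toy <- tibble(
  shot_id = 1:3,
  shot_x = c(100, 95, 110),
  shot.freeze_frame = list(
    tibble(location = list(c(105, 40), c(98, 45), c(119, 40)),
           teammate = c(FALSE, TRUE, FALSE)),
    NULL,  # a shot with no freeze frame
    tibble(location = list(c(118, 41), c(108, 38)),
           teammate = c(FALSE, FALSE))
  )
)

# Same filter-then-unnest pattern as above, then split each location
# vector into separate x and y columns
players <- toy %>%
  filter(!map_lgl(shot.freeze_frame, is.null)) %>%
  unnest(shot.freeze_frame) %>%
  mutate(x = map_dbl(location, 1),
         y = map_dbl(location, 2))

# Count opponents who are nearer the goal line than the shooter
opponents_ahead <- players %>%
  group_by(shot_id) %>%
  summarise(n_ahead = sum(!teammate & x > shot_x))

opponents_ahead
```

Splitting location into plain x and y columns is also what most plotting functions expect, so the same `players` table could be passed on to ggplot2 for a density plot.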

I hope this helps you examine the free StatsBomb data in more detail.

Josh Trewin
Data Scientist

I’m a data scientist, learning my way through R / Python and applying to football data from StatsBomb, provided for free through GitHub. Follow my journey on here or Twitter to find out when I add new content.
