An Introduction to R - Using StatsBombR

Hi Everyone,

This week we are stepping back in time a bit, returning to the origins of my blog and showcasing some R. We will be walking through an introduction to R, focussing on StatsBombR. Those of you who have seen my blogs before are probably thinking, “you have done this already!” That’s a fair point: my first blog was on how to install and use the StatsBombR package to create summary tables, which you can find here.

If you aren’t one for reading, scroll to the bottom of the page to find the video where I discuss all the concepts in this blog.


What is R?

R is open source software that is used worldwide, predominantly for statistical analysis. The real benefit of R is its open source nature, which allows anyone to create packages that expand what R can do for all users. For example, the tidyverse is a collection of such packages, which adds improved tools for many tasks that you might do often, such as importing, tidying and plotting data.

What do you need?

To work with R, you need to download the R software. You can get this from CRAN here. This will give you base R, which you can use from a command line interface. To make R easier to work with, you also need RStudio, which gives you a user interface and makes it easier to interact with and visualise your data.

Installing packages

To really power up your R, you need to install packages on your system. This is really easy to do using a simple command such as this:

# ignore the hashtag when using in your scripts
# install.packages("tidyverse")

This will install the tidyverse package on your system. Some other helpful packages are:

- ggplot2
- here
- ggsoccer
- ggrepel
- devtools
- lubridate
- purrr

All of these packages have specific uses, and I recommend searching for them on Google to learn how to use them and to find some use cases.
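If you want to check which of these you already have before installing anything, something like the sketch below can help. The object names `wanted`, `already_installed` and `to_install` are just my own choices for illustration, and I have left the install line commented out so nothing is installed until you are ready.

```r
# Packages mentioned in this post (edit the list to suit you)
wanted <- c("tidyverse", "ggplot2", "here", "ggsoccer", "ggrepel",
            "devtools", "lubridate", "purrr")

# installed.packages() returns a matrix whose row names are package names
already_installed <- rownames(installed.packages())

# Keep only the packages that are not on your system yet
to_install <- setdiff(wanted, already_installed)
print(to_install)

# ignore the hashtag when you are ready to install the missing packages
# if (length(to_install) > 0) install.packages(to_install)
```

Passing a character vector to install.packages() installs everything in one go, which saves typing the command once per package.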

Using packages

Now that you have the package installed, you need to load it to use it easily. You have two options for this. First, you can load packages with simple statements at the top of your script, like this:

# ignore the hashtag when using in your scripts
# library(tidyverse)

But you can also do this “inline” using the following:

# ignore the hashtag when using in your scripts
# read_csv() lives in readr, one of the tidyverse packages
# readr::read_csv()

Both methods will work, but you would typically only use the second option when two packages share a function name and you want to be explicit about which one you are calling. So keep this in mind.
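To see why the “inline” option matters, here is a small sketch using only packages that ship with R, so it should run without installing anything. I fake a name clash by writing my own `median` function, which is purely a made-up example. A well-known real case is dplyr, whose filter() masks stats::filter() once dplyr is loaded.

```r
# Call a function by its full name without any library() statement
spread <- stats::sd(c(2, 4, 4, 4, 5, 5, 7, 9))

# A made-up function that shares its name with stats::median()
median <- function(x) "my own median"

# The plain name now finds the masking definition first...
masked <- median(c(1, 5, 9))

# ...while the package prefix still reaches the original function
explicit <- stats::median(c(1, 5, 9))

print(masked)
print(explicit)

# Remove the made-up function so median() behaves normally again
rm(median)
```

The prefix is more typing, but it leaves no doubt about which package a function comes from.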

StatsBombR package

Installing the StatsBombR package is slightly different to the above method. The package is available via GitHub here. The GitHub page provides an overview of how to install the package, but definitely watch the video below to see a full walkthrough.

You will need to install the following one after the other on newer versions of R:

# ignore the hashtag when using in your scripts
# devtools::install_github("jjvanderwal/SDMTools")
# devtools::install_github("statsbomb/StatsBombR")

SDMTools is no longer maintained, so it needs to be installed via the above before installing StatsBombR so you don’t get an error message.
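Once the installs have finished, a quick check like the sketch below can confirm that everything landed where it should. The names `needed` and `status` are my own for illustration. requireNamespace() returns TRUE when a package can be loaded and FALSE when it cannot, without stopping your script.

```r
# The packages this section relies on
needed <- c("devtools", "SDMTools", "StatsBombR")

# Check each package in turn; the result is a named TRUE/FALSE vector
status <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(status)

# List anything that still needs installing
print(names(status)[!status])
```

If SDMTools shows FALSE, install it first and then try StatsBombR again.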

Using StatsBombR

Once you have the StatsBombR package loaded, you can easily access StatsBomb’s free datasets and have fun analysing event data from women’s and men’s football. To see how to access this data, make sure you watch the video below.
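As a rough idea of what that access can look like, here is a minimal sketch. It is not a copy of the video walkthrough. The function names FreeCompetitions(), FreeMatches(), StatsBombFreeEvents() and allclean() are as I understand them from the StatsBombR documentation, so check them against your installed version. The wrapper `load_free_events` and the example competition and season values are my own assumptions for illustration.

```r
# Hypothetical helper: download and clean the free events for one
# competition season. Nothing is downloaded until you call it.
load_free_events <- function(competition_id, season_name) {
  # Table of every competition and season StatsBomb offers for free
  comps <- StatsBombR::FreeCompetitions()

  # Keep only the competition season you asked for
  chosen <- comps[comps$competition_id == competition_id &
                    comps$season_name == season_name, ]

  # Matches available for that competition season
  matches <- StatsBombR::FreeMatches(chosen)

  # Event data for those matches, then the package's cleaning step
  events <- StatsBombR::StatsBombFreeEvents(MatchesDF = matches,
                                            Parallel = TRUE)
  StatsBombR::allclean(events)
}

# ignore the hashtag when using in your scripts
# (the id and season are assumed example values; look at the output of
# FreeCompetitions() to find the ones you want)
# events <- load_free_events(11, "2005/2006")
```

Wrapping the steps in a function means you can swap in a different competition or season without rewriting the whole chain.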

Video

You can find the full video showing the information discussed above, in a little more detail, below.

As always, hit like on the video and subscribe here for more videos to help you Power Performance Through Data.

Until next time,

Josh
