The Data

World Rugby has information on every international match available here; however, a little digging reveals that the site actually pulls its data from a publicly accessible but entirely undocumented API. I wrote some Python code to scrape as much information as possible from that API, which you can find in my PyRugby package.

With all the data scraped and loaded into a local PostgreSQL database, I decided to start by looking at when each nation was first represented internationally.
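As a rough illustration of that first question, here is a minimal sketch of how the earliest match per nation could be found. The table name, column names and sample rows are hypothetical (the real PyRugby schema may well differ), and an in-memory SQLite database stands in for PostgreSQL so that the snippet runs on its own.

```python
import sqlite3

# Made-up sample rows: (match_date, home_team, away_team).
# These are placeholders, not real fixtures.
SAMPLE_MATCHES = [
    ("1900-01-01", "Nation A", "Nation B"),
    ("1890-06-01", "Nation B", "Nation C"),
    ("1910-03-15", "Nation C", "Nation A"),
]

# A nation can appear as either the home or the away side, so both
# columns are stacked into one list of appearances before taking MIN.
FIRST_MATCH_QUERY = """
    SELECT team, MIN(match_date) AS first_match
    FROM (
        SELECT home_team AS team, match_date FROM matches
        UNION ALL
        SELECT away_team AS team, match_date FROM matches
    ) AS appearances
    GROUP BY team
    ORDER BY first_match, team
"""


def first_appearances(rows):
    """Return (team, first_match_date) tuples, earliest first."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, assumed for this sketch only.
        conn.execute(
            "CREATE TABLE matches "
            "(match_date TEXT, home_team TEXT, away_team TEXT)"
        )
        conn.executemany("INSERT INTO matches VALUES (?, ?, ?)", rows)
        return conn.execute(FIRST_MATCH_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for team, first_match in first_appearances(SAMPLE_MATCHES):
        print(team, first_match)
```

The same query should carry over to PostgreSQL largely unchanged, provided the dates are stored in a sortable type.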

Some points to note about this data:

I took geo data from Natural Earth and created a custom set using sovereign nations where possible but incorporating dependencies and subdivisions where necessary. I computed centroids for this data using mapshaper. You can find the geo data in my map-data repo.

The Results

I initially wanted to produce an interactive Plotly chart with a slider for each year. However, I found that using the choroplethmapbox trace type with more than a handful of years produced unfeasibly large files.
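One plausible reason for the file size, assuming the slider is built from one trace per year, is that each choroplethmapbox trace carries its own copy of the GeoJSON. The sketch below builds such a figure as a plain dictionary in Plotly's figure format, so it needs nothing beyond the standard library. The function name, the sample data and the `properties.name` feature key are all assumptions made for illustration, not the code used for the chart described here.

```python
import json

# Tiny placeholder geometry; real country outlines are far larger.
SAMPLE_GEOJSON = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": name},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[x, 0], [x + 1, 0], [x + 1, 1], [x, 0]]],
            },
        }
        for x, name in enumerate(["Nation A", "Nation B"])
    ],
}

# Hypothetical first-international years per nation.
SAMPLE_FIRST_YEARS = {"Nation A": 1880, "Nation B": 1960}


def slider_figure(geojson, first_year_by_nation, years):
    """Build a figure dict with one choroplethmapbox trace per year."""
    nations = sorted(first_year_by_nation)
    traces, steps = [], []
    for i, year in enumerate(years):
        # 1 if the nation had played an international by this year, else 0.
        z = [int(first_year_by_nation[n] <= year) for n in nations]
        traces.append({
            "type": "choroplethmapbox",
            "geojson": geojson,  # repeated in every trace
            "featureidkey": "properties.name",  # assumed property name
            "locations": nations,
            "z": z,
            "visible": i == 0,  # only the first year is shown initially
        })
        # Each slider step shows exactly one trace and hides the rest.
        steps.append({
            "label": str(year),
            "method": "restyle",
            "args": [{"visible": [j == i for j in range(len(years))]}],
        })
    layout = {
        "mapbox": {"style": "carto-positron", "zoom": 0},
        "sliders": [{"steps": steps}],
    }
    return {"data": traces, "layout": layout}


if __name__ == "__main__":
    for n_years in (1, 10, 100):
        years = list(range(1900, 1900 + n_years))
        spec = slider_figure(SAMPLE_GEOJSON, SAMPLE_FIRST_YEARS, years)
        print(n_years, "years:", len(json.dumps(spec)), "bytes of JSON")
```

With Plotly installed, a dictionary like this can be handed to `plotly.graph_objects.Figure` to render it. Under this construction the serialized size grows roughly in proportion to the number of years, which is consistent with a slider over the full history becoming impractical.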

As a result, I decided to create PNG images of every year and stitch them together into a short video using iMovie.

I also created an interactive chart of the situation as it stands in 2020 (so no information on the USSR, etc.), which you can find here.