NHL Statistics Research and Visualization Project

As a big NHL fan, I decided to find and work with statistical data from one of the biggest sports leagues in the world.
The dataset comes from the "NHL play-by-play data" set on Kaggle. I've taken statistics only for the most recent season, 2019/2020. The set contains a lot of data from every game of the season, but I'll mostly work with scoring statistics, as these are usually the most interesting. I'll try to use all the tools covered in this great course, Data Analysis with Python: Zero to Pandas: NumPy, Pandas, Matplotlib, and Seaborn.

Downloading the Dataset

First I'll download all available play-by-play datasets, which cover every season starting from 2007, and then manually delete everything except the dataset for the last season.
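The manual cleanup could also be scripted once the files are on disk. Below is a minimal sketch, assuming each season is stored as a separate file whose name contains a season tag. The helper name keep_only_season and the tag format are hypothetical, since the actual file names in the Kaggle dataset are not shown here.

```python
from pathlib import Path


def keep_only_season(data_dir, season_tag):
    """Delete every regular file in data_dir whose name lacks season_tag.

    Returns the sorted names of the files that were removed.
    season_tag is an assumed naming convention, not a documented one.
    """
    removed = []
    for path in Path(data_dir).iterdir():
        # Subfolders are left alone; only plain files are considered.
        if path.is_file() and season_tag not in path.name:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Since this deletes files permanently, it is worth listing the folder first and checking that the tag matches exactly one file before running it.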

!pip install jovian opendatasets --upgrade --quiet

Let's begin by downloading the data, and listing the files within the dataset.

dataset_url = 'https://www.kaggle.com/s903124/nhl-playbyplay-data-from-2007'
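The download and listing step might look like the sketch below. It assumes the opendatasets package installed above, network access, and a Kaggle API key (opendatasets prompts for the username and key if it does not find a kaggle.json file). The helper names local_dataset_dir and download_and_list are mine, and the folder name relies on opendatasets saving a Kaggle dataset into a folder named after the last segment of its URL.

```python
import os

# Repeated here so the block runs on its own.
dataset_url = 'https://www.kaggle.com/s903124/nhl-playbyplay-data-from-2007'


def local_dataset_dir(url, base_dir='.'):
    """Return the folder under base_dir named after the last URL segment."""
    return os.path.join(base_dir, url.rstrip('/').split('/')[-1])


def download_and_list(url):
    """Download the dataset into the current directory and list its files."""
    # Imported lazily so the helper above works without the package installed.
    import opendatasets as od

    od.download(url)
    return sorted(os.listdir(local_dataset_dir(url)))
```

Calling download_and_list(dataset_url) in a notebook cell would then return the downloaded file names, which makes it easy to see which file belongs to the 2019/2020 season.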