
Netflix Data Analysis on Movies and TV Shows

About Netflix:

Netflix is the world's leading media streaming platform, operating in nearly every country in the world. It was one of the very first players in the streaming industry when it made the transition in 2007, and the bet has paid off with hundreds of millions of subscribers around the world.

Dataset:

Link: TV Shows and Movies Listed on Netflix Dataset from Kaggle.

This dataset consists of the movies and TV shows available on Netflix.

Objective:

In this project we will perform exploratory data analysis to find hidden trends and patterns in the dataset. We will load and read the data using pandas, clean and process it, explore it through visualizations and graphs using matplotlib and seaborn, and finally answer some questions about the dataset.
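The load, clean, and explore steps described above can be sketched roughly as follows. This is only a minimal illustration, not the notebook's actual code: the sample rows, the column names (show_id, type, title, country, release_year), and the helper load_and_clean are assumptions made for the example, and a small in-memory CSV stands in for the real Kaggle file.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the Kaggle CSV.
# The column names are assumptions made for this sketch.
SAMPLE_CSV = """show_id,type,title,country,release_year
s1,Movie,Title A,United States,2019
s2,TV Show,Title B,,2020
s3,Movie,Title C,India,2019
s4,Movie,Title C,India,2019
"""


def load_and_clean(source):
    """Read a CSV, drop repeated titles, and fill missing countries."""
    df = pd.read_csv(source)
    # Rows s3 and s4 describe the same title, so only the first is kept.
    df = df.drop_duplicates(subset=["type", "title", "country", "release_year"])
    # Replace missing country values with an explicit placeholder.
    df["country"] = df["country"].fillna("Unknown")
    return df


df = load_and_clean(io.StringIO(SAMPLE_CSV))

# A first exploratory question: how many movies versus TV shows are there?
type_counts = df["type"].value_counts()

print(df.isna().sum())  # remaining missing values per column
print(type_counts)
```

With the real dataset, the same function could be given the path to the downloaded CSV instead of the in-memory sample, and a summary such as type_counts could then be passed to matplotlib or seaborn for plotting.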

What is EDA and why is it important?

First, let's understand what EDA is and why this stage plays a major role in the data science project lifecycle.

Exploratory Data Analysis is the process of exploring data, generating insights, testing hypotheses, checking assumptions and revealing underlying hidden patterns in the data.

There are no shortcuts in a Data Science project lifecycle. We can’t simply skip to the model building stage after gathering the data. We need to plan our approach in a structured manner and the exploratory data analysis (EDA) stage plays a huge part in that.

Many of us in this field, especially beginners, can’t wait to dive into machine learning algorithms, but that often leaves the end result hanging in the balance. I discovered, through personal experience and the advice of my mentors, the importance of spending time exploring and understanding the data, because it reveals the small details of the data.

Downloading the Dataset

Let's begin by downloading the data and listing the files within the dataset. Here we download the dataset using the opendatasets Python library.

!pip install jovian opendatasets --upgrade --quiet
dataset_url = 'https://www.kaggle.com/shivamb/netflix-shows'