EDA on IMDB's Most Popular Films and Series
IMDB is a popular website for rating films and series. I always go there when I want to watch something new, and many users trust its rankings. This dataset covers more than 6,000 of the most popular films and series on IMDB, along with their ratings, which makes it well suited to exploratory data analysis (EDA). I got the data from Kaggle.
In this project I perform EDA to answer some interesting questions about the data. To do so, I use Pandas and NumPy for data storage and processing, and Matplotlib and Seaborn for visualization.
One of the best resources I came across for all of the above is Data Analysis with Python: Zero to Pandas, where I learned the basics of Python, Pandas, NumPy, and visualization tools like Matplotlib and Seaborn.
Downloading the Dataset
The dataset has 6178 rows and 14 columns. The columns are described below.
Name: Name of the film/series
Date: Creation date
Rate: IMDB rating
Votes: Number of votes
Genre: Genres (Action, Drama, Romance, etc.)
Duration: Duration of the film or of an episode
Type: Whether it is a film or a series
Certificate: Age-rating certificate, one of the following:
TV-Y: Designed to be appropriate for all children
TV-Y7: Suitable for ages 7 and up
G: Suitable for General Audiences
TV-G: Suitable for General Audiences
PG: Parental Guidance suggested
TV-PG: Parental Guidance suggested
PG-13: Parents strongly cautioned. May be inappropriate for ages 12 and under.
TV-14: Parents strongly cautioned. May not be suitable for children under 14.
R: Restricted. May be inappropriate for children under 17.
TV-MA: For Mature Audiences. May not be suitable for children under 17.
NC-17: Inappropriate for ages 17 and under
Episodes: Number of episodes (series only)
Nudity, Violence, Profanity, Alcohol, and Frightening: How much of each kind of content the title contains
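Once the data is loaded, it can be worth confirming that the file really has the 14 columns described above. The following is a minimal sketch, not part of the original notebook: it assumes the CSV headers are spelled exactly as in the list (with the creation-date column assumed to be called `Date`), and `EXPECTED_COLUMNS` and `check_columns` are hypothetical names introduced here for illustration.

```python
# Column names as described in the list above.
# Assumption: the CSV headers use exactly these spellings.
EXPECTED_COLUMNS = [
    "Name", "Date", "Rate", "Votes", "Genre", "Duration", "Type",
    "Certificate", "Episodes", "Nudity", "Violence", "Profanity",
    "Alcohol", "Frightening",
]


def check_columns(actual_columns):
    """Return (missing, unexpected) column names relative to EXPECTED_COLUMNS."""
    actual = list(actual_columns)
    missing = [c for c in EXPECTED_COLUMNS if c not in actual]
    unexpected = [c for c in actual if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Example with a deliberately incomplete header list.
missing, unexpected = check_columns(["Name", "Date", "Rate", "Votes"])
print("missing:", missing)
print("unexpected:", unexpected)
```

With a Pandas DataFrame named, say, `df`, the same check would be `check_columns(df.columns)`; two empty lists mean the headers match the description.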
!pip install jovian opendatasets --upgrade --quiet
Let's begin by downloading the data, and listing the files within the dataset.
# Kaggle URL of the dataset
dataset_url = 'https://www.kaggle.com/mazenramadan/imdb-most-popular-films-and-series'
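The download and file listing could look roughly like the sketch below. It assumes that `opendatasets` saves the files into a folder named after the last segment of the URL inside the current directory, and that Kaggle credentials are available when it prompts for them. `dataset_dir` and `download_and_list` are hypothetical helper names, not part of any library.

```python
import os
from urllib.parse import urlparse

dataset_url = 'https://www.kaggle.com/mazenramadan/imdb-most-popular-films-and-series'


def dataset_dir(url, base_dir='.'):
    """Guess the download folder: base_dir plus the last path segment of the URL."""
    slug = urlparse(url).path.rstrip('/').split('/')[-1]
    return os.path.join(base_dir, slug)


def download_and_list(url):
    """Download the dataset and return the file names found in its folder."""
    # Imported lazily so the path helper above works without the package installed.
    import opendatasets as od
    od.download(url)  # may prompt for a Kaggle username and API key
    return sorted(os.listdir(dataset_dir(url)))


# Folder where the files are expected to land (an assumption, see above).
print(dataset_dir(dataset_url))
```

Calling `download_and_list(dataset_url)` in the notebook should then return the names of the downloaded files, if the folder assumption holds.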