Exploratory Data Analysis Using Python : `IPL (2008-2020)`

The IPL (Indian Premier League) 2008-2020 dataset has been taken from https://www.kaggle.com/patrickb1912/ipl-complete-dataset-20082020. It consists of two separate CSV files, one for matches and one for deliveries, which contain each match summary and the ball-by-ball details, respectively. The analysis considers both former and current teams, and the hidden insights are drawn from all matches played till 2020.

IPL is a professional Twenty20 cricket league in India, usually contested between March and May of every year by eight teams representing eight different cities or states in India. The league was founded by the Board of Control for Cricket in India (BCCI) in 2007.

Current Teams : Chennai Super Kings, Delhi Capitals (formerly Delhi Daredevils), Kings XI Punjab, Kolkata Knight Riders, Mumbai Indians, Rajasthan Royals, Royal Challengers Bangalore, Sunrisers Hyderabad.

Former teams : Deccan Chargers, Kochi Tuskers Kerala, Pune Warriors India, Rising Pune Supergiant, Gujarat Lions

This analysis uses Python helper libraries for data analysis and visualization, learned from Data Analysis with Python: Zero to Pandas. The course has assignments dedicated to practice, along with the Jupyter notebooks used while teaching, for better understanding and experimentation. It also has active and supportive community forums for clearing queries. It is a good option for beginners to kick-start.

`Table of Contents`

`1. Downloading the Dataset`

Installing opendatasets helper library

Saving dataset url

Import opendatasets and operating system libraries

Downloading dataset

View downloaded files

`2. Data Preparation and Cleaning`

Import Pandas

Load IPL Ball-by-Ball 2008-2020.csv

Check datatype of the loaded dataset

View Columns

Filter unwanted columns

View 10 random rows

Load IPL Matches 2008-2020.csv

View Columns

View content of columns

Change 'Delhi Daredevils' to its present name 'Delhi Capitals' and fix the typo 'Rising Pune Supergiants' (extra 's' at the end) to 'Rising Pune Supergiant'

Dealing with NaN values

Creating a dataframe with the columns of interest

Check info of the dataframe

Change datatype of date column from 'object' to 'date'

Add weekday, month and year columns

Drop date column

Check shape of the prepared dataframe

Check 10 random rows of dataframe ready for analysis

`3. Exploratory Analysis and Visualization`

Import helper libraries matplotlib.pyplot, seaborn and numpy for visualization

Rename some columns

Check number of matches won by each team

Top 3 cities

Top 3 stadiums

Matches played in respective Year

Matches played in respective Months

`4. Asking and Answering Questions`

Win the Toss, Win the Match

King of Boundaries

Luckiest Stadium for Mumbai Indians

Wicket King

Max matches won, CSK VS MI

Player of the Match

2020 : Top 10 Boundary rivals

`5. Hidden Insights`

`6. References`

`For analysis we'll take the help of the following libraries` :

opendatasets

os

pandas

numpy

matplotlib

matplotlib.pyplot

seaborn
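Before running the notebook, it can help to confirm that the libraries listed above are actually installed. The following is a minimal sketch using only the standard library; the `LIBRARIES` list and the `check_libraries` helper are illustrative names that are not part of the original notebook.

```python
from importlib.util import find_spec

# Libraries named in the list above (illustrative constant, not from the notebook).
LIBRARIES = [
    "opendatasets",
    "os",
    "pandas",
    "numpy",
    "matplotlib",
    "matplotlib.pyplot",
    "seaborn",
]


def check_libraries(names=LIBRARIES):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            status[name] = find_spec(name) is not None
        except ImportError:
            # Raised for dotted names (e.g. matplotlib.pyplot) when the
            # parent package is not installed.
            status[name] = False
    return status


for name, ok in check_libraries().items():
    print(f"{name:20s} {'installed' if ok else 'missing'}")
```

Anything reported as missing can be installed with pip before continuing. In the analysis itself these libraries are commonly imported under the aliases `pd`, `np`, `plt` and `sns`, though that is a convention rather than a requirement.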

`1. Downloading the Dataset`

1.1 We'll use the opendatasets helper library to download the files.

!pip install jovian opendatasets --upgrade --quiet
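Once the library is installed, the remaining download steps (saving the dataset URL, importing `opendatasets` and `os`, downloading, and viewing the files) could look roughly like the sketch below. It assumes that `opendatasets` saves a Kaggle dataset into a folder named after the last segment of the URL and that Kaggle credentials are available; `dataset_dir_from_url` and `download_and_list` are hypothetical helper names introduced here for illustration.

```python
import os

# Dataset URL quoted in the introduction of this article.
DATASET_URL = "https://www.kaggle.com/patrickb1912/ipl-complete-dataset-20082020"


def dataset_dir_from_url(url: str) -> str:
    """Return the folder name the download is assumed to create (last URL segment)."""
    return url.rstrip("/").split("/")[-1]


def download_and_list(url: str = DATASET_URL) -> list:
    """Download the dataset and return the sorted names of the downloaded files."""
    # Imported lazily so the helper above works even without opendatasets installed.
    import opendatasets as od

    # Prompts for a Kaggle username and API key unless a kaggle.json file is present.
    od.download(url)
    data_dir = os.path.join(".", dataset_dir_from_url(url))
    return sorted(os.listdir(data_dir))


# Usage (requires network access and Kaggle credentials):
# print(download_and_list())
```

If the folder layout differs in a given `opendatasets` version, `os.listdir('.')` shows where the files actually landed.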
