Created 4 years ago
# Exploratory Data Analysis I
Table of Contents
- Problem Statement
- Data Loading and Description
- Data Profiling
- 3.1 Understanding the Dataset
- 3.2 Pre Profiling
- 3.3 Preprocessing
- 3.4 Post Profiling
- Questions
- 4.1 Of all the passengers, how many survived and how many died?
- 4.2 Who is more likely to survive, Male or Female?
- 4.3 What is the rate of survival of males, females and children on the basis of Passenger Class?
- 4.4 What is the survival rate considering the Embarked variable?
- 4.5 Survival rate - Comparing Embarked and Sex.
- 4.6 How does survival rate vary with Embarked, Sex and Pclass?
- 4.7 Segment age in bins with size 10.
- 4.8 Analysing SibSp and Parch variable.
- 4.9 Segment fare in bins of size 12.
- 4.10 Draw pair plot to know the joint relationship between 'Fare','Age','Pclass' and 'Survived'
- 4.11 Establish correlation between all the features using heatmap.
- 4.12 Hypothesis: Women and children are more likely to survive
- Conclusions
1. Problem Statement
This notebook explores the basic use of Pandas and covers the basic commands of Exploratory Data Analysis (EDA), which include cleaning, munging, combining, reshaping, slicing, dicing, and transforming data for analysis.
- Exploratory Data Analysis
Understand the data through EDA and derive simple models with Pandas as a baseline.
EDA is a critical first step in analyzing data, and we do it for the following reasons:
- Finding patterns in Data
- Determining relationships in Data
- Checking of assumptions
- Preliminary selection of appropriate models
- Detection of mistakes
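As a rough illustration of a few of the reasons above, here is a minimal sketch in Pandas. The small DataFrame is hypothetical toy data that only borrows Titanic-style column names (`Survived`, `Pclass`, `Sex`, `Age`, `Fare`); it is not the dataset loaded later in this notebook, and filling missing ages with the median is just one possible cleaning choice.

```python
import pandas as pd

# Hypothetical toy data with Titanic-style column names (not the real dataset).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 2, 1, 3],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 29.0, 4.0],
    "Fare": [7.25, 71.28, 7.92, 13.00, 53.10, 21.08],
})

# Detection of mistakes: count missing values in each column.
missing = df.isnull().sum()

# Cleaning (an assumed strategy): fill missing ages with the median age.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Finding patterns: mean of Survived per group gives the survival rate by sex.
survival_by_sex = df.groupby("Sex")["Survived"].mean()

# Determining relationships: pairwise correlation of the numeric columns.
corr = df[["Survived", "Pclass", "Age", "Fare"]].corr()

print(missing)
print(survival_by_sex)
print(corr.round(2))
```

The same calls should work unchanged on a full DataFrame; only the toy values would differ.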