Note - The CSV file must be uploaded to the Jupyter notebook environment before running the code. The dataset is available here - https://www.kaggle.com/greeshmagirish/crime-against-women-20012014-india
This project is part of my Data Analysis with Python: Zero to Pandas course - www.zerotopandas.com.
import jovian
jovian.commit(project='crime-against-women', environment=None)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "sathi-satb/crime-against-women" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Committed successfully! https://jovian.ml/sathi-satb/crime-against-women
!pip install pandas
!pip install matplotlib
!pip install seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
jovian.commit(files=['crimes_against_women.csv'])  # attach the dataset to the commit
crimes_df = pd.read_csv('crimes_against_women.csv')
crimes_df
crimes_df.shape
(10677, 11)
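The `Unnamed: 0` column that shows up in this dataset is usually a DataFrame index that was written into the CSV when it was saved. As an alternative to dropping it afterwards, here is a minimal sketch of reading it as the index instead. The two-row CSV below is made up for illustration and is not taken from the Kaggle file.

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for crimes_against_women.csv
sample_csv = io.StringIO(
    ",STATE/UT,DISTRICT,Year,Rape\n"
    "0,ANDHRA PRADESH,ADILABAD,2001,50\n"
    "1,ANDHRA PRADESH,ANANTAPUR,2001,23\n"
)

# index_col=0 treats the first (unnamed) column as the row index,
# so no 'Unnamed: 0' column appears in the resulting frame
sample_df = pd.read_csv(sample_csv, index_col=0)
print(sample_df.columns.tolist())
```

If the file is read this way, the later `drop` call should list only `DISTRICT`.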
# Count the missing values in each column
missing_values = crimes_df.isna().sum()
missing_values
Unnamed: 0 0
STATE/UT 0
DISTRICT 0
Year 0
Rape 0
Kidnapping_Abduction 0
Dowry_Deaths 0
Assault_for_her_modesty 0
Insult_to_modesty_of_Women 0
Domestic_violence 0
Importation_of_Girls 0
dtype: int64
districts = crimes_df.DISTRICT.nunique()
districts
1605
crimes_df.drop(['DISTRICT', 'Unnamed: 0'], axis = 1, inplace=True)
crimes_df
print(crimes_df['STATE/UT'].unique())
['ANDHRA PRADESH' 'ARUNACHAL PRADESH' 'ASSAM' 'BIHAR' 'CHHATTISGARH' 'GOA'
'GUJARAT' 'HARYANA' 'HIMACHAL PRADESH' 'JAMMU & KASHMIR' 'JHARKHAND'
'KARNATAKA' 'KERALA' 'MADHYA PRADESH' 'MAHARASHTRA' 'MANIPUR' 'MEGHALAYA'
'MIZORAM' 'NAGALAND' 'ODISHA' 'PUNJAB' 'RAJASTHAN' 'SIKKIM' 'TAMIL NADU'
'TRIPURA' 'UTTAR PRADESH' 'UTTARAKHAND' 'WEST BENGAL' 'A & N ISLANDS'
'CHANDIGARH' 'D & N HAVELI' 'DAMAN & DIU' 'DELHI' 'LAKSHADWEEP'
'PUDUCHERRY' 'Andhra Pradesh' 'Arunachal Pradesh' 'Assam' 'Bihar'
'Chhattisgarh' 'Goa' 'Gujarat' 'Haryana' 'Himachal Pradesh'
'Jammu & Kashmir' 'Jharkhand' 'Karnataka' 'Kerala' 'Madhya Pradesh'
'Maharashtra' 'Manipur' 'Meghalaya' 'Mizoram' 'Nagaland' 'Odisha'
'Punjab' 'Rajasthan' 'Sikkim' 'Tamil Nadu' 'Tripura' 'Uttar Pradesh'
'Uttarakhand' 'West Bengal' 'A&N Islands' 'Chandigarh' 'D&N Haveli'
'Daman & Diu' 'Delhi UT' 'Lakshadweep' 'Puducherry' 'Telangana'
'A & N Islands']
# First, strip whitespace and convert every name to uppercase so that
# mixed-case duplicates (e.g. 'Assam' and 'ASSAM') collapse into one value
def normalise_state_name(name):
    return name.strip().upper()

crimes_df['STATE/UT'] = crimes_df['STATE/UT'].apply(normalise_state_name)
# Now replace the remaining spelling variants seen in the output above with a single form
crimes_df['STATE/UT'] = crimes_df['STATE/UT'].replace({
    "A&N ISLANDS": "A & N ISLANDS",
    "D&N HAVELI": "D & N HAVELI",
    "DELHI UT": "DELHI",
})
crimes_df['STATE/UT'].unique()
array(['ANDHRA PRADESH', 'ARUNACHAL PRADESH', 'ASSAM', 'BIHAR',
'CHHATTISGARH', 'GOA', 'GUJARAT', 'HARYANA', 'HIMACHAL PRADESH',
'JAMMU & KASHMIR', 'JHARKHAND', 'KARNATAKA', 'KERALA',
'MADHYA PRADESH', 'MAHARASHTRA', 'MANIPUR', 'MEGHALAYA', 'MIZORAM',
'NAGALAND', 'ODISHA', 'PUNJAB', 'RAJASTHAN', 'SIKKIM',
'TAMIL NADU', 'TRIPURA', 'UTTAR PRADESH', 'UTTARAKHAND',
'WEST BENGAL', 'A & N ISLANDS', 'CHANDIGARH', 'D & N HAVELI',
'DAMAN & DIU', 'DELHI', 'LAKSHADWEEP', 'PUDUCHERRY', 'TELANGANA'],
dtype=object)
len(crimes_df['STATE/UT'].unique())
36
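The same clean-up can also be written with pandas' vectorised string methods instead of a Python function. The sketch below runs on a small made-up Series of spelling variants (not the real column), and the `canonical` mapping name is only illustrative.

```python
import pandas as pd

# Made-up sample of the kinds of variants seen in the STATE/UT column
states = pd.Series(["Delhi UT", "DELHI", "A&N Islands", "A & N ISLANDS", " Goa "])

# Map the leftover variants onto one canonical spelling each
canonical = {
    "A&N ISLANDS": "A & N ISLANDS",
    "D&N HAVELI": "D & N HAVELI",
    "DELHI UT": "DELHI",
}

# Strip whitespace, uppercase, then replace exact matches from the mapping
cleaned = states.str.strip().str.upper().replace(canonical)
print(sorted(cleaned.unique()))
```

Checking `unique()` again after the clean-up, as done above, is a quick way to confirm that no variant was missed.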
victims_raped = crimes_df.Rape.sum()
victims_kidnapped_abducted = crimes_df.Kidnapping_Abduction.sum()
dowry_deaths = crimes_df.Dowry_Deaths.sum()
modesty_assault = crimes_df.Assault_for_her_modesty.sum()
insult_to_modesty = crimes_df.Insult_to_modesty_of_Women.sum()
domestic_violence = crimes_df.Domestic_violence.sum()
girls_imported = crimes_df.Importation_of_Girls.sum()
# Add each category exactly once. The value printed below came from a version
# of this line that added victims_raped twice and left out
# victims_kidnapped_abducted; re-run the cell to refresh it.
total_population_of_victim_overall = (victims_raped + victims_kidnapped_abducted + dowry_deaths
                                      + modesty_assault + insult_to_modesty
                                      + domestic_violence + girls_imported)
total_population_of_victim_overall
5194570
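Adding the per-category totals by hand makes it easy to repeat or leave out a term. A minimal sketch of an alternative is to select the crime columns as a list and let pandas do the summing. The numbers below are made up and only three of the columns are used, to keep the example short.

```python
import pandas as pd

# Made-up numbers, only to illustrate the summation
toy_df = pd.DataFrame({
    "Year": [2001, 2002],
    "Rape": [10, 20],
    "Kidnapping_Abduction": [5, 7],
    "Dowry_Deaths": [1, 2],
})

crime_columns = ["Rape", "Kidnapping_Abduction", "Dowry_Deaths"]

per_category = toy_df[crime_columns].sum()  # one total per crime column
grand_total = int(per_category.sum())       # total across all categories
print(per_category.to_dict(), grand_total)
```

Keeping the column names in one list also means `Year` cannot slip into the total by accident.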
Note - For simplicity, the charts below cover six categories and exclude the "Insult_to_modesty_of_Women" column.
# One subplot per crime category: (column, title, y-axis label, colour)
charts = [
    ('Rape', "Chart of rape cases in India in 2001-2014", 'Cases of Rape in India', 'black'),
    ('Kidnapping_Abduction', "Chart of Kidnapping and Abduction cases in India in 2001-2014", 'Cases of Kidnapping and Abduction in India', 'violet'),
    ('Dowry_Deaths', "Chart of Dowry death cases in India in 2001-2014", 'Cases of Dowry deaths in India', 'navy'),
    ('Assault_for_her_modesty', "Chart of Assault to her modesty in 2001-2014", 'Cases of Assaulting a woman for her modesty in India', 'cyan'),
    ('Domestic_violence', "Chart of Domestic Violence cases in India in 2001-2014", 'Cases of Domestic Violence in India', 'orange'),
    # This panel previously plotted Domestic_violence by mistake
    ('Importation_of_Girls', "Chart of Importation of girls in India in 2001-2014", 'Cases of Importation of girls in India', 'red'),
]

fig, axes = plt.subplots(2, 3, figsize=(25, 12))
for ax, (column, title, ylabel, color) in zip(axes.flat, charts):
    ax.bar(crimes_df.Year, crimes_df[column], color=color)
    ax.set_title(title)
    # Label each subplot through its own Axes object; plt.xlabel/plt.ylabel
    # only affect the current (last created) subplot
    ax.set_xlabel('Year')
    ax.set_ylabel(ylabel)
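Because the DataFrame holds many rows for each year, passing `crimes_df.Year` straight to `bar` draws many bars on top of one another at every year, so the visible height is the largest single row rather than the yearly total. If the yearly total is what the chart should show, one option is to aggregate first. A minimal sketch with made-up numbers:

```python
import pandas as pd

# Made-up rows: two records per year
toy_df = pd.DataFrame({
    "Year": [2001, 2001, 2002, 2002],
    "Rape": [10, 30, 5, 15],
})

# Sum all rows that share a year, giving one value per year
yearly_rape = toy_df.groupby("Year")["Rape"].sum()
print(yearly_rape.to_dict())
```

The resulting Series has the years as its index, so `yearly_rape.index` and `yearly_rape.values` can be passed to a bar chart directly.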
count_df = crimes_df.groupby('Year')[['STATE/UT']].count()
count_df
plt.figure(figsize=(10,5))
plt.title("Number of records (rows) in the dataset per year")
sns.heatmap(count_df);
count_df.mean()
STATE/UT 762.642857
dtype: float64
crimes_df = crimes_df.drop(['Assault_for_her_modesty', 'Insult_to_modesty_of_Women'], axis=1)
max_rape_cases = crimes_df.sort_values('Rape', ascending = False).head(10)
max_rape_cases
max_dowry_death_cases = crimes_df.sort_values('Dowry_Deaths', ascending = False).head(10)
max_dowry_death_cases
max_domestic_violence_cases = crimes_df.sort_values('Domestic_violence', ascending = False).head(10)
max_domestic_violence_cases
max_importation_case = crimes_df.sort_values('Importation_of_Girls', ascending = False).head(10)
max_importation_case
counts_df = crimes_df.groupby('STATE/UT')[['Rape', 'Kidnapping_Abduction', 'Dowry_Deaths','Domestic_violence', 'Importation_of_Girls']].sum()
counts_df
counts_df.sort_values(by = 'Rape', ascending = False).head(5)
counts_df.sort_values(by = 'Kidnapping_Abduction', ascending = False).head(5)
counts_df.sort_values(by = 'Dowry_Deaths', ascending = False).head(5)
counts_df.sort_values(by = 'Domestic_violence', ascending = False).head(5)
counts_df.sort_values(by = 'Importation_of_Girls', ascending = False).head(5)
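The five near-identical `sort_values(...).head(5)` calls can be collapsed into a loop over the columns. The sketch below uses `Series.nlargest` on a small made-up table; the state names and numbers are placeholders, and `top_by_crime` is just an illustrative name.

```python
import pandas as pd

# Made-up per-state totals with the same layout as counts_df
toy_counts = pd.DataFrame(
    {"Rape": [300, 120, 450], "Dowry_Deaths": [40, 90, 10]},
    index=pd.Index(["STATE A", "STATE B", "STATE C"], name="STATE/UT"),
)

# For every crime column, keep the 2 states with the largest totals
top_by_crime = {col: toy_counts[col].nlargest(2) for col in toy_counts.columns}

for col, top in top_by_crime.items():
    print(col, top.index.tolist())
```

With the real data, replacing `2` with `5` would give the same top-5 rankings as the cells above, one Series per crime category.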
max_importation_case = max_importation_case.merge(max_rape_cases)
max_importation_case
max_dowry_death_cases = max_dowry_death_cases.merge(max_rape_cases)
max_dowry_death_cases
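Calling `merge` with no arguments performs an inner join on every column the two frames share, so only rows that appear in both top-10 tables survive. The sketch below spells those defaults out on two small made-up tables to make that behaviour visible.

```python
import pandas as pd

# Made-up stand-ins for two "top cases" tables with identical columns
top_rape = pd.DataFrame({
    "STATE/UT": ["STATE A", "STATE B"],
    "Year": [2014, 2013],
    "Rape": [500, 400],
    "Importation_of_Girls": [3, 0],
})
top_import = pd.DataFrame({
    "STATE/UT": ["STATE A", "STATE C"],
    "Year": [2014, 2011],
    "Rape": [500, 20],
    "Importation_of_Girls": [3, 9],
})

# Inner join on all shared columns: keeps only rows present in both tables
in_both = top_import.merge(top_rape, how="inner", on=list(top_import.columns))
print(in_both)
```

Storing the result under a new name such as `in_both`, rather than overwriting one of the inputs, keeps the original top-10 tables available for later cells.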
1) More than 5 million women and girls were recorded as victims of some form of gender-based violence,
ranging from rape to importation for business.
2) From the series of bar graphs, we concluded that 2014 was the year with the highest number of reported
crimes in every category.
3) We listed the 10 highest case counts ever reported in each category, along with the year and the state
in which they were reported. Madhya Pradesh had the highest number of rape cases (2014), Uttar Pradesh the
highest number of dowry deaths (2014), West Bengal the highest number of domestic violence cases (2014), and
Bihar the highest number of importation of girls cases (2011).
4) We summarised the TOTAL number of cases in 2001-2014 for each state.
5) We also found the top 5 states with the highest TOTAL number of cases reported in 2001-2014 for each
category.
6) We also merged the data in two ways. Merging "Maximum number of rape cases" with
"Maximum number of Importation cases" pointed to Madhya Pradesh, and merging
"Maximum number of rape cases" with "Maximum number of Deaths due to Dowry cases" pointed to
Uttar Pradesh.
I want to work more on the topic of women's safety in our society, and I would like to run the same kind of analysis on a similar dataset that is not limited to one country but covers the whole world.