Note: the CSV file must be uploaded to the Jupyter notebook before running it. The dataset is available at https://www.kaggle.com/greeshmagirish/crime-against-women-20012014-india
This project is part of my Data Analysis with Python: Zero to Pandas course - www.zerotopandas.com.
# Uncomment and run the commands below if imports fail
# !pip install matplotlib --upgrade --quiet
!pip install jovian --upgrade --quiet pip
import jovian
jovian.commit(project='crime-against-women', environment=None)
[jovian] Attempting to save notebook..
[jovian] Detected Kaggle notebook...
[jovian] Uploading notebook to https://jovian.ai/sathi-satb/crime-against-women
!pip install pandas --upgrade
!pip install matplotlib
!pip install seaborn
Requirement already satisfied: pandas in /opt/conda/lib/python3.7/site-packages (1.1.5)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages (from pandas) (2.8.1)
Requirement already satisfied: numpy>=1.15.4 in /opt/conda/lib/python3.7/site-packages (from pandas) (1.18.5)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.7/site-packages (from pandas) (2019.3)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.14.0)
Requirement already satisfied: matplotlib in /opt/conda/lib/python3.7/site-packages (3.2.1)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied: numpy>=1.11 in /opt/conda/lib/python3.7/site-packages (from matplotlib) (1.18.5)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib) (2.4.7)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib) (1.2.0)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib) (2.8.1)
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from cycler>=0.10->matplotlib) (1.14.0)
Requirement already satisfied: seaborn in /opt/conda/lib/python3.7/site-packages (0.10.0)
Requirement already satisfied: matplotlib>=2.1.2 in /opt/conda/lib/python3.7/site-packages (from seaborn) (3.2.1)
Requirement already satisfied: pandas>=0.22.0 in /opt/conda/lib/python3.7/site-packages (from seaborn) (1.1.5)
Requirement already satisfied: numpy>=1.13.3 in /opt/conda/lib/python3.7/site-packages (from seaborn) (1.18.5)
Requirement already satisfied: scipy>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from seaborn) (1.4.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.1.2->seaborn) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.1.2->seaborn) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.1.2->seaborn) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.1.2->seaborn) (2.8.1)
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from cycler>=0.10->matplotlib>=2.1.2->seaborn) (1.14.0)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.7/site-packages (from pandas>=0.22.0->seaborn) (2019.3)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
crimes_df = pd.read_csv('../input/crime-against-women-20012014-india/crimes_against_women_2001-2014.csv')
crimes_df
crimes_df.shape
(10677, 11)
missing_values = crimes_df.isna().sum()  # number of missing values in each column
missing_values
Unnamed: 0 0
STATE/UT 0
DISTRICT 0
Year 0
Rape 0
Kidnapping and Abduction 0
Dowry Deaths 0
Assault on women with intent to outrage her modesty 0
Insult to modesty of Women 0
Cruelty by Husband or his Relatives 0
Importation of Girls 0
dtype: int64
districts = crimes_df.DISTRICT.nunique()  # number of distinct district names
districts
1605
crimes_df.drop(['DISTRICT', 'Unnamed: 0'], axis = 1, inplace=True)
crimes_df.rename( columns = {'Kidnapping and Abduction':'Kidnapping_Abduction','Dowry Deaths':'Dowry_Deaths',
'Assault on women with intent to outrage her modesty':'Hurting_of_womens_modesty',
'Insult to modesty of Women':'Insult_to_womens_modesty',
'Cruelty by Husband or his Relatives':'Domestic_Cruelty',
'Importation of Girls':'Importation_of_Girls'}, inplace = True)
crimes_df
print(crimes_df['STATE/UT'].unique())
['ANDHRA PRADESH' 'ARUNACHAL PRADESH' 'ASSAM' 'BIHAR' 'CHHATTISGARH' 'GOA'
'GUJARAT' 'HARYANA' 'HIMACHAL PRADESH' 'JAMMU & KASHMIR' 'JHARKHAND'
'KARNATAKA' 'KERALA' 'MADHYA PRADESH' 'MAHARASHTRA' 'MANIPUR' 'MEGHALAYA'
'MIZORAM' 'NAGALAND' 'ODISHA' 'PUNJAB' 'RAJASTHAN' 'SIKKIM' 'TAMIL NADU'
'TRIPURA' 'UTTAR PRADESH' 'UTTARAKHAND' 'WEST BENGAL' 'A & N ISLANDS'
'CHANDIGARH' 'D & N HAVELI' 'DAMAN & DIU' 'DELHI' 'LAKSHADWEEP'
'PUDUCHERRY' 'Andhra Pradesh' 'Arunachal Pradesh' 'Assam' 'Bihar'
'Chhattisgarh' 'Goa' 'Gujarat' 'Haryana' 'Himachal Pradesh'
'Jammu & Kashmir' 'Jharkhand' 'Karnataka' 'Kerala' 'Madhya Pradesh'
'Maharashtra' 'Manipur' 'Meghalaya' 'Mizoram' 'Nagaland' 'Odisha'
'Punjab' 'Rajasthan' 'Sikkim' 'Tamil Nadu' 'Tripura' 'Uttar Pradesh'
'Uttarakhand' 'West Bengal' 'A&N Islands' 'Chandigarh' 'D&N Haveli'
'Daman & Diu' 'Delhi UT' 'Lakshadweep' 'Puducherry' 'Telangana'
'A & N Islands']
# First, strip whitespace and convert every state name to uppercase so that mixed-case duplicates collapse
crimes_df['STATE/UT'] = crimes_df['STATE/UT'].str.strip().str.upper()
# Now use replace to merge the remaining spelling variants seen in the list above
crimes_df['STATE/UT'] = crimes_df['STATE/UT'].replace("A&N ISLANDS", "A & N ISLANDS")
crimes_df['STATE/UT'] = crimes_df['STATE/UT'].replace("D&N HAVELI", "D & N HAVELI")
crimes_df['STATE/UT'] = crimes_df['STATE/UT'].replace("DELHI UT", "DELHI")
crimes_df['STATE/UT'].unique()
array(['ANDHRA PRADESH', 'ARUNACHAL PRADESH', 'ASSAM', 'BIHAR',
'CHHATTISGARH', 'GOA', 'GUJARAT', 'HARYANA', 'HIMACHAL PRADESH',
'JAMMU & KASHMIR', 'JHARKHAND', 'KARNATAKA', 'KERALA',
'MADHYA PRADESH', 'MAHARASHTRA', 'MANIPUR', 'MEGHALAYA', 'MIZORAM',
'NAGALAND', 'ODISHA', 'PUNJAB', 'RAJASTHAN', 'SIKKIM',
'TAMIL NADU', 'TRIPURA', 'UTTAR PRADESH', 'UTTARAKHAND',
'WEST BENGAL', 'A & N ISLANDS', 'CHANDIGARH', 'D & N HAVELI',
'DAMAN & DIU', 'DELHI', 'LAKSHADWEEP', 'PUDUCHERRY', 'TELANGANA'],
dtype=object)
len(crimes_df['STATE/UT'].unique())
36
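The clean-up steps above can be tried on a small hand-made frame. This is only a minimal sketch: `toy_df` and the `aliases` mapping are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Toy data (hypothetical) mimicking the spelling variants seen in STATE/UT
toy_df = pd.DataFrame({
    'STATE/UT': ['Delhi UT', 'DELHI', ' A&N Islands', 'A & N ISLANDS', 'Goa', 'GOA'],
    'Rape': [1, 2, 3, 4, 5, 6],
})

# Step 1: trim whitespace and convert to uppercase
toy_df['STATE/UT'] = toy_df['STATE/UT'].str.strip().str.upper()

# Step 2: map the remaining spelling variants to one canonical name
aliases = {
    'A&N ISLANDS': 'A & N ISLANDS',
    'D&N HAVELI': 'D & N HAVELI',
    'DELHI UT': 'DELHI',
}
toy_df['STATE/UT'] = toy_df['STATE/UT'].replace(aliases)

unique_states = sorted(toy_df['STATE/UT'].unique())
print(unique_states)
```

Passing a dictionary to `replace` keeps all the name corrections in one place, which may be easier to extend if more variants turn up.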
victims_raped = crimes_df.Rape.sum()
victims_kidnapped_abducted = crimes_df.Kidnapping_Abduction.sum()
dowry_deaths = crimes_df.Dowry_Deaths.sum()
modesty_assault = crimes_df.Hurting_of_womens_modesty.sum()
insult_to_modesty = crimes_df.Insult_to_womens_modesty.sum()
domestic_violence = crimes_df.Domestic_Cruelty.sum()
girls_imported = crimes_df.Importation_of_Girls.sum()
# Add each category exactly once (Rape was previously counted twice and Kidnapping_Abduction omitted, so the total printed below needs a rerun)
total_population_of_victim_overall = victims_raped + victims_kidnapped_abducted + dowry_deaths + modesty_assault + insult_to_modesty + domestic_violence + girls_imported
total_population_of_victim_overall
5194570
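One way to guard against counting a category twice, or leaving one out, is to sum over an explicit list of column names. A minimal sketch on made-up numbers follows; the frame `sample_df` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical rows standing in for the crime columns
sample_df = pd.DataFrame({
    'Rape': [10, 20],
    'Kidnapping_Abduction': [5, 15],
    'Dowry_Deaths': [1, 2],
})

# Listing the columns once makes it easy to see what is included
crime_columns = ['Rape', 'Kidnapping_Abduction', 'Dowry_Deaths']

per_category = sample_df[crime_columns].sum()  # one total per category
grand_total = int(per_category.sum())          # total across all categories

print(per_category)
print(grand_total)
```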
Note - For simplicity, the charts below cover six categories and exclude the "Insult_to_womens_modesty" column.
fig, axes = plt.subplots(2, 3, figsize=(25, 12))
# Panel 1: rape and dowry death cases on the same axes
axes[0,0].plot(crimes_df.Year, crimes_df.Rape, 's-b')
axes[0,0].plot(crimes_df.Year, crimes_df.Dowry_Deaths, 'o--r')
axes[0,0].set_xlabel('Year')
axes[0,0].set_ylabel('Number of cases')
axes[0,0].legend(['Rape', 'Dowry deaths']);
axes[0,0].set_title('Rape and Dowry death cases in India in 2001-2014')
# Panels 2-6: one bar chart per category; labels are set on each Axes instead of through plt
axes[0,1].set_title("Chart of Kidnapping and Abduction cases in India in 2001-2014")
axes[0,1].bar(crimes_df.Year, crimes_df.Kidnapping_Abduction, color = 'violet');
axes[0,1].set_xlabel('Year') #X-axis
axes[0,1].set_ylabel('Cases of Kidnapping and Abduction in India') #Y-axis
axes[0,2].set_title("Chart of Dowry death cases in India in 2001-2014")
axes[0,2].bar(crimes_df.Year, crimes_df.Dowry_Deaths, color = 'navy');
axes[0,2].set_xlabel('Year') #X-axis
axes[0,2].set_ylabel('Cases of Dowry deaths in India') #Y-axis
axes[1,0].set_title("Chart of Assault on women's modesty cases in India in 2001-2014")
axes[1,0].bar(crimes_df.Year, crimes_df.Hurting_of_womens_modesty, color = 'cyan');
axes[1,0].set_xlabel('Year') #X-axis
axes[1,0].set_ylabel("Cases of Assault on women's modesty in India") #Y-axis
axes[1,1].set_title("Chart of Domestic Violence cases in India in 2001-2014")
axes[1,1].bar(crimes_df.Year, crimes_df.Domestic_Cruelty, color = 'orange');
axes[1,1].set_xlabel('Year') #X-axis
axes[1,1].set_ylabel('Cases of Domestic Violence in India') #Y-axis
axes[1,2].set_title("Chart of Importation of girls in India in 2001-2014")
axes[1,2].bar(crimes_df.Year, crimes_df.Importation_of_Girls, color = 'red');
axes[1,2].set_xlabel('Year') #X-axis
axes[1,2].set_ylabel('Cases of Importation of girls in India') #Y-axis
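The dataset appears to hold one row per district and year, so plotting a raw column against `Year` draws many values for the same year. A possible alternative, sketched here on made-up rows (`rows_df` is hypothetical), is to add up the cases per year first and plot the aggregated series.

```python
import pandas as pd

# Hypothetical district-level rows: two districts for each of two years
rows_df = pd.DataFrame({
    'Year': [2001, 2001, 2002, 2002],
    'Rape': [3, 4, 5, 6],
    'Dowry_Deaths': [1, 0, 2, 2],
})

# One row per year, with the cases of all districts added together
yearly_df = rows_df.groupby('Year')[['Rape', 'Dowry_Deaths']].sum()
print(yearly_df)

# yearly_df.index holds the years, so a panel could then be drawn with e.g.
# axes[0, 0].plot(yearly_df.index, yearly_df.Rape, 's-b')
```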
# Count the rows (records) available for each year
count_df = crimes_df.groupby('Year')[['STATE/UT']].count()
count_df
plt.figure(figsize=(10,5))
plt.title("Number of records reported per year")
sns.heatmap(count_df);
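Note that `count()` returns the number of rows per year, not the number of cases. If a state-by-year view of case totals is wanted, one option is a pivot table. The sketch below uses hypothetical data (`demo_df`) and a single crime column to show the difference.

```python
import pandas as pd

# Hypothetical records: state, year and number of cases
demo_df = pd.DataFrame({
    'STATE/UT': ['GOA', 'GOA', 'BIHAR', 'BIHAR', 'BIHAR'],
    'Year': [2001, 2002, 2001, 2001, 2002],
    'Rape': [1, 2, 10, 20, 30],
})

# Number of rows per year (what count() measures)
rows_per_year = demo_df.groupby('Year')['STATE/UT'].count()

# Total cases for every state and year, laid out as a grid
cases_by_state_year = demo_df.pivot_table(
    index='STATE/UT', columns='Year', values='Rape', aggfunc='sum'
)

print(rows_per_year)
print(cases_by_state_year)
# The grid could be passed to sns.heatmap(cases_by_state_year) for plotting
```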
# errors='ignore' keeps this cell safe to rerun after the columns have already been dropped
crimes_df = crimes_df.drop(columns=['Hurting_of_womens_modesty', 'Insult_to_womens_modesty'], errors='ignore')
max_rape_cases = crimes_df.sort_values('Rape', ascending = False).head(10)
max_rape_cases
max_dowry_death_cases = crimes_df.sort_values('Dowry_Deaths', ascending = False).head(10)
max_dowry_death_cases
max_domestic_violence_cases = crimes_df.sort_values('Domestic_Cruelty', ascending = False).head(10)
max_domestic_violence_cases
max_importation_case = crimes_df.sort_values('Importation_of_Girls', ascending = False).head(10)
max_importation_case
counts_df = crimes_df.groupby('STATE/UT')[['Rape', 'Kidnapping_Abduction', 'Dowry_Deaths','Domestic_Cruelty', 'Importation_of_Girls']].sum()
counts_df
counts_df.sort_values(by = 'Rape', ascending = False).head(5)
counts_df.sort_values(by = 'Kidnapping_Abduction', ascending = False).head(5)
counts_df.sort_values(by = 'Dowry_Deaths', ascending = False).head(5)
counts_df.sort_values(by = 'Domestic_Cruelty', ascending = False).head(5)
counts_df.sort_values(by = 'Importation_of_Girls', ascending = False).head(5)
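The `sort_values(...).head(n)` pattern used above can also be written with `nlargest`. A small sketch with hypothetical state totals (`totals_df` and its labels are made up):

```python
import pandas as pd

# Hypothetical per-state totals
totals_df = pd.DataFrame(
    {'Rape': [50, 10, 30], 'Dowry_Deaths': [5, 9, 1]},
    index=['STATE A', 'STATE B', 'STATE C'],
)

# Two states with the most rape cases, highest first
top_two = totals_df.nlargest(2, 'Rape')
print(top_two)
```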
max_importation_case = max_importation_case.merge(max_rape_cases)
max_importation_case
max_dowry_death_cases = max_dowry_death_cases.merge(max_rape_cases)
max_dowry_death_cases
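Called without `on`, `merge` joins on every column the two frames share and, by default, keeps only the rows present in both (an inner join). Because both top-10 tables have the same columns, the merged result contains just the records that rank in both lists. A sketch with hypothetical rows (`top_a`, `top_b`):

```python
import pandas as pd

# Two hypothetical "top" tables with identical columns
top_a = pd.DataFrame({'STATE/UT': ['X', 'Y'], 'Year': [2013, 2014], 'Rape': [100, 90]})
top_b = pd.DataFrame({'STATE/UT': ['Y', 'Z'], 'Year': [2014, 2014], 'Rape': [90, 80]})

# Inner join on all shared columns: only the row found in both tables survives
common = top_a.merge(top_b)
print(common)
```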
1) More than 5 million cases of gender-based violence against women were recorded between 2001 and 2014, ranging from rape to the importation of girls for business.
2) From the series of bar graphs we concluded that 2014 was the year with the highest number of reported crimes in each category.
3) We looked for the 10 highest case counts ever reported, along with the year and state of each. Madhya Pradesh had the highest number of rape cases (2014), Uttar Pradesh the highest number of dowry deaths (2014), West Bengal the highest number of domestic violence cases (2014), and Bihar the highest number of cases of importation of girls (2011).
4) We summarised the total number of cases in 2001-2014 for each state.
5) We also found the top 5 states with the highest total number of cases reported from 2001-2014 in each category.
6) We also merged the data in two ways: first, "Maximum number of rape cases" with "Maximum number of Importation cases", where the result was Madhya Pradesh; second, "Maximum number of rape cases" with "Maximum number of Deaths due to Dowry cases", where the result was Uttar Pradesh.
I want to do more work on the topic of women's safety in our society, and to run the same kind of analysis on a dataset that is not bound to any specific country but covers the whole world.