
Analysing suicide data of India

Suicides occur across India for many reasons, including mental stress, depression, dowry, and loan debts. To understand the causes behind such decisions, how the picture differs across age groups, which state has the highest suicide rate, and so on, I am going to analyse a data set that contains yearly suicide information for all the states and union territories of India. I got this data from Kaggle, but it was originally shared by the National Crime Records Bureau (NCRB), Govt. of India, under the Government Open Data License - India. It covers 12 years, from 2001 to 2012.

This is my final project for the course "Data Analysis with Python: Zero to Pandas". To analyse the data, I am going to use the techniques I learnt during the course, such as pandas, numpy, and seaborn, and explain each step so that it is easy to follow. It's been a really great learning journey for me ^-^. All thanks to Akash sir and his team :) as it wouldn't have been possible without their flexible deadlines, their understanding of students' problems, and their encouragement and motivation to complete the work/assignments with almost zero pressure.

Without further delay, let's get started with the project.

Downloading the Dataset

I have chosen the 'Suicides in India' data set from Kaggle for EDA. It contains 2,37,519 rows and 7 columns. Suicide cases in each state are classified according to parameters such as social status, profession of the individual, etc.

!pip install jovian opendatasets --upgrade --quiet

Let's begin by downloading the data, and listing the files within the dataset.
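As a minimal sketch of the listing step, the helper below returns the file names inside a downloaded dataset folder. The function name `list_dataset_files` and the folder name `./suicides-in-india` are my own assumptions for illustration, not something fixed by the dataset. The download itself needs network access and Kaggle credentials, so it appears only as a comment with a placeholder URL.

```python
import os


def list_dataset_files(data_dir):
    """Return the sorted names of regular files directly inside data_dir."""
    return sorted(
        name
        for name in os.listdir(data_dir)
        if os.path.isfile(os.path.join(data_dir, name))
    )


# Downloading requires network access and Kaggle credentials, so it is
# left commented out here. The URL below is a placeholder, not the real one.
# import opendatasets as od
# od.download("<kaggle-dataset-url>")

if __name__ == "__main__":
    data_dir = "./suicides-in-india"  # hypothetical folder name (assumption)
    if os.path.isdir(data_dir):
        print(list_dataset_files(data_dir))
    else:
        print(f"{data_dir} not found; download the dataset first")
```

Sorting the names keeps the output stable between runs, and skipping sub-folders means only the data files themselves are listed.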