Analysing suicide data of India

Suicides occur across India for many reasons, including mental stress, depression, dowry, and loan debts. To understand the causes behind such decisions, how they vary across age groups, which state has the highest suicide rate, and so on, I am going to analyse a data set that contains yearly suicide information for all the states and union territories of India. I got this data from Kaggle, but it was originally shared by the National Crime Records Bureau (NCRB), Govt. of India, under the Government Open Data License - India. It covers 12 years, from 2001 to 2012.

This is my last project for the course "Data Analysis with Python: Zero to Pandas". To analyse this data, I am going to use the techniques I learnt during the course, such as pandas, numpy, and seaborn, and explain each step so that it is easy to follow. It has been a really great learning journey for me ^-^. All thanks to Akash sir and his team :) as it wouldn't have been possible without their flexible deadlines, their understanding of students' problems, and their encouragement and motivation to complete the work/assignments with almost zero pressure.

Without further delay, let's get started with the project.

Downloading the Dataset

I have chosen the 'Suicides in India' data set from Kaggle for EDA. It contains 2,37,519 rows and 7 columns. Suicide cases in each state are classified according to parameters such as social status, profession of the individual, etc.

One can pick any data set of interest from Kaggle.

One can also download the data using the opendatasets Python library:

!pip install jovian opendatasets --upgrade --quiet

Let's begin by downloading the data, and listing the files within the dataset.
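As a minimal sketch of this step, the code below wraps the download in one small function and the file listing in another. `DATASET_URL` is only a placeholder (the exact Kaggle URL is not given here), and `download_dataset` and `list_files` are names made up for illustration. The only library call used is `opendatasets.download`, which asks for a Kaggle username and API key the first time it runs.

```python
import os

# Placeholder (assumption): replace with the URL of the Kaggle page
# of the data set you picked.
DATASET_URL = "https://www.kaggle.com/<owner>/<dataset-name>"


def download_dataset(url, data_dir="."):
    """Download a Kaggle data set into `data_dir` using opendatasets."""
    # Imported inside the function so that list_files() below still
    # works when opendatasets is not installed.
    import opendatasets as od
    od.download(url, data_dir=data_dir)


def list_files(folder):
    """Return the sorted names of the files directly inside `folder`."""
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )


# Typical use in a notebook (left commented out because it needs
# network access and Kaggle credentials):
# download_dataset(DATASET_URL)
# print(list_files("./<dataset-name>"))
```

opendatasets normally saves the files in a folder named after the last part of the data set URL, so that is the folder to pass to `list_files`.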