
Exploratory Data Analysis: Data Professionals' Salaries in 2022

  • This project is an exploratory data analysis (EDA) of the salaries of professionals working in India's data industry in 2022.
  • The dataset comes from Kaggle. Its author compiled it by scraping Glassdoor, a website that lets users anonymously access and publish information about companies, jobs, salaries and more.
  • The dataset contains information about the companies, salaries, locations and job titles of the professionals.
  • The goal of this project is to analyze the data, extract information, visualize it with charts and graphs for easier understanding, and finally gather insights and draw inferences.
  • The analysis uses Python and libraries such as Pandas, NumPy, Matplotlib, Seaborn and more.

Importing the necessary libraries

!pip install opendatasets --upgrade --quiet
!pip install jovian --upgrade -q

import os
import jovian
import matplotlib
import numpy as np
import pandas as pd
import seaborn as sns
import opendatasets as od
import matplotlib.pyplot as plt
from wordcloud import WordCloud
# Setting parameters and styles for graphs

%matplotlib inline

sns.set_style('whitegrid')
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'

Downloading the Dataset

  • The first step before starting the analysis is to retrieve the dataset. Here it is downloaded using the opendatasets library.
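
As a rough illustration of this step, the sketch below wraps the opendatasets download call and then lists the CSV files it produced. The helper names `download_dataset` and `list_csv_files`, the `data` directory, and the URL are hypothetical placeholders, not taken from this project; the real Kaggle dataset URL has to be substituted in. Note that opendatasets asks for a Kaggle username and API key when it downloads from Kaggle.

```python
import os


def download_dataset(dataset_url, data_dir="."):
    """Download a dataset with opendatasets into data_dir (hypothetical helper)."""
    # Imported here so list_csv_files below can be used without the library installed.
    import opendatasets as od

    # For Kaggle URLs this prompts for a Kaggle username and API key
    # unless a kaggle.json file is present in the working directory.
    od.download(dataset_url, data_dir=data_dir)


def list_csv_files(data_dir):
    """Return the sorted paths of all CSV files found under data_dir."""
    csv_paths = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith(".csv"):
                csv_paths.append(os.path.join(root, name))
    return sorted(csv_paths)


# Placeholder only: replace with the dataset's actual Kaggle URL.
DATASET_URL = "https://www.kaggle.com/datasets/<owner>/<dataset-name>"

# Uncomment to run the download inside the notebook:
# download_dataset(DATASET_URL, data_dir="data")
# print(list_csv_files("data"))
```

Listing the files after the download is a convenient way to confirm which CSV to pass to `pd.read_csv`, since the download is saved in a subfolder named after the dataset.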