Exploratory Data Analysis Case Study - Stack Overflow Developer Survey

This notebook is a part of the Zero to Data Science Bootcamp by Jovian

Exploratory Data Analysis (EDA) is the process of exploring, investigating and gathering insights from data using statistical measures and visualizations. The objective of EDA is to develop an understanding of the data by uncovering trends, relationships and patterns.

EDA is both a science and an art. On the one hand, it requires knowledge of statistics, visualization techniques and data analysis tools like NumPy, Pandas and Seaborn. On the other hand, it requires asking interesting questions to guide the investigation and interpreting numbers and figures to generate useful insights.

The following topics are covered in this tutorial:

  • Downloading a dataset from an online source
  • Data preparation and cleaning with Pandas
  • Open-ended exploratory analysis and visualization
  • Asking and answering interesting questions
  • Summarizing inferences and drawing conclusions
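
As a rough preview of these steps, here is a minimal sketch using Pandas. The DataFrame below is made-up toy data (the column names `Country`, `Age` and `YearsCode` and all the values are assumptions for illustration, not the actual survey data), and the downloading step is skipped.

```python
import pandas as pd

# Hypothetical toy data standing in for a survey extract.
# Column names and values are invented for illustration only.
raw = pd.DataFrame({
    "Country": ["India", "Germany", "India", None, "Brazil"],
    "Age": [25, 34, 29, 41, None],
    "YearsCode": ["3", "10", "5", "20", "7"],
})

# Data preparation and cleaning: convert text to numbers,
# then drop rows that are missing a country or an age.
df = raw.copy()
df["YearsCode"] = pd.to_numeric(df["YearsCode"], errors="coerce")
df = df.dropna(subset=["Country", "Age"])

# Open-ended exploration: summary statistics for the numeric columns.
summary = df[["Age", "YearsCode"]].describe()

# Asking a question: which country has the most respondents?
top_countries = df["Country"].value_counts()

print(summary)
print(top_countries)
```

On real data, each of these steps usually takes several iterations, and the answers to early questions tend to suggest the next ones to ask.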

How to Run the Code

The best way to learn the material is to execute the code and experiment with it yourself. This tutorial is an executable Jupyter notebook. You can run this tutorial and experiment with the code examples in a couple of ways: using free online resources (recommended) or on your computer.

Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing the code is to click the Run button at the top of this page and select "Run on Binder". You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on Google Colab or Kaggle to use these platforms.

Option 2: Running on your computer locally

To run the code on your computer locally, you'll need to set up Python, download the notebook and install the required libraries. We recommend using the Conda distribution of Python. Click the Run button at the top of this page, select the "Run Locally" option, and follow the instructions.

Before we begin, let's install the required libraries and save a copy of this notebook to Jovian.

!pip install numpy pandas==1.1.5 wordcloud jovian opendatasets matplotlib seaborn plotly folium --upgrade --quiet
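
If you'd like to confirm that the installation worked, one option is to list the installed version of each library. This is only a sketch: the helper name `installed_versions` is hypothetical, and it assumes Python 3.8 or newer, where `importlib.metadata` is part of the standard library.

```python
from importlib import metadata

# Package names taken from the pip command above.
PACKAGES = ["numpy", "pandas", "wordcloud", "jovian", "opendatasets",
            "matplotlib", "seaborn", "plotly", "folium"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Print one line per package so that missing libraries are easy to spot.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Any library reported as not installed can be installed again with `pip` before continuing.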