Exploratory Data Analysis on 2017 freeCodeCamp Survey

This project is the result of the knowledge acquired during the course Data Analysis with Python: Zero to Pandas offered by Jovian.ml in partnership with freeCodeCamp.

This project uses the open dataset 2017-new-coder-survey, which contains data collected from freeCodeCamp's 2017 survey of more than 20,000 developers. The main goal is to perform an initial exploratory data analysis and draw some insights from the collected data. Several Python libraries will be used for data manipulation, cleaning, and visualization.

Let's install the Python libraries that we will be using:

%%capture
! pip install numpy pandas matplotlib seaborn wordcloud jovian --upgrade

import jovian
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
import pandas as pd
import seaborn as sns
from wordcloud import WordCloud
import warnings

warnings.filterwarnings('ignore')

project_name='eda-freecodecamp-survey'
jovian.commit(project=project_name)

[jovian] Attempting to save notebook..
[jovian] Updating notebook "rocio-x-linares95/eda-freecodecamp-survey" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ml/rocio-x-linares95/eda-freecodecamp-survey

'https://jovian.ml/rocio-x-linares95/eda-freecodecamp-survey'

Data loading

The open dataset 2017-new-coder-survey is composed of two files:

2017-new-coder-survey-part-1.csv - the first half of the survey. 100% of respondents completed this section.
2017-new-coder-survey-part-2.csv - the first half of the survey, plus the second half - which about 95% of respondents also completed.

These files share one column: Network ID. A single dataset can therefore be built by joining the two files on this shared key.
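
As a minimal sketch of joining on a shared key, the example below merges two tiny in-memory frames on Network ID. The frames, their values, and the Age and Income columns are hypothetical stand-ins for the two CSV files, not real survey data. With the actual files, each frame would presumably be loaded with pd.read_csv first.

```python
import pandas as pd

# Hypothetical miniature stand-ins for part 1 and part 2 of the survey.
# The column values are made up; only the "Network ID" key comes from the dataset.
part1 = pd.DataFrame({"Network ID": ["a1", "b2", "c3"], "Age": [25, 31, 40]})
part2 = pd.DataFrame({"Network ID": ["a1", "c3"], "Income": [30000, 52000]})

# A left join keeps every respondent from part 1, even those with no row in part 2.
# indicator=True adds a "_merge" column showing where each row was found.
survey = part1.merge(part2, on="Network ID", how="left", indicator=True)

print(survey)
```

A left join is assumed here because every respondent completed part 1 while only some completed part 2. Respondents missing from part 2 end up with NaN in the part-2 columns. If both files contain columns with the same name, pandas appends the suffixes _x and _y by default, so the duplicates may need to be dropped or renamed after the merge.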