
About the Dataset

The dataset covers 99 standard metropolitan areas in the US and provides 10 variables for each area for the period 1976-1977. The areas are divided into 4 geographic regions: 1=North-East, 2=North-Central, 3=South, 4=West.
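
Because the regions are stored as numeric codes, it can help to translate them into readable labels during analysis. A minimal sketch of that mapping follows; the column name 'region' and the sample rows are assumptions made for illustration and are not taken from the actual file.

```python
import pandas as pd

# Region codes as described above
REGION_NAMES = {1: "North-East", 2: "North-Central", 3: "South", 4: "West"}

# Hypothetical sample rows; the column name 'region' is an assumption
sample = pd.DataFrame({"region": [1, 3, 4, 2, 3]})

# Translate each numeric code into its region label
sample["region_name"] = sample["region"].map(REGION_NAMES)

print(sample)
```

Keeping the original numeric column alongside the label column means the codes remain available for any step that expects them.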

Link to the Dataset: https://bit.ly/SMA_Dataset

We first import the packages using their standard aliases: pd for pandas, np for numpy, plt for matplotlib.pyplot, and sns for seaborn.

%matplotlib inline

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
import jovian

We use the pandas read_csv function to read the dataset's CSV file. Inside the parentheses, we can specify either the path of the file or a URL that points to it.

The file is uploaded on GitHub, so its link can be passed to read_csv directly. The cell below instead reads a local copy, so it specifies the full path of the file. The r prefix makes the path a raw string, so the backslashes in the Windows path are not treated as escape characters.

I am storing the dataset in the variable 'data'.

data = pd.read_csv(r"C:\Users\STU\Desktop\Data Analysis\Standard Metropolitan Areas Data - train_data - data.csv")
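
Besides a path or a URL, read_csv also accepts a file-like object, which makes it easy to try the call without the real file. The sketch below uses a small in-memory CSV as a stand-in; its column names and values are made up for illustration and do not come from the dataset.

```python
import io
import pandas as pd

# Made-up stand-in for the real CSV; column names and values are assumptions
csv_text = "region,value\n1,10.5\n3,7.25\n4,3.0\n"

# read_csv takes a path, a URL, or a file-like object such as StringIO
demo = pd.read_csv(io.StringIO(csv_text))

print(demo.shape)   # (number of rows, number of columns)
print(demo.head())  # first rows, to confirm the file parsed as expected
```

Running the same two checks, shape and head(), on 'data' right after loading is a quick way to confirm that the file was found and parsed into the expected rows and columns.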