Transformation and Scaling of Numeric Data

Real-world datasets often contain columns measured in different units: for instance, one column may be in seconds and another in kilometers. As a result, some columns can span a much wider range of values than others.

When we feed columns with such different ranges to a machine learning (ML) model, the columns with larger ranges can end up having more influence on the model's output. For the model to treat every column fairly, it is important to transform or scale the data in each column to a similar range. This transformation or scaling of data is also known as feature scaling.
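To make the range problem concrete, here is a minimal sketch on made-up numbers (the toy array and the `euclidean` helper are assumptions for illustration, not part of the dataset used in this notebook). It compares distances between rows before and after a simple min-max rescaling of each column to the range 0 to 1.

```python
import numpy as np

# Hypothetical toy data: column 0 is a duration in seconds,
# column 1 is a distance in kilometers.
X = np.array([
    [3600.0, 1.0],
    [3700.0, 9.0],
    [7200.0, 1.5],
])


def euclidean(a, b):
    """Euclidean distance between two rows (illustrative helper)."""
    return float(np.sqrt(np.sum((a - b) ** 2)))


# Distances from row 0 on the raw data: the seconds column dominates,
# because its values differ by hundreds or thousands.
raw_d01 = euclidean(X[0], X[1])
raw_d02 = euclidean(X[0], X[2])

# Min-max scaling: subtract each column's minimum and divide by its range,
# so every column ends up between 0 and 1.
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_scaled = (X - col_min) / (col_max - col_min)

# The same distances after scaling: both columns now contribute
# on a comparable scale.
scaled_d01 = euclidean(X_scaled[0], X_scaled[1])
scaled_d02 = euclidean(X_scaled[0], X_scaled[2])

print("raw distances:   ", raw_d01, raw_d02)
print("scaled distances:", scaled_d01, scaled_d02)
```

On the raw data, row 2 looks far more distant from row 0 than row 1 does, purely because of the seconds column. After scaling, the large kilometer difference in row 1 counts about as much as the large seconds difference in row 2.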

In this notebook, we are going to look at some scaling techniques that can be used to scale numeric features.

All of these scalers can be imported from sklearn.preprocessing, the preprocessing module of scikit-learn. scikit-learn is an open-source Python library for machine learning and data science, and this module provides its tools for preparing data.
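As a rough sketch of what importing and using a scaler looks like, the snippet below applies two classes from sklearn.preprocessing to a small made-up array. `MinMaxScaler` and `StandardScaler` are picked here only as examples, and the toy values are assumptions rather than data from this notebook.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical toy data: seconds in column 0, kilometers in column 1.
X = np.array([
    [3600.0, 1.0],
    [5400.0, 5.0],
    [7200.0, 9.0],
])

# MinMaxScaler rescales each column to the range [0, 1] by default.
minmax = MinMaxScaler()
X_minmax = minmax.fit_transform(X)

# StandardScaler centers each column at mean 0 and scales it to unit variance.
standard = StandardScaler()
X_standard = standard.fit_transform(X)

print(X_minmax)
print(X_standard)
```

Both scalers follow the same pattern: `fit` learns the per-column statistics and `transform` applies them, with `fit_transform` doing both in one call.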

Downloading and reading the data

The first thing we'll do is download and read the data.

Before that, we'll need to install and import all the necessary libraries.
After that, we'll use the opendatasets Python library by Jovian to download the data from Kaggle, and the pd.read_csv function from the pandas library to read it.

Let's start by installing and importing the required libraries.