How do I normalize the data from a csv file?
I did it this way:
def customize_dataset(dataframe_raw):
    dataframe = dataframe_raw.copy(deep=True)
    # drop some columns
    dataframe = dataframe.drop(['longitude', 'latitude'], axis=1)
    for col in ['housing_median_age', 'total_rooms', 'total_bedrooms', 'population',
                'households', 'median_income', 'median_house_value']:
        # min-max normalization: rescale each column to the range [0, 1]
        col_min = dataframe[col].min()  # pandas min/max skip NaN values
        col_max = dataframe[col].max()
        dataframe[col] = (dataframe[col] - col_min) / (col_max - col_min)
    return dataframe

dataframe = customize_dataset(dataframe_raw)
dataframe.head()
There are several other normalization methods; the logic is the same.
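For example, z-score standardization (subtract the mean, divide by the standard deviation) follows the same per-column pattern. This is only a rough sketch: the helper name standardize_columns and the small example frame are made up for illustration and are not part of the original dataset code.

```python
import pandas as pd

def standardize_columns(dataframe_raw, columns):
    # Hypothetical helper: z-score each listed column on a copy of the frame.
    dataframe = dataframe_raw.copy(deep=True)
    for col in columns:
        mean = dataframe[col].mean()
        std = dataframe[col].std()  # sample standard deviation (ddof=1)
        dataframe[col] = (dataframe[col] - mean) / std
    return dataframe

# Made-up values, only for illustration.
example = pd.DataFrame({
    "population": [100.0, 200.0, 300.0],
    "median_income": [1.0, 2.0, 6.0],
})
standardized = standardize_columns(example, ["population", "median_income"])
```

After this, each listed column has mean 0 and standard deviation 1, and the input frame is left unchanged.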
You can also write it in a more elegant way.
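One possible shorter version, as a sketch: pandas can apply the min-max formula to all selected columns at once, so the explicit loop is not needed. The helper name min_max_scale and the example values below are assumptions for illustration only.

```python
import pandas as pd

def min_max_scale(dataframe_raw, columns):
    # Hypothetical helper: scale the listed columns to [0, 1] in one step.
    dataframe = dataframe_raw.copy(deep=True)
    col_min = dataframe[columns].min()
    col_range = dataframe[columns].max() - col_min
    # Arithmetic between a DataFrame and a Series aligns on column labels.
    dataframe[columns] = (dataframe[columns] - col_min) / col_range
    return dataframe

# Made-up values, only for illustration.
example = pd.DataFrame({
    "total_rooms": [10.0, 20.0, 30.0],
    "households": [2.0, 4.0, 10.0],
})
scaled = min_max_scale(example, ["total_rooms", "households"])
```

Note that a column whose values are all identical has a range of zero, so it would come out as NaN; such columns need to be handled separately.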