
Categorical Encoding

This kernel covers some commonly used categorical encoding techniques:

1. One-Hot Encoding
2. Label Encoding
3. Ordinal Encoding
4. Binary Encoding
5. Frequency Encoding
6. Mean Encoding
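As a quick orientation before the real data is loaded, here is a minimal sketch of four of these techniques using plain pandas on a tiny made-up frame. The frame `toy` and its columns `city` and `target` are hypothetical and exist only for illustration; they are not part of the competition dataset.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the encodings
toy = pd.DataFrame({
    "city":   ["Paris", "Tokyo", "Paris", "Lima", "Paris", "Tokyo"],
    "target": [1, 0, 0, 1, 1, 0],
})

# One-hot encoding: one indicator column per category
one_hot = pd.get_dummies(toy["city"], prefix="city")

# Label encoding: each category becomes an integer code
# (pandas assigns codes in sorted category order: Lima=0, Paris=1, Tokyo=2)
label_codes = toy["city"].astype("category").cat.codes

# Frequency encoding: each category is replaced by its relative frequency
freq_enc = toy["city"].map(toy["city"].value_counts(normalize=True))

# Mean (target) encoding: each category is replaced by the mean target value
mean_enc = toy["city"].map(toy.groupby("city")["target"].mean())

print(one_hot.columns.tolist())
print(label_codes.tolist())
print(freq_enc.round(3).tolist())
print(mean_enc.round(3).tolist())
```

Note that mean encoding computed on the full training data, as in this sketch, uses the target of each row to encode that same row, so in practice it is usually fitted on training folds only.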

# importing libraries
import numpy as np 
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
import category_encoders as ce
%matplotlib inline
# loading dataset 
df = pd.read_csv("../input/widsdatathon2020/training_v2.csv")
# printing the categorical (non-numeric) variables that take more than one distinct value
print([c for c in df.columns if df[c].nunique() > 1 and not pd.api.types.is_numeric_dtype(df[c])])
['ethnicity', 'gender', 'hospital_admit_source', 'icu_admit_source', 'icu_stay_type', 'icu_type', 'apache_3j_bodysystem', 'apache_2_bodysystem']
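# columns used in the encoding examples: the non-numeric columns printed above, plus hospital_id, hospital_death and age
categorical_cols = ['hospital_id', 'ethnicity', 'gender', 'hospital_admit_source', 'icu_admit_source', 'icu_stay_type', 'icu_type', 'apache_3j_bodysystem', 'apache_2_bodysystem', 'hospital_death', 'age']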
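The dtype-based column selection above can also be sketched with `DataFrame.select_dtypes`. The frame below is a hypothetical stand-in for `df`, with made-up values, so that the snippet runs without the competition file.

```python
import pandas as pd

# Hypothetical stand-in for the training data (values are made up)
toy = pd.DataFrame({
    "gender": ["M", "F", "F"],
    "age":    [64.0, 51.0, 70.0],
    "site":   ["A", "A", "A"],   # constant column: only one distinct value
    "bmi":    [22.5, 30.1, 27.8],
})

# Keep non-numeric columns that take more than one distinct value
non_numeric = toy.select_dtypes(exclude="number").columns
candidates = [c for c in non_numeric if toy[c].nunique() > 1]
print(candidates)
```

Here `site` is dropped because it is constant, and `age` and `bmi` are dropped because they are numeric. Numeric columns that should still be treated as categorical, such as an ID column, have to be added by hand.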