Ordinal Encoding using Scikit-learn
Goal¶
This post shows how to convert a categorical column into numeric codes with scikit-learn's OrdinalEncoder, so that the column can be used in further processing.
Library¶
In [1]:
import pandas as pd
import sklearn.preprocessing
Create categorical data¶
In [2]:
df = pd.DataFrame(data={'type': ['cat', 'dog', 'sheep'],
                        'weight': [10, 15, 50]})
df
Out[2]:
    type  weight
0    cat      10
1    dog      15
2  sheep      50
Ordinal Encoding¶
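Ordinal encoding replaces each category with a number. By default, OrdinalEncoder sorts the categories it finds in each column and assigns them the codes 0, 1, 2, and so on, returned as floats.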
In [3]:
# Instantiate the ordinal encoder
oe = sklearn.preprocessing.OrdinalEncoder()
# Learn the mapping from categories to numbers
# (fit expects 2D input, hence the list of column names)
oe.fit(df.loc[:, ['type']])
Out[3]:
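To see which number each category received, the fitted encoder can be inspected. The following is a minimal sketch, assuming the same `df` as above; the `mapping` dictionary and the custom order `['sheep', 'cat', 'dog']` are illustrative choices and not part of the original post.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'type': ['cat', 'dog', 'sheep'],
                   'weight': [10, 15, 50]})

oe = OrdinalEncoder()
oe.fit(df[['type']])

# categories_ holds one array per encoded column; a category's position
# in that array is the code it receives.
learned = oe.categories_[0]
mapping = {category: float(code) for code, category in enumerate(learned)}
print(mapping)

# Hypothetical custom order, purely for illustration: sheep < cat < dog
custom = OrdinalEncoder(categories=[['sheep', 'cat', 'dog']])
custom_codes = custom.fit_transform(df[['type']])
print(custom_codes.ravel())
```

Passing `categories` explicitly is useful when the categories have a meaningful order that differs from the default sorted order.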
In [4]:
# Apply the fitted encoder to new data
# (use the same column name as in fit to avoid a feature-name warning)
oe.transform(pd.DataFrame({'type': ['cat'] * 3 +
                                   ['dog'] * 2 +
                                   ['sheep'] * 5}))
Out[4]:
array([[0.],
       [0.],
       [0.],
       [1.],
       [1.],
       [2.],
       [2.],
       [2.],
       [2.],
       [2.]])
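New data may contain a category the encoder never saw during fit, which raises an error by default. The sketch below shows one possible way to handle this, assuming scikit-learn 0.24 or later; the sentinel `-1` and the unseen category `'horse'` are arbitrary choices made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

train = pd.DataFrame({'type': ['cat', 'dog', 'sheep']})

# handle_unknown='use_encoded_value' requires scikit-learn >= 0.24;
# unknown_value=-1 is an arbitrary sentinel chosen for this sketch.
oe = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)
oe.fit(train)

# 'horse' was not present when the encoder was fitted
new = pd.DataFrame({'type': ['dog', 'horse', 'cat']})
codes = oe.transform(new)
print(codes.ravel())

# inverse_transform maps codes back to labels; the sentinel becomes None
decoded = oe.inverse_transform(codes)
print(decoded.ravel())
```

Known categories keep their learned codes, while the unseen one is mapped to the sentinel and comes back as `None` when decoded.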