Label Encoding
Label encoding is a data preprocessing technique that converts categorical data into numeric form by assigning each distinct category an integer, so that machine learning algorithms can process it.
For example:
| Color | Encoded |
|-------|---------|
| Red   | 0       |
| Green | 1       |
| Blue  | 2       |
So, "Red" → 0, "Green" → 1, "Blue" → 2
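The mapping above can be sketched in a few lines of plain Python. This is only an illustration, not how any particular library does it: the helper name `label_encode` is hypothetical, and it assumes codes are handed out in order of first appearance, which is what reproduces the Red, Green, Blue numbering in the table.

```python
# Hypothetical helper (not part of any library): give each distinct
# category an integer code, in order of first appearance.
def label_encode(values):
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)  # next unused integer
    return [mapping[value] for value in values], mapping

colors = ["Red", "Green", "Blue", "Green", "Red"]
codes, mapping = label_encode(colors)

print(codes)    # [0, 1, 2, 1, 0]
print(mapping)  # {'Red': 0, 'Green': 1, 'Blue': 2}
```

Note that scikit-learn's `LabelEncoder` assigns codes by the sorted order of the labels rather than by first appearance, so it would number these colors differently.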
Example:
from sklearn.preprocessing import LabelEncoder
import pandas as pd
df = pd.DataFrame({
'City': ['Dhaka', 'Rajshai', 'Khulna', 'Chattogram', 'Barishal', 'Cumilla']
})
df
|   | City       |
|---|------------|
| 0 | Dhaka      |
| 1 | Rajshai    |
| 2 | Khulna     |
| 3 | Chattogram |
| 4 | Barishal   |
| 5 | Cumilla    |
# Create a LabelEncoder object
encoder = LabelEncoder()
# Learn the distinct city names and replace each one with its integer code
df['City_enc'] = encoder.fit_transform(df['City'])
df
|   | City       | City_enc |
|---|------------|----------|
| 0 | Dhaka      | 3        |
| 1 | Rajshai    | 5        |
| 2 | Khulna     | 4        |
| 3 | Chattogram | 1        |
| 4 | Barishal   | 0        |
| 5 | Cumilla    | 2        |
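The codes in this table follow alphabetical order rather than row order, because `LabelEncoder` sorts the distinct labels and uses each label's position in that sorted list as its code. The sketch below reproduces that rule in plain Python, without scikit-learn, to show where the numbers come from. The variable names are illustrative.

```python
cities = ['Dhaka', 'Rajshai', 'Khulna', 'Chattogram', 'Barishal', 'Cumilla']

# Sort the distinct labels; a label's code is its index in this list.
classes = sorted(set(cities))
codes = [classes.index(city) for city in cities]

print(classes)  # ['Barishal', 'Chattogram', 'Cumilla', 'Dhaka', 'Khulna', 'Rajshai']
print(codes)    # [3, 5, 4, 1, 0, 2]

# Decoding is a lookup in the same sorted list.
decoded = [classes[code] for code in codes]
print(decoded == cities)  # True
```

With the fitted encoder itself, the sorted labels are available as `encoder.classes_`, and `encoder.inverse_transform(...)` converts codes back to the original labels.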
Statlearner