Python Machine Learning Label Encoding

In classification we deal with a lot of labels, and these labels can be words, numbers or something else. Sklearn expects numbers, so if the labels are already numeric there is no problem and we can use them directly to start training. But this is not usually the case. In the real world, labels are usually words, because words are human readable, and we label our training data with words so that the mapping can be tracked. To convert word labels into numbers, we need to use a label encoder. Label encoding is the process of transforming word labels into numerical form so that the algorithms can operate on our data.

 

 


 

 

 

This is the code for label encoding in Python.
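A minimal sketch of such code, assuming a hypothetical list of color labels (the names input_labels and encoder are illustrative and not taken from the original listing), might look like this:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample labels, chosen only for illustration
input_labels = ["red", "black", "red", "green", "black", "yellow", "white"]

# Create the encoder and learn the label-to-integer mapping
encoder = LabelEncoder()
encoder.fit(input_labels)

# classes_ holds the unique labels in sorted order;
# the position of each label is the integer assigned to it
print("Label mapping:")
for index, label in enumerate(encoder.classes_):
    print(label, "-->", index)
```

With these labels the encoder sorts the unique values alphabetically, so black maps to 0, green to 1, red to 2, white to 3 and yellow to 4.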

 

 

 

These lines of code define our sample data.

 

 

And this is the mapping between words and numbers.

 

 

 

This is the result

 

 

 

Python Machine Learning Label Encoding The Data

 

 

 

 

Let’s encode a set of randomly ordered labels to see how it performs.

 

 

 

Add these lines of code to the code above.
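As a hedged sketch of this step, the snippet below fits an encoder on hypothetical color labels and then calls transform on a few labels in a different order. The names test_labels and encoded_values are assumptions made for this example.

```python
from sklearn.preprocessing import LabelEncoder

# Fit the encoder on hypothetical sample labels
encoder = LabelEncoder()
encoder.fit(["red", "black", "red", "green", "black", "yellow", "white"])

# Encode a few labels given in arbitrary order
test_labels = ["green", "red", "black"]
encoded_values = encoder.transform(test_labels)

print("Labels =", test_labels)
print("Encoded values =", list(encoded_values))  # [1, 2, 0]
```

Note that transform raises a ValueError for any label the encoder did not see during fit.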

 

 

 

This is the result

Machine Learning Label Encoding

 

 

Now let's decode a random set of numbers back into labels.

 

 

 

 

Add these lines of code.
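Decoding can be sketched with the encoder's inverse_transform method. The example below again assumes hypothetical color labels, and the names encoded_values and decoded_labels are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

# Fit the encoder on hypothetical sample labels
encoder = LabelEncoder()
encoder.fit(["red", "black", "red", "green", "black", "yellow", "white"])

# Map integers back to the word labels they stand for
encoded_values = [3, 0, 4, 1]
decoded_labels = encoder.inverse_transform(encoded_values)

print("Encoded values =", encoded_values)
print("Decoded labels =", list(decoded_labels))  # white, black, yellow, green
```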

 

 

 

 

This is the result

Python Machine Learning Decoded Values

 

 

 

Complete source code for this article 

 

 

 

FAQs:

 

What is label encoding in Python?

Label encoding is a technique used to convert categorical data into numerical data in Python. It assigns a unique integer to each category in the dataset. For example, given the categories “red,” “green,” and “blue,” scikit-learn's LabelEncoder sorts them alphabetically and assigns 0 to “blue,” 1 to “green,” and 2 to “red.”

 

 

 

How to label data for machine learning in Python?

To encode labels for machine learning in Python, you can use the LabelEncoder class from the sklearn.preprocessing module. You fit the encoder on your categorical labels and then call transform to convert them into integers. The fit_transform method does both in one step.

 

 

When to use LabelEncoder and OneHotEncoder?

  • LabelEncoder: Use LabelEncoder to encode target labels (the y values of a classification problem). It assigns a unique integer to each category in sorted order, so it does not preserve a meaningful order on its own: “low,” “medium,” “high” would become low = 1, medium = 2, high = 0. For ordinal input features where the order matters, scikit-learn provides OrdinalEncoder, which accepts an explicit category order.
  • OneHotEncoder: Use OneHotEncoder when you have nominal categorical features (categories with no inherent order) or when you want to avoid imposing any ordinal relationship between categories. It creates one binary column per category, indicating whether that category is present in the original feature.

 

LabelEncoder Example
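A short sketch of LabelEncoder on the color categories mentioned in the FAQ above. The variable names are assumptions made for this example.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical target labels
colors = ["red", "green", "blue", "green", "red"]

# fit_transform learns the mapping and encodes in one step
encoder = LabelEncoder()
encoded = encoder.fit_transform(colors)

print(list(encoder.classes_))  # ['blue', 'green', 'red']
print(list(encoded))           # [2, 1, 0, 1, 2]
```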

 

 

OneHotEncoder Example
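A comparable sketch for OneHotEncoder, assuming a single hypothetical color feature. OneHotEncoder expects 2D input (one row per sample, one column per feature), and by default it returns a sparse matrix, which is converted to a dense array here for printing.

```python
from sklearn.preprocessing import OneHotEncoder

# One feature column with four samples (hypothetical data)
colors = [["red"], ["green"], ["blue"], ["green"]]

encoder = OneHotEncoder()
one_hot = encoder.fit_transform(colors).toarray()

# Columns follow the sorted categories: blue, green, red
print(list(encoder.categories_[0]))
print(one_hot)
```

Each row contains a single 1 in the column of its category, so no ordering between the colors is implied.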

 
