Dummy Variables & One Hot Encoding

Dummy Variables vs One Hot Encoding

Dummy variable:

You replace the categorical variable with several boolean variables (each taking the value 0 or 1) that encode whether or not the categorical variable took a certain value. To encode a categorical variable that can take k values, you only need k-1 dummy variables, because the remaining value is implied when all of the dummies are 0.
Often used in statistics, since k-1 columns use the “correct number of degrees of freedom” and avoid a redundant column when the model has an intercept.
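
As a minimal sketch of the k-1 idea, the snippet below encodes a made-up column with three values into two dummy columns. The town names are placeholders, not values from the data set used later in this post.

```python
import pandas as pd

# Hypothetical categorical column with k = 3 distinct values.
towns = pd.Series(["A", "B", "C", "A"], name="town")

# drop_first=True drops the first category ("A"), leaving k - 1 = 2 columns.
# A row of all zeros therefore means "A".
dummies = pd.get_dummies(towns, drop_first=True, dtype=int)
print(dummies)
```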

One-hot encoding:

You replace the categorical variable with a vector indicating “in which dimension” your variable lives. This vector has k dimensions, and exactly one of them is 1.
Often used in CS domains.
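
For comparison, here is a small sketch of one-hot encoding on the same kind of made-up column. It keeps all k columns, so every row contains exactly one 1. The values are again placeholders.

```python
import pandas as pd

# Hypothetical categorical column with k = 3 distinct values.
towns = pd.Series(["A", "B", "C", "A"], name="town")

# Without drop_first, get_dummies keeps one column per category (k columns).
one_hot = pd.get_dummies(towns, dtype=int)
print(one_hot)

# Each row has exactly one 1, marking the category of that row.
print(one_hot.sum(axis=1))
```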

Let's code

First of all, we import the pandas library.

Now we read the data from a CSV file using the read_csv method of the pandas library.
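
A sketch of these two steps is below. The original file name and its contents are not shown here, so the snippet reads a small in-memory CSV instead. The town column comes from this post, while the area and price columns and all of the values are assumptions made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file. Only the "town" column is known
# from the post; "area", "price" and every value are invented for this sketch.
csv_text = """town,area,price
Town A,2600,550000
Town A,3000,565000
Town B,2600,575000
Town C,2900,600000
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a real file you would pass its path to read_csv instead of the StringIO object.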

Here you can clearly see that our data set has a categorical column, the town column. Most machine learning models cannot work with categorical text values directly, so we have to handle this problem, and one solution is dummy variables.

You can use the get_dummies method of the pandas library, which converts the town column into columns of 0s and 1s.
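
A minimal sketch of that call follows, using a made-up data frame (the area and price columns and all values are assumptions). Note that recent pandas versions return True/False columns by default, so dtype=int is passed here to get 0 and 1.

```python
import pandas as pd

# Hypothetical data; only the "town" column name comes from the post.
df = pd.DataFrame({
    "town": ["Town A", "Town A", "Town B", "Town C"],
    "area": [2600, 3000, 2600, 2900],
    "price": [550000, 565000, 575000, 600000],
})

# columns=["town"] replaces the town column with one 0/1 column per town,
# named with the prefix "town_". The other columns are left unchanged.
encoded = pd.get_dummies(df, columns=["town"], dtype=int)
print(encoded)
```

Adding drop_first=True to the call would give the k-1 dummy variable form instead of the full one-hot form.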

So now we can use these dummy variables in our machine learning model.
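
The post does not say which model is used, so the following is only one possible sketch: an ordinary least-squares fit done with NumPy on made-up data. The area and price columns, the values, and the choice of model are all assumptions. It uses k-1 dummies because the model includes an intercept column.

```python
import numpy as np
import pandas as pd

# Hypothetical data, constructed so that price is exactly linear in area
# plus a per-town offset. Not taken from the post's data set.
df = pd.DataFrame({
    "town": ["Town A", "Town A", "Town B", "Town B", "Town C", "Town C"],
    "area": [2600, 3000, 2600, 3000, 2600, 3000],
    "price": [550000, 590000, 600000, 640000, 650000, 690000],
})

# k - 1 dummies: with an intercept present, a full set of k dummies would
# sum to the intercept column and make the features linearly dependent.
X = pd.get_dummies(df[["town", "area"]], columns=["town"], drop_first=True, dtype=float)
X.insert(0, "intercept", 1.0)
y = df["price"].to_numpy(dtype=float)

# Solve the least-squares problem X @ coef ~= y.
coef, *_ = np.linalg.lstsq(X.to_numpy(dtype=float), y, rcond=None)
pred = X.to_numpy(dtype=float) @ coef
print(dict(zip(X.columns, coef.round(2))))
```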

If you want to download the data set, click here.

You can get the code by downloading my GitHub repository.

