Decision Tree Classification in Machine Learning

 What is Decision Tree Classification?


A decision tree is a graphical representation of all the possible solutions to a decision based on certain conditions. It is called a decision tree because it starts with a single box (the root), which then branches off into a number of solutions, just like a tree.

Let's look at an example:



Is a person's salary greater than 100K $?
If we make this decision using decision tree classification, it looks like the tree above. For example, if a person works at Google and has a master's degree in computer science, the tree predicts that his or her salary is greater than 100K $.
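
One way to read such a tree is as a chain of nested questions. The sketch below hand-codes only the single path described in the example; the function name and the string values are hypothetical, and a real decision tree learns its questions from data instead of having them written by hand.

```python
def salary_above_100k(company, degree):
    """Hand-written stand-in for one path through a decision tree."""
    if company == "google":        # first question (root of this path)
        if degree == "masters":    # second question on the Google branch
            return True            # leaf: salary > 100K $
    # The other branches are not described in the text,
    # so this sketch leaves them undecided.
    return None


print(salary_above_100k("google", "masters"))
```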

Decision Tree Classification in Python:

First, import the libraries.
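
A minimal sketch of the imports this walkthrough relies on, assuming pandas and scikit-learn are installed:

```python
import pandas as pd                                # data frames and CSV reading
from sklearn.preprocessing import LabelEncoder     # text categories -> integers
from sklearn.tree import DecisionTreeClassifier    # the decision tree model
```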



Read the data set.
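
The reading step might look like the sketch below. The original data set is not reproduced here, so the rows are made up for illustration; the target column name salary_more_than_100k and the company name "acme" are assumptions. Only the company, job and degree columns come from this post.

```python
import io

import pandas as pd

# Stand-in for the CSV file: made-up rows, NOT the original data set.
csv_text = """company,job,degree,salary_more_than_100k
google,engineer,masters,1
google,business manager,bachelors,0
acme,engineer,masters,0
acme,business manager,bachelors,0
facebook,engineer,masters,1
facebook,business manager,bachelors,1
"""

# With a real file, pass its path to pd.read_csv instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
```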


Now we create two data frames: one holding the independent variables and one holding the dependent variable. We need them because we pass these to the model as x and y.
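
A sketch of that split, using a small made-up data frame in place of the real data. The names inputs and target and the column name salary_more_than_100k are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows for illustration only, not the original data set.
df = pd.DataFrame({
    "company": ["google", "google", "acme", "acme", "facebook", "facebook"],
    "job": ["engineer", "business manager"] * 3,
    "degree": ["masters", "bachelors"] * 3,
    "salary_more_than_100k": [1, 0, 0, 0, 1, 1],
})

# Independent variables (x): every column except the target.
inputs = df.drop("salary_more_than_100k", axis="columns")
# Dependent variable (y): the column we want to predict.
target = df["salary_more_than_100k"]

print(inputs.head())
print(target.head())
```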


 


Machine learning models work with numbers, not categorical text. In our data frame, the company, job and degree columns contain categorical data, so we have to convert them into numbers.

To convert categorical data into numbers, we will use sklearn's LabelEncoder class.
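
The encoding step could be sketched as follows, again on made-up rows. The new column names company_n, job_n and degree_n are assumptions. LabelEncoder assigns integers in sorted order of the distinct values, so here "acme" becomes 0, "facebook" 1 and "google" 2.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows for illustration only, not the original data set.
inputs = pd.DataFrame({
    "company": ["google", "google", "acme", "acme", "facebook", "facebook"],
    "job": ["engineer", "business manager"] * 3,
    "degree": ["masters", "bachelors"] * 3,
})

# One encoder per column, so each keeps its own text-to-integer mapping.
le_company = LabelEncoder()
le_job = LabelEncoder()
le_degree = LabelEncoder()

inputs["company_n"] = le_company.fit_transform(inputs["company"])
inputs["job_n"] = le_job.fit_transform(inputs["job"])
inputs["degree_n"] = le_degree.fit_transform(inputs["degree"])

print(inputs)
print(list(le_company.classes_))  # index in this list = encoded integer
```

scikit-learn documents LabelEncoder as a tool for encoding target labels; it works for a small walkthrough like this one, but OrdinalEncoder or OneHotEncoder are the classes intended for input features.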



You can see that the categorical data has now been converted into integers, so we can drop the original text columns, which are no longer needed.
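
Dropping the text columns might look like this. The data frame below is a made-up stand-in that already holds both the text columns and their encoded counterparts; the _n column names are assumptions.

```python
import pandas as pd

# Made-up rows: text columns next to their integer-encoded versions.
inputs = pd.DataFrame({
    "company": ["google", "acme", "facebook"],
    "job": ["engineer", "business manager", "business manager"],
    "degree": ["masters", "bachelors", "bachelors"],
    "company_n": [2, 0, 1],
    "job_n": [1, 0, 0],
    "degree_n": [1, 0, 0],
})

# Keep only the numeric columns; drop returns a new data frame.
inputs_n = inputs.drop(["company", "job", "degree"], axis="columns")
print(inputs_n)
```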


 
Now we build our decision tree classification model, which predicts whether a person's salary is greater than 100K $.
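
Putting the steps together, a minimal end-to-end sketch could look like the following. It trains on made-up rows, not the original data set, and the column names, the company "acme" and random_state=0 are assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Made-up rows for illustration only, not the original data set.
df = pd.DataFrame({
    "company": ["google", "google", "acme", "acme", "facebook", "facebook"],
    "job": ["engineer", "business manager"] * 3,
    "degree": ["masters", "bachelors"] * 3,
    "salary_more_than_100k": [1, 0, 0, 0, 1, 1],
})

inputs = df.drop("salary_more_than_100k", axis="columns")
target = df["salary_more_than_100k"]

# Encode each text column as integers, then keep only the encoded columns.
text_columns = ["company", "job", "degree"]
for col in text_columns:
    inputs[col + "_n"] = LabelEncoder().fit_transform(inputs[col])
inputs_n = inputs.drop(text_columns, axis="columns")

# Fit the tree; random_state fixes tie-breaking so runs are repeatable.
model = DecisionTreeClassifier(random_state=0)
model.fit(inputs_n, target)

# Encoded sample: company=1 (facebook), job=0 (business manager), degree=0 (bachelors).
sample = pd.DataFrame([[1, 0, 0]], columns=inputs_n.columns)
prediction = model.predict(sample)
print(prediction)
```

On this toy data the model returns 1 for the sample, because the same combination appears in the training rows with label 1. A model trained on the real data set may split differently.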



Here we pass [1, 0, 0] to the predict method, which means 1 = Facebook, 0 = business manager and the last 0 = bachelors. The model predicts 1, which means salary > 100K $.

If you want to download the data set, click here.
