Showing posts with the label SKlearn

Decision Tree Classification in Machine Learning

What is Decision Tree Classification? A decision tree is a graphical representation of all the possible solutions to a decision based on certain conditions. It's called a decision tree because it starts with a single box (the root), which then branches off into a number of solutions, just like a tree. Let's see one example: is a person's salary more than $100K? If we make this decision using decision tree classification, it looks like the tree in the image. Let's say a person works at Google and has a master's degree in computer science; then, in this example, his or her salary is more than $100K. Decision Tree classification in Python: Import the library first. Read the data set. Now we create two data frames which contain our independent and dependent variables, because we have to pass these variables as x and y to our model. Now, as we know, an ML model works on numbers and cannot take categorical (text) data directly, but in our data frame the columns named company, job a...
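The steps described so far can be sketched roughly as follows. The excerpt does not show the post's CSV file, so the rows, the column names (`company`, `job`, `degree`, `salary_more_than_100k`) and the choice of `LabelEncoder` for the categorical columns are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; the real data set from the post is not shown.
rows = [
    ("google", "sales executive", "bachelors", 0),
    ("google", "sales executive", "masters", 0),
    ("google", "business manager", "bachelors", 1),
    ("google", "business manager", "masters", 1),
    ("google", "computer programmer", "bachelors", 0),
    ("google", "computer programmer", "masters", 1),
    ("abc pharma", "sales executive", "masters", 0),
    ("abc pharma", "computer programmer", "bachelors", 0),
    ("abc pharma", "business manager", "bachelors", 0),
    ("abc pharma", "business manager", "masters", 1),
    ("facebook", "sales executive", "bachelors", 1),
    ("facebook", "computer programmer", "masters", 1),
]
df = pd.DataFrame(rows, columns=["company", "job", "degree", "salary_more_than_100k"])

# Independent variables (x) and dependent variable (y).
inputs = df.drop("salary_more_than_100k", axis="columns")
target = df["salary_more_than_100k"]

# Turn each categorical text column into integer codes.
encoders = {}
for col in ["company", "job", "degree"]:
    encoders[col] = LabelEncoder()
    inputs[col] = encoders[col].fit_transform(inputs[col])

model = DecisionTreeClassifier(random_state=0)
model.fit(inputs, target)
train_accuracy = model.score(inputs, target)

# Ask the tree about a Google programmer with a master's degree.
query = pd.DataFrame({
    "company": encoders["company"].transform(["google"]),
    "job": encoders["job"].transform(["computer programmer"]),
    "degree": encoders["degree"].transform(["masters"]),
})
prediction = model.predict(query)[0]
```

Scoring on the training rows only shows that the tree has memorised them; it says nothing about how the model behaves on unseen data.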

What is Logistic Regression?

Implementation of Logistic Regression: Logistic regression is the appropriate regression analysis to conduct when the dependent variable is binary. Basically, logistic regression is built on the sigmoid (logistic) function. Now the question is: linear regression is already available, so why do we need logistic regression? Logistic regression vs linear regression: Let's understand why we need logistic regression. Consider the graph below and fit a line using linear regression. You can clearly see that a straight line is a very inappropriate fit for this data, and our ML model's accuracy looks bad. Now let's fit the curve using logistic regression. You can clearly see the difference and which method is more useful and accurate. Logistic regression in Python: Now let's do some coding. We will predict whether a person has insurance or not on the basis of his or her age. First we import the libraries. Let's read the data from the CSV file. Next we visualise the data using a scatter plot. Let's ...
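A minimal sketch of the age-versus-insurance example might look like this. The post's CSV file is not shown in the excerpt, so the ages and labels below are made up, and the `sigmoid` helper is included only to illustrate the S-shaped function that logistic regression relies on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def sigmoid(z):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical data: younger people without insurance, older people with it.
ages = np.array([18, 19, 21, 22, 23, 25, 26, 27, 28, 29,
                 45, 46, 47, 49, 50, 52, 54, 55, 56, 58, 60, 61, 62]).reshape(-1, 1)
has_insurance = np.array([0] * 10 + [1] * 13)

model = LogisticRegression(max_iter=1000)
model.fit(ages, has_insurance)

# predict expects a 2D array: one row per person, one column per feature.
young_prediction = model.predict([[20]])[0]
old_prediction = model.predict([[60]])[0]

# Probability of the positive class (has insurance) for a 60-year-old.
old_probability = model.predict_proba([[60]])[0][1]
```

Because the output of the sigmoid always lies between 0 and 1, it can be read as a probability, which is what makes it a better fit than a straight line for a binary target.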

Split available dataset into training and test

How to split a data set into training and test data sets: We train the model using data which we call the training data or training set. The training data already contains the actual values that the model should predict, so the algorithm adjusts the values of its parameters to account for the data in the training set. But how do we know, after training, whether the model is good overall? For that we have the test data (test set), which is different data for which we also know the values but which was never shown to the model before. If the trained model performs well on the test set as well, then we can say that the machine learning model is good. If the model is not tested and is built so that it only performs well on the training data, then its parameters will only be good enough to predict values for data that was in the training set. That does not generalise. This is called overfitting. So that we don’t end up making a useless model which is...
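As a small sketch of the idea, scikit-learn's `train_test_split` can hold back part of the data for testing. The arrays below are placeholder data, and the 80/20 split and `random_state=42` are arbitrary choices for this example, not values taken from the post.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 10 samples with 2 features each, and one target per sample.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Keep 20% of the rows aside as a test set the model never sees in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

n_train = len(X_train)
n_test = len(X_test)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the test set on every run.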

Linear Regression With Multiple Variables

Linear Regression with Multiple Variables: Here we predict the house price from its area, number of bedrooms and age. First we import the necessary libraries. Now we read the CSV file using the pandas library's read_csv method. Here you can see a NaN value in the bedroom column, so we have to fill something in there. We find the median of the bedroom column and replace the NaN value with it, using the fillna method of the pandas library. Now we build the linear regression model using the sklearn library. Here we pass the independent variables area, bedroom and age by using df.drop('price', axis='columns'), which means all columns except price, and then we pass the dependent variable, the price column, to the fit method. The fit method trains our linear regression model. Let's predict the price of a house. Here we predict the price of a house which has an area of 3000 sqft, 3 bedrooms and is 40 years old. You can verify your answer with a simple maths equation, y = m1*area + m2*bedroom + m3*age + intercept (m1, m2 and m3 come from coef_ and the intercept from intercept_), where y is dependent variable...
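The workflow above could be sketched like this. The data frame is a hypothetical stand-in for the post's CSV file, so the predicted value here will not match the one in the post; the point is only to show the median fill, the fit, and the manual check against `coef_` and `intercept_`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the CSV file; one bedroom value is missing.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000, 4100],
    "bedroom": [3, 4, None, 3, 5, 6],
    "age": [20, 15, 18, 30, 8, 8],
    "price": [550000, 565000, 610000, 595000, 760000, 810000],
})

# Replace the NaN in the bedroom column with the column median.
median_bedroom = df["bedroom"].median()
df["bedroom"] = df["bedroom"].fillna(median_bedroom)

# All columns except price are the independent variables.
features = df.drop("price", axis="columns")
model = LinearRegression()
model.fit(features, df["price"])

# Predict for a 3000 sqft house with 3 bedrooms that is 40 years old.
house = pd.DataFrame([[3000, 3, 40]], columns=features.columns)
predicted = model.predict(house)[0]

# Verify by hand: y = m1*area + m2*bedroom + m3*age + intercept.
manual = np.dot(model.coef_, [3000, 3, 40]) + model.intercept_
```

The two numbers `predicted` and `manual` should agree up to floating-point rounding, since `predict` evaluates exactly this linear equation.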

Linear Regression

Linear Regression Using SKlearn: We will predict housing prices here by creating a machine learning linear regression model. Let's start. First we import the necessary libraries. Now we import the data from a CSV file. You can import a CSV file in Python using the pandas library's read_csv method. Let's visualise the data using the matplotlib library. We have to supply the X and Y variables, meaning the independent and dependent variables, to the linear model. Here the independent variable is area, and we predict the price of a home from its area; the price depends on the area, so price is the dependent variable. You can drop a column in pandas using the drop method. Here x = new_df and y = df.price. Now we build a linear regression model. We have to pass our x and y to the fit method, which will train our model. Let's predict the price of a house which has an area of 5000 sqft. The predict method expects a 2D array, which is why we use double brackets here. price of 5000 sqft area house is 85955...
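A rough sketch of these steps, assuming a data frame with `area` and `price` columns, is shown below. The numbers are invented for illustration, so the prediction will differ from the value quoted in the post, and the plotting step is left out to keep the example short.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data standing in for the post's CSV file.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

# x is every column except price; y is the price column.
new_df = df.drop("price", axis="columns")
model = LinearRegression()
model.fit(new_df, df.price)

# predict needs 2D input (rows x features), hence the double brackets.
predicted = model.predict(pd.DataFrame([[5000]], columns=["area"]))[0]

# The same value computed by hand from the fitted line y = m*x + b.
manual = model.coef_[0] * 5000 + model.intercept_
```

Wrapping the query in a data frame with the same column name as the training data avoids scikit-learn's feature-name warning; a plain `[[5000]]` list also works when the model was fitted on a NumPy array.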

Recognizing Handwritten Digits with scikit-learn

Recognizing Handwritten Digits With scikit-learn: First we import the necessary libraries and load the data. We can view the digits using matshow. Now we split our data set using the sklearn library. Now we create our model and fit it on the training data. In machine learning, the radial basis function kernel, or RBF kernel, is a popular kernel function used in various kernelized learning algorithms. In particular, it is commonly used in support vector machine classification. Now we predict the digits. Finally, we check the score of our model, which tells us how accurate the model is.
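The whole pipeline can be sketched as below, assuming the data comes from scikit-learn's built-in `load_digits` and the model is an `SVC` with the RBF kernel. The 80/20 split and `random_state=0` are arbitrary choices for this example, and the `matshow` plotting step is omitted so the snippet runs without a display.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Each sample is an 8x8 grayscale image, flattened to 64 features in .data.
digits = load_digits()
image_shape = digits.images.shape

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Support vector classifier with the RBF kernel.
model = SVC(kernel="rbf")
model.fit(X_train, y_train)

# Predict the first few test digits and measure accuracy on the test set.
predicted = model.predict(X_test[:5])
accuracy = model.score(X_test, y_test)
```

Here `score` returns the fraction of test images classified correctly, so a value close to 1.0 means the model recognises nearly all unseen digits.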