Split the available dataset into training and test sets
How to split a data set into training and test sets

We train the model on data that we call the training data, or training set. Each example in the training set already includes the actual value that the model should predict, so the algorithm adjusts the model's parameters to fit those examples.

But how do we know, after training, whether the model is good overall? For that we use test data, or a test set: separate data for which we also know the actual values, but which was never shown to the model during training. If the trained model also performs well on the test set, we can say that the machine learning model is good, because it generalizes beyond the data it was trained on.

If the model is never tested and is built only to perform well on the training data, its parameters will be just good enough to predict values for the examples that were in the training set. Such a model does not generalize. This is called overfitting. So we don’t end up making a useless model which is...
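The split described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `split_train_test`, the 80/20 ratio, the fixed seed, and the toy data are all assumptions made for the example.

```python
import random


def split_train_test(samples, test_fraction=0.2, seed=42):
    """Shuffle samples and split them into (train, test) lists.

    Hypothetical helper for illustration; the ratio and seed are assumptions.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for a reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled items are held out; the model never sees them in training.
    return shuffled[n_test:], shuffled[:n_test]


# Toy data (assumed for the example): (feature, actual value) pairs
data = [(x, 2 * x + 1) for x in range(10)]
train, test = split_train_test(data, test_fraction=0.2)
print(len(train), len(test))  # prints "8 2"
```

Shuffling before splitting keeps any ordering in the original data from leaking into the split. The model would then be fitted on `train` only and scored on `test`; a large gap between the two scores is the usual sign of overfitting.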