- Last session, we talked about two sets: the training set and the test set
- The training set is used in the creation of the model
- We can run the model on the training set to determine the correct weights and biases
- We then use a separate test set to see how the model performs on new examples
- The central challenge in machine learning is to make our model perform well on previously unseen inputs
- Typically, when training a machine learning model, we compute the "loss" on the training set and try to minimise it using Gradient Descent
- On its own, this is simply Optimisation; in Machine Learning, we are also interested in minimising the test error
- When we use an ML model, we do not set the parameters ahead of time; we first train it on the training set, then fix the parameters and evaluate it on the test set
- Under this process, the expected test error is greater than or equal to the expected training error; the sketch below illustrates the whole loop
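A minimal sketch of this train-then-evaluate loop, assuming a simple linear model with squared-error loss (the data, learning rate, and step count are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up 1-D regression data, drawn separately for the training and test sets
def make_data(n):
    x = rng.uniform(-1, 1, n)
    y = 3.0 * x + 0.5 + rng.normal(0, 0.1, n)
    return x, y

x_train, y_train = make_data(100)
x_test, y_test = make_data(100)

# Model: y_hat = w*x + b, fit by minimising the squared-error training loss
w, b, lr = 0.0, 0.0, 0.1          # lr is an assumed learning rate

for step in range(500):
    error = w * x_train + b - y_train
    w -= lr * 2 * np.mean(error * x_train)   # Gradient Descent update for w
    b -= lr * 2 * np.mean(error)             # Gradient Descent update for b

train_mse = np.mean((w * x_train + b - y_train) ** 2)
test_mse = np.mean((w * x_test + b - y_test) ** 2)
print(f"train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")  # test MSE is typically >= train MSE
```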
- So, there are mainly two factors which determine how well an ML model will perform
- First, the training error should be small
- Second, the gap between the training and test error should be small
- These two factors correspond to the two biggest challenges in ML: Underfitting and Overfitting
- Underfitting occurs when the model is not able to obtain a sufficiently low error value on the training set
- Overfitting occurs when the gap between the training and test error is too large; the sketch below shows both failure modes
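A small illustration of both failure modes, fitting polynomials of increasing degree to made-up noisy data (the target function, noise level, and degrees are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)   # noisy sine, made up
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

for degree in (1, 4, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit a polynomial of the given degree
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# degree 1 underfits (high training error); degree 9 overfits
# (low training error, large train/test gap); degree 4 sits in between
```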
- To alleviate these problems, particularly overfitting, we implement Regularisation
- Last lecture, we talked about features, and how each feature contributes differently to the loss function
- Sometimes, we have to express a preference for one kind of feature over another
- For example, we might express a preference for linear features over non-linear features
- There are many ways of expressing such preferences; together, they are called "Regularisation"
- There are many types of Regularisation, such as L2 regularisation and Dropout regularisation; L2 is sketched just below, Dropout a little further down
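A minimal sketch of L2 regularisation, assuming the same squared-error setup as earlier: we add a penalty proportional to the squared weights to the loss, which appears as a weight-decay term in the gradient (the penalty strength lam is an assumed value):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))             # made-up features
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(0, 0.1, 100)

w = np.zeros(5)
lr, lam = 0.1, 0.01                       # learning rate and L2 strength (assumed)

for step in range(1000):
    error = X @ w - y
    # Gradient of the L2-regularised loss: MSE + lam * ||w||^2
    grad = 2 * X.T @ error / len(y) + 2 * lam * w
    w -= lr * grad                        # the 2*lam*w term shrinks the weights toward zero

print("weights:", np.round(w, 3))
```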
- In dropout regularisation, we zero out random nodes on every training iteration
- The intuition behind dropout regularisation is that the network can't rely on any one feature, so it has to spread out the weights; the sketch below shows one common form
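A minimal sketch of "inverted" dropout applied to a layer's activations (the keep probability and the activation values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def dropout(activations, keep_prob=0.8, training=True):
    """Inverted dropout: zero each unit with probability 1 - keep_prob,
    then rescale so the expected activation is unchanged."""
    if not training:
        return activations                        # no dropout at test time
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

a = rng.normal(size=(4, 8))                       # made-up layer activations
print(dropout(a))                                 # roughly 20% of entries are zeroed
```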
- So far, we have talked about the test set and the training set
- However, there is one more set, called the Validation set
- Earlier, we said the test set is used to test how well the model generalises to new examples
- But it is also important that we don't use the test set to fine-tune our Hyperparameters
- Therefore, we need a new set: the validation set. It is always constructed from the training set, by splitting it into two
- We use the validation set to estimate the generalisation error and update our hyperparameters accordingly; a sketch of this workflow follows
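A minimal sketch of that workflow, splitting the original training data into train and validation subsets and using the validation error to choose a hyperparameter (here, an assumed L2 strength, fit in closed form for brevity):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))             # made-up data, as before
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(0, 0.1, 200)

# Split the original training set into train and validation subsets
idx = rng.permutation(len(y))
train_idx, val_idx = idx[:160], idx[160:]
X_tr, y_tr, X_val, y_val = X[train_idx], y[train_idx], X[val_idx], y[val_idx]

def fit_ridge(X, y, lam):
    # Closed-form L2-regularised least squares (one convenient way to fit)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

best_lam, best_err = None, np.inf
for lam in (0.0, 0.01, 0.1, 1.0, 10.0):            # candidate hyperparameter values
    w = fit_ridge(X_tr, y_tr, lam)
    val_mse = np.mean((X_val @ w - y_val) ** 2)    # validation error estimates generalisation
    if val_mse < best_err:
        best_lam, best_err = lam, val_mse

print(f"chosen lam={best_lam}, validation MSE={best_err:.4f}")
# Only after choosing lam this way would we touch the test set
```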