Steps to reducing overfitting

fastai book

Good advice from the fastai community is to start by getting your model to overfit, and then to reduce the overfitting step by step. Ways to reduce overfitting (see tweet), in order from top to bottom:

[Image: list of ways to reduce overfitting, from the tweet]

One approach that can help with overfitting is weight decay, also called L2 regularization (see the fastai video and the summary "Regularization - Ridge (L2) Regression"). It means adding the sum of all the squared weights, scaled by a weight decay coefficient, to the loss function. Why do that? Because when we compute the gradients, this penalty adds a contribution that encourages the weights to be as small as possible. The weight decay can be set when using the One Cycle Policy (fastai) as shown here. Best practices are explained there as well.
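
To make the effect of the penalty concrete, here is a minimal sketch in plain Python, not fastai's actual implementation. It fits a tiny linear model by gradient descent, once without and once with weight decay. The function names (`loss_and_grads`, `train`), the toy data, and the values chosen for `wd`, `lr`, and `steps` are all assumptions made for illustration.

```python
# Minimal sketch of weight decay (L2 regularization) in plain Python.
# All names and hyperparameter values are illustrative assumptions.

def loss_and_grads(weights, xs, ys, wd):
    """Return (loss, grads) for a linear model with an L2 penalty.

    loss = mean squared error + wd * sum(w**2)
    """
    n = len(xs)
    preds = [sum(w * x for w, x in zip(weights, row)) for row in xs]
    errors = [p - y for p, y in zip(preds, ys)]
    mse = sum(e * e for e in errors) / n
    penalty = wd * sum(w * w for w in weights)

    # Gradient of the mean squared error with respect to each weight.
    grads = [
        2.0 / n * sum(e * row[j] for e, row in zip(errors, xs))
        for j in range(len(weights))
    ]
    # The penalty contributes 2 * wd * w to each gradient,
    # which pulls every weight toward zero at each step.
    grads = [g + 2.0 * wd * w for g, w in zip(grads, weights)]
    return mse + penalty, grads


def train(xs, ys, wd, lr=0.05, steps=2000):
    """Plain gradient descent starting from all-zero weights."""
    weights = [0.0] * len(xs[0])
    for _ in range(steps):
        _, grads = loss_and_grads(weights, xs, ys, wd)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Toy data generated by y = 1 * x1 + 2 * x2 (no noise).
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 1.0]]
ys = [5.0, 4.0, 9.0, 6.0]

w_plain = train(xs, ys, wd=0.0)  # no weight decay
w_decay = train(xs, ys, wd=0.1)  # with weight decay

print("without weight decay:", w_plain)
print("with weight decay:   ", w_decay)
```

Without the penalty, the weights converge to the values that generated the data. With it, the sum of squared weights ends up smaller. In fastai itself you would not write this loop by hand: `fit_one_cycle` accepts a `wd` argument, and the optimizer applies the decay for you.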