The problem

- Plot a loss measure against some measure of the model's learning effort (for example, epochs for neural networks or boosting rounds for gradient boosting).
- The same thing happens even if you change the hyperparameters, preprocess the data, or switch to a different model altogether (for example, when the model is too complex for the problem).
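
As a rough illustration of the first point, the sketch below records training and validation loss once per epoch so the two curves can be compared against learning effort. Everything in it is an assumption made for illustration: the synthetic linear data, the helper names `make_data`, `mse`, and `loss_curve`, and the learning rate are hypothetical and not taken from any particular library or from the setup discussed here. It uses only the standard library and prints the values; in practice you would pass the recorded history to whatever plotting tool you prefer.

```python
import random


def make_data(n, seed):
    """Hypothetical synthetic data: y = 2x + 0.5 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 0.5 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def mse(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loss_curve(epochs=50, lr=0.1):
    """Fit a line by full-batch gradient descent and record the loss per epoch.

    Returns a list of (epoch, train_loss, val_loss) tuples, where the epoch
    count plays the role of the "learning effort" axis.
    """
    train_x, train_y = make_data(80, seed=0)
    val_x, val_y = make_data(20, seed=1)
    w, b = 0.0, 0.0
    n = len(train_x)
    history = []
    for epoch in range(1, epochs + 1):
        # Gradients of the training MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(train_x, train_y)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(train_x, train_y)) / n
        w -= lr * grad_w
        b -= lr * grad_b
        history.append((epoch, mse(w, b, train_x, train_y), mse(w, b, val_x, val_y)))
    return history


history = loss_curve()
# Print every tenth recorded point; these are the values you would plot.
for epoch, train_loss, val_loss in history[::10]:
    print(f"epoch {epoch:3d}  train loss {train_loss:.4f}  validation loss {val_loss:.4f}")
```

Keeping the history as plain `(epoch, train_loss, val_loss)` tuples makes it easy to rerun the same recording after changing hyperparameters, data processing, or the model, and then compare the resulting curves side by side.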