Batch Size
Gradient Descent
- As explained here start with small learning rate and keep increasing it.
- Practical but simple example of applying the learning rate → here
- Find best learning rate: One Cycle Policy (fastai)
fastai book


Find the best learning rate
One Cycle Policy (fastai)
- Use discriminative learning rates, as explained here.
To find the best learning rate according to fastai and Jeremy Howard, you should use the learning rate finder, which is a key feature in the fastai library. Here's how it works:
- The learning rate finder runs your model for a few iterations, starting with a very small learning rate and gradually increasing it.
- It plots the loss against the learning rate on a log scale.
- You should select a learning rate that's somewhere in the middle of the sharpest downward slope on this plot, typically one magnitude lower than the point where the loss starts to increase again.