Overfitting
https://twitter.com/svpino/status/1595160215854292993
- An epoch is one complete pass through the dataset. The number of epochs you select will largely depend on how much time you have available and how long it takes in practice to fit your model. If you select a number that is too small, you can always train for more epochs later. The fastai tutorial also explains the batch size and what to consider when choosing it. It is basically the number of rows we feed into the neural network at once.
- Gradient accumulation is a technique that lets you train with a larger effective batch size than your machine could normally fit into memory: the gradients of several small batches are accumulated before a single weight update.
- How do you pick an optimal batch size, and what is its influence on the learning rate? Rule of thumb: pick the largest one that fits in memory. It's usually faster! As a heuristic, the learning rate follows the batch size: if you divide the batch size by 2, you also divide the learning rate by 2.
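
The last two points can be sketched in plain Python. This is a toy illustration, not fastai code: the one-weight model, the helper names (`grad_mse`, `accumulated_grad`, `scale_lr`) and the data are all made up for the example. It assumes equally sized micro-batches and a loss that is a mean over the batch, in which case averaging the micro-batch gradients gives the same gradient as one large batch.

```python
# Toy model: y_hat = w * x, loss = mean squared error over a batch.
# All names and numbers here are hypothetical, chosen for illustration.

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equally sized micro-batches."""
    micro_batches = [data[i:i + micro_batch_size]
                     for i in range(0, len(data), micro_batch_size)]
    total = 0.0
    for mb in micro_batches:
        # Dividing by the number of micro-batches keeps the result a mean.
        total += grad_mse(w, mb) / len(micro_batches)
    return total

def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling heuristic: learning rate proportional to batch size."""
    return base_lr * new_batch_size / base_batch_size

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = grad_mse(w, data)                                # one batch of 4 rows
accum = accumulated_grad(w, data, micro_batch_size=2)   # two batches of 2 rows
halved_lr = scale_lr(base_lr=0.1, base_batch_size=64, new_batch_size=32)

print(full, accum, halved_lr)
```

In a real training loop the same idea means calling the optimizer step only after every N small batches, with each batch's loss divided by N. The learning rate scaling is a rule of thumb, so it is worth checking with a learning rate finder rather than trusting it blindly.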