Probabilistic estimators naturally require more computation than a simple train-test split, but they give you more confidence that you are estimating the right quantity: the general performance of your model. That confidence rests on the Law of Large Numbers: the average of many evaluations tends toward the true expected performance as the number of evaluations grows.
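The effect of averaging can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the "true" accuracy, the test-set size, and the number of repeats are hypothetical values chosen for illustration, and each evaluation is treated as independent, which real cross-validation folds are not (they share training data).

```python
import random
import statistics


def simulated_accuracy(rng, true_accuracy, n_test):
    """Accuracy measured on n_test examples, each correct with probability true_accuracy."""
    hits = sum(rng.random() < true_accuracy for _ in range(n_test))
    return hits / n_test


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_ACCURACY = 0.8      # hypothetical "real" performance of the model
N_TEST = 50              # hypothetical size of each evaluation set
N_REPEATS = 10           # evaluations averaged into one estimate
N_TRIALS = 2000          # how many times each estimator is simulated

# Estimator A: a single evaluation (analogous to one train-test split).
single = [simulated_accuracy(rng, TRUE_ACCURACY, N_TEST) for _ in range(N_TRIALS)]

# Estimator B: the mean of N_REPEATS evaluations.
averaged = [
    statistics.mean(
        [simulated_accuracy(rng, TRUE_ACCURACY, N_TEST) for _ in range(N_REPEATS)]
    )
    for _ in range(N_TRIALS)
]

single_spread = statistics.stdev(single)
averaged_spread = statistics.stdev(averaged)
print(f"spread of single estimates:   {single_spread:.4f}")
print(f"spread of averaged estimates: {averaged_spread:.4f}")
```

Under these assumptions the averaged estimator scatters much less around the true value than the single evaluation does, which is the extra confidence that the additional computation buys.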

Validation Strategies

k-fold cross-validation

LOO (Leave-One-Out): the special case of k-fold cross-validation where k = n (n being the number of examples)
stratified sampling (Stratified k-Fold)
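
The three strategies above can be sketched as index splitters in plain Python. This is a minimal illustration rather than a reference implementation: the function names k_fold_indices and stratified_k_fold_indices are hypothetical, the folds are contiguous with no shuffling, and the stratified version simply deals each class's examples round-robin across folds.

```python
from collections import defaultdict


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n examples."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    # The first n % k folds get one extra example, so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def stratified_k_fold_indices(labels, k):
    """Yield (train, test) index lists whose folds roughly keep the class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin so every fold gets a share of each class.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Plain k-fold: 10 examples in 3 folds.
three_fold = list(k_fold_indices(10, 3))

# Leave-One-Out is k-fold with k = n: every test fold holds exactly one example.
loo_splits = list(k_fold_indices(5, 5))

# Stratified k-fold on an imbalanced toy label set (8 of class 0, 4 of class 1).
toy_labels = [0] * 8 + [1] * 4
stratified = list(stratified_k_fold_indices(toy_labels, 4))
```

Each example lands in exactly one test fold, so every data point is used for validation once and for training k - 1 times.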

Cross-validation comes in many forms:

k-fold cross-validation
stratified sampling (Stratified k-Fold)
Leave-One-Out
group-based CV
time-based CV
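
Group-based and time-based CV differ from plain k-fold in what they forbid: the first keeps every group entirely on one side of the split, the second never trains on data that comes after the test period. A rough sketch in plain Python follows. The function names, the round-robin assignment of groups to folds, and the expanding-window scheme are illustrative assumptions, not the only way to build such splits.

```python
def group_k_fold_indices(groups, k):
    """Yield (train, test) index lists in which no group appears on both sides."""
    unique_groups = sorted(set(groups))
    if not 2 <= k <= len(unique_groups):
        raise ValueError("k must be between 2 and the number of distinct groups")
    for i in range(k):
        # Every k-th group (starting at i) goes to the test side of fold i.
        test_groups = set(unique_groups[i::k])
        test = [idx for idx, g in enumerate(groups) if g in test_groups]
        train = [idx for idx, g in enumerate(groups) if g not in test_groups]
        yield train, test


def expanding_window_indices(n, n_splits):
    """Yield (train, test) index lists where training data always precedes test data.

    Assumes the n examples are already sorted by time.
    """
    test_size = n // (n_splits + 1)
    if test_size < 1:
        raise ValueError("not enough examples for the requested number of splits")
    for i in range(1, n_splits + 1):
        train_end = n - (n_splits - i + 1) * test_size
        yield list(range(train_end)), list(range(train_end, train_end + test_size))


# Hypothetical example: seven rows belonging to four groups (e.g. four patients).
toy_groups = ["a", "a", "b", "b", "c", "c", "d"]
group_splits = list(group_k_fold_indices(toy_groups, 2))

# Hypothetical example: ten time-ordered rows, three successive test windows.
time_splits = list(expanding_window_indices(10, 3))
```

In the time-based splits the training window grows with each fold while the test window slides forward, so the model is always evaluated on observations later than anything it was trained on.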

But how do you pick the best one to estimate your model’s generalization error?