Probabilistic estimators naturally require more computations than a simple train-test split, but they offer more confidence that you are correctly estimating the right measure: the general performance of your model. It is based on the theorem called the Law of Large Numbers.
k-fold cross-validation stratified sampling (Stratified k-Fold)
But how do you pick the best one to estimate your model’s generalization error?
