Hyperparameter Optimization

Scikit-learn for Hyperparameter Tuning

Overfitting

Random Search

Splitting strategies

<aside> 💡

The right parameters can only be found through experiments. How to conduct and evaluate these experiments is described in the XGBoost book, pages 92-96. But keep in mind: tune the architecture first, not the parameters! If you want to train multiple architectures, check this.

</aside>

Grid Search is the most basic method for hyperparameter optimization. Once the machine learning engineer figures out which values of hyperparameters they want to assess, the grid search will compute all possible combinations of those values.

Back to the example of the XGBClassifier: if we want to evaluate the values [10, 100, 500, 1000] for n_estimators and [None, 1, 2, 3] for max_depth, a Grid Search will create all possible combinations of these values (4x4=16 combinations), resulting in:

[10, None], [10, 1], [10, 2], [10, 3]

[100, None], [100, 1], [100, 2], [100, 3]

[500, None], [500, 1], [500, 2], [500, 3]

[1000, None], [1000, 1], [1000, 2], [1000, 3]

The search would then train one machine learning model for each of these 16 combinations, determine the performance of each model, and select the combination of hyperparameter values that returns the best value for the performance metric.
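The procedure described above can be sketched in plain Python. This is a minimal illustration, not how any library implements it: `expand_grid`, `grid_search`, and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for "train a model and measure its validation performance".

```python
from itertools import product

# The grid from the example above.
param_grid = {
    "n_estimators": [10, 100, 500, 1000],
    "max_depth": [None, 1, 2, 3],
}


def expand_grid(grid):
    """Return every combination of the listed values as a dict."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def toy_score(params):
    """Hypothetical stand-in for training and validating a model.

    A real search would fit a model with `params` and return a
    validation metric; this toy function simply peaks at
    n_estimators=100 and max_depth=2.
    """
    depth = params["max_depth"] if params["max_depth"] is not None else 10
    return -abs(params["n_estimators"] - 100) / 1000 - abs(depth - 2) / 10


def grid_search(grid, score_fn):
    """Score every combination and return the best one."""
    candidates = expand_grid(grid)
    scored = [(score_fn(params), params) for params in candidates]
    best_score, best_params = max(scored, key=lambda pair: pair[0])
    return best_params, best_score, candidates


best_params, best_score, candidates = grid_search(param_grid, toy_score)
print(len(candidates))  # number of combinations tried
print(best_params)      # combination with the highest score
```

Note that the number of combinations is the product of the list lengths, so every additional hyperparameter multiplies the number of models that have to be trained.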

<aside> 💡 How to use GridSearchCV inside a Pipeline is shown here.

</aside>
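As a rough sketch of GridSearchCV inside a Pipeline, assuming scikit-learn is installed: the synthetic dataset, the step names `scale` and `clf`, and the reduced grid are illustrative choices, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGBClassifier so the example needs no extra dependency.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset (assumption: stands in for real training data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Preprocessing and model in one estimator, so that scaling is re-fit
# on the training part of every cross-validation split.
pipeline = Pipeline(
    steps=[
        ("scale", StandardScaler()),
        ("clf", GradientBoostingClassifier(random_state=0)),
    ]
)

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
# The grid is kept small here so the sketch runs quickly.
param_grid = {
    "clf__n_estimators": [10, 50],
    "clf__max_depth": [1, 2, 3],
}

search = GridSearchCV(pipeline, param_grid=param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)  # best combination found
print(search.best_score_)   # its mean cross-validated accuracy
```

Because the whole pipeline is passed to the search, the preprocessing never sees the validation fold during fitting, which is one way to reduce the risk of overly optimistic scores.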

GridSearchCV applications

GridSearchCV on calmcode

GridSearch article

Calmcode: