Main parameters in XGBoost

eta (learning rate)

The learning rate eta scales the contribution of each new tree before it is added to the ensemble (shrinkage). A smaller eta makes each boosting round more conservative, so the model needs more rounds to fit the data but is usually less prone to overfitting. A larger eta fits the training data in fewer rounds but can overshoot and overfit. A common approach is to start with a moderate value such as eta = 0.1 and then lower it while increasing the number of boosting rounds to compensate. Setting eta too small without adding rounds leads to slow convergence and underfitting, while setting it too high can lead to unstable training and overfitting.
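The trade-off between eta and the number of rounds can be illustrated with a toy sketch. This is not XGBoost itself: it assumes a single target value and a "weak learner" that predicts the current residual exactly, so each round closes a fraction eta of the remaining gap. The function name and the tolerance are hypothetical.

```python
def rounds_to_converge(eta, target=10.0, tol=0.01, max_rounds=10_000):
    """Count boosting rounds until the prediction is within `tol` of `target`.

    Toy model: each round adds eta * residual to the running prediction,
    which mimics how shrinkage scales every new tree's contribution.
    """
    pred = 0.0
    for n in range(1, max_rounds + 1):
        residual = target - pred
        pred += eta * residual  # shrunken update
        if abs(target - pred) < tol:
            return n
    return max_rounds


if __name__ == "__main__":
    for eta in (0.3, 0.1, 0.01):
        print(f"eta={eta}: {rounds_to_converge(eta)} rounds")
```

In this toy setting, a smaller eta always needs more rounds to reach the same tolerance, which is why lowering eta is usually paired with a larger number of boosting rounds.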

max_depth

The max_depth parameter controls the maximum depth of each tree in the model. A larger max_depth value allows more complex trees, which can lead to overfitting. A smaller max_depth value produces simpler trees, which can lead to underfitting. It is common to start with a small value, such as max_depth = 3, and increase it until the performance on the validation set stops improving.
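The "increase until validation performance stops improving" procedure can be sketched as a small search loop. This is a minimal sketch, not part of XGBoost: `tune_upward` and `evaluate` are hypothetical names, and `evaluate` is assumed to train a model with the given value and return a validation score where higher is better. The scores in the usage example are made-up placeholders, not measured results.

```python
def tune_upward(evaluate, candidates):
    """Try candidates in order and stop at the first one that fails to improve.

    Returns the best value found and its validation score.
    """
    best_value, best_score = None, float("-inf")
    for value in candidates:
        score = evaluate(value)
        if score <= best_score:
            break  # validation score stopped improving
        best_value, best_score = value, score
    return best_value, best_score


if __name__ == "__main__":
    # Made-up validation scores for illustration only.
    toy_scores = {3: 0.80, 4: 0.84, 5: 0.86, 6: 0.85, 7: 0.83}
    depth, score = tune_upward(toy_scores.get, [3, 4, 5, 6, 7])
    print(f"chosen max_depth={depth} (validation score {score})")
```

The same loop applies to any parameter that is tuned by increasing it step by step, such as lambda or alpha. Stopping at the first non-improving value is a simplification; a noisy validation score may justify trying a few more candidates before stopping.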

subsample

The subsample parameter controls the fraction of training observations (rows) sampled for each tree. A smaller subsample value means each tree sees less of the data, which adds randomness and can help prevent overfitting. If the value is too small, the trees see too little data and the model can underfit. A value of 1 uses all observations for every tree. It is common to set this value between 0.5 and 1. For example, start with subsample = 0.8 and lower it toward 0.5 if the model still overfits.

colsample_bytree

The colsample_bytree parameter controls the fraction of features (columns) sampled for each tree. A smaller colsample_bytree value means each tree is built from fewer features, which makes the trees less correlated with one another and can help prevent overfitting. If the value is too small, informative features are often left out and the model can underfit. A value of 1 makes all features available to every tree. It is common to set this value between 0.5 and 1. For example, start with colsample_bytree = 0.8 and lower it toward 0.5 if the model still overfits.
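The effect of subsample and colsample_bytree can be pictured as drawing a random subset of rows and columns before each tree is built. The sketch below is an illustration only: `sample_for_tree` is a hypothetical helper, and the library's actual sampling procedure and rounding may differ.

```python
import random


def sample_for_tree(n_rows, n_cols, subsample=0.8, colsample_bytree=0.8, rng=None):
    """Pick the row and column indices one tree would be trained on.

    Sampling is without replacement; at least one row and one column are kept.
    """
    rng = rng or random.Random()
    n_row_pick = max(1, round(n_rows * subsample))
    n_col_pick = max(1, round(n_cols * colsample_bytree))
    rows = sorted(rng.sample(range(n_rows), n_row_pick))
    cols = sorted(rng.sample(range(n_cols), n_col_pick))
    return rows, cols


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for tree in range(3):
        rows, cols = sample_for_tree(100, 10, rng=rng)
        print(f"tree {tree}: {len(rows)} rows, columns {cols}")
```

Because each tree gets a different subset, no single tree can memorize the full training set, which is the source of the regularizing effect.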

lambda

The lambda parameter is the L2 regularization term on the leaf weights. Larger values make the model more conservative: they reduce overfitting by adding a penalty on the squared leaf weights to the loss function. It is common to start with a relatively small value, such as lambda = 1, and increase it until the performance on the validation set stops improving.

alpha

The alpha parameter is the L1 regularization term on the leaf weights. Larger values make the model more conservative: they reduce overfitting by adding a penalty on the absolute leaf weights to the loss function, which can shrink some weights to exactly zero. It is common to start with alpha = 0 (no L1 penalty) and increase it until the performance on the validation set stops improving.
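How lambda and alpha shrink a leaf can be sketched from the regularized objective for a single leaf. The sketch assumes a leaf with gradient sum G and hessian sum H, for which the optimal weight is -T(G, alpha) / (H + lambda), where T is the soft-thresholding function. The function and argument names are hypothetical, and the sketch ignores other parameters such as the learning rate.

```python
def soft_threshold(g, alpha):
    """Shrink g toward zero by alpha; values within [-alpha, alpha] become 0."""
    if g > alpha:
        return g - alpha
    if g < -alpha:
        return g + alpha
    return 0.0


def leaf_weight(grad_sum, hess_sum, reg_lambda=1.0, reg_alpha=0.0):
    """Optimal weight of one leaf under L2 (lambda) and L1 (alpha) penalties."""
    return -soft_threshold(grad_sum, reg_alpha) / (hess_sum + reg_lambda)


if __name__ == "__main__":
    G, H = -4.0, 3.0  # illustrative gradient and hessian sums
    print(leaf_weight(G, H, reg_lambda=1.0, reg_alpha=0.0))
    print(leaf_weight(G, H, reg_lambda=5.0, reg_alpha=0.0))  # larger lambda
    print(leaf_weight(G, H, reg_lambda=1.0, reg_alpha=1.0))  # some alpha
    print(leaf_weight(G, H, reg_lambda=1.0, reg_alpha=5.0))  # alpha >= |G|
```

A larger lambda increases the denominator and pulls every leaf weight toward zero smoothly, while alpha subtracts from the gradient sum and sets the weight to exactly zero once alpha is at least as large as |G|.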