XGBoost and imbalanced datasets
- Dummy Model for imbalanced data
- Imbalanced data can also be interpreted as outliers. This video shows how to quickly check whether imbalanced data should be handled with outlier methods or with ordinary classification methods.
- The best dummy model when dealing with imbalanced data. Check this out!
- Which metric to use when dealing with imbalanced data → tweet
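To make the dummy-model and metric notes above concrete, here is a minimal sketch in plain Python. It assumes the simplest possible baseline, one that always predicts the majority class; the linked material may recommend a different dummy model. The toy labels and the helper names (`majority_class_baseline`, `accuracy`, `recall`) are hypothetical and only serve the illustration.

```python
from collections import Counter


def majority_class_baseline(y_train):
    """Return a predictor that always outputs the most frequent training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    return lambda X: [majority_label for _ in X]


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not predictions_for_positives:
        return 0.0
    hits = sum(p == positive for p in predictions_for_positives)
    return hits / len(predictions_for_positives)


# Hypothetical toy labels: 95 negatives, 5 positives.
y = [0] * 95 + [1] * 5
X = list(range(len(y)))  # placeholder features; the baseline ignores them

predict = majority_class_baseline(y)
y_pred = predict(X)

baseline_accuracy = accuracy(y, y_pred)
baseline_recall = recall(y, y_pred)
print(f"accuracy={baseline_accuracy:.2f}, recall={baseline_recall:.2f}")
```

On this toy data the baseline reaches high accuracy while finding none of the positives, which is why accuracy alone says little on imbalanced data and why any real model should be compared against such a baseline.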
Application
Handling Imbalanced Classes
Handling Imbalanced Data with Logistic Regression
Libraries
- The HuggingFace book recommends the following library for imbalanced data:
imbalanced-learn
General Strategies
What can we do when working with imbalanced datasets?
👉 undersample the majority class(es)
👉 oversample the minority class(es)
👉 create synthetic examples
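The three strategies above can be sketched in plain Python as follows. This is an illustrative sketch only, not the imbalanced-learn API; the function names and the toy data are hypothetical. The synthetic variant interpolates between two random rows of the same class, which is loosely inspired by SMOTE (real SMOTE interpolates toward nearest neighbours).

```python
import random
from collections import Counter


def split_by_class(X, y):
    """Group feature rows by their label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(X, y, rng):
    """Shrink every class to the size of the smallest class."""
    groups = split_by_class(X, y)
    n_min = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        for row in rng.sample(rows, n_min):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def random_oversample(X, y, rng):
    """Grow every class to the size of the largest class by duplicating rows."""
    groups = split_by_class(X, y)
    n_max = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(n_max - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def synthetic_oversample(X, y, rng):
    """Grow every class to the size of the largest class with interpolated rows."""
    groups = split_by_class(X, y)
    n_max = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        new_rows = []
        for _ in range(n_max - len(rows)):
            a, b = rng.choice(rows), rng.choice(rows)
            t = rng.random()  # position on the segment between a and b
            new_rows.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
        for row in rows + new_rows:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 8 majority rows (label 0), 2 minority rows (label 1).
X = [[float(i), float(i) * 2.0] for i in range(8)] + [[10.0, 1.0], [12.0, 3.0]]
y = [0] * 8 + [1] * 2
rng = random.Random(0)  # fixed seed so the sketch is reproducible

X_under, y_under = random_undersample(X, y, rng)
X_over, y_over = random_oversample(X, y, rng)
X_syn, y_syn = synthetic_oversample(X, y, rng)

print(Counter(y_under), Counter(y_over), Counter(y_syn))
```

If resampling is used at all, it is usually applied to the training split only, so that validation and test data keep the original class distribution.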
Advice from Santiago (tweet):
<aside>
💡 Avoid any method that artificially changes the distribution of your data. Don't "fix" your imbalanced data.
</aside>
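One way to leave the data untouched is to weight the classes in the loss instead of resampling. The tweet does not say which alternative it has in mind, so treat the following as an assumption. For XGBoost, the `scale_pos_weight` parameter serves this purpose in binary classification, and a commonly suggested starting value is the ratio of negative to positive examples. The sketch below only computes that ratio and builds a parameter dictionary; the helper name and the toy labels are hypothetical, and XGBoost itself is not imported.

```python
def negative_to_positive_ratio(y, positive=1):
    """Return (number of negatives) / (number of positives) for binary labels."""
    n_pos = sum(1 for label in y if label == positive)
    n_neg = len(y) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples; the ratio is undefined")
    return n_neg / n_pos


# Hypothetical training labels: 95 negatives, 5 positives.
y_train = [0] * 95 + [1] * 5

# Parameter dictionary in the shape XGBoost accepts; values are illustrative.
params = {
    "objective": "binary:logistic",
    "scale_pos_weight": negative_to_positive_ratio(y_train),
}
print(params)
```

The resulting dictionary could be passed to an XGBoost model, with the ratio treated as a starting point to tune rather than a fixed rule.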