Dimensionality Reduction

<aside> 💡 Overloading a model with features hurts performance: irrelevant or redundant features add noise and make overfitting more likely.

</aside>

The most basic and fastest way to find the important features is to use the feature importances of a Random Forest, as explained in the fastai tutorial and in my notes. Since RF is a good model to start with anyway, it makes sense to reuse it for this. Assuming the RF model makes reasonably good predictions (→ it understands the data well), Jeremy Howard would use it as the best technique for selecting features, as explained here.
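A minimal sketch of that idea with scikit-learn, not the fastai code itself. The synthetic dataset, the feature names `f0` to `f9`, and the "above mean importance" cut-off are all assumptions made for illustration; on real data you would pick the threshold by checking that validation accuracy does not drop after removing features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 10 features, only the first 3 carry signal
# because shuffle=False keeps the informative columns at the front.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, stratify=y, random_state=42
)

# Fit the forest and check that it predicts reasonably well;
# importances from a poor model are not worth trusting.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
valid_accuracy = rf.score(X_valid, y_valid)

# Impurity-based importances, one value per feature, summing to 1.
importances = rf.feature_importances_
ranking = sorted(
    zip(feature_names, importances), key=lambda pair: pair[1], reverse=True
)

# Hypothetical cut-off: keep features with above-average importance.
threshold = importances.mean()
selected = [name for name, imp in ranking if imp > threshold]

print(f"validation accuracy: {valid_accuracy:.3f}")
for name, imp in ranking:
    print(f"{name}: {imp:.3f}")
print("selected:", selected)
```

After selecting, retrain on only the `selected` columns and compare the validation score with the full model. If it stays about the same, the dropped features were not contributing.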

The open-source library Feature-engine offers multiple techniques for selecting features.

Ways to do Feature Selection

Code Examples

Using feature importance to evaluate your work

Selecting Important Features in Random Forests

Identifying Important Features in Random Forests