Common preprocessing steps

Pandas Pipeline for data preprocessing steps

How to understand your data?

Here are some common preprocessing steps prior to feeding the data to a machine learning model:

Split your dataset into training and test sets before anything else, if possible, so that every later step is fitted on the training data only!
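A minimal sketch of splitting first, assuming scikit-learn and pandas are available. The toy DataFrame and its column names (`age`, `city`, `target`) are made up for illustration; `stratify=y` keeps the class proportions of the target similar in both partitions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: two features and an imbalanced binary target.
df = pd.DataFrame({
    "age": [22, 35, 58, 41, 29, 63, 37, 50, 26, 45],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"],
    "target": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
})

X = df.drop(columns="target")
y = df["target"]

# Split before any imputation, scaling, or encoding is fitted.
# stratify=y asks for similar class proportions in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

print(y_train.value_counts())
print(y_test.value_counts())
```

Everything that learns from the data (imputation values, scaling statistics, category lists) would then be fitted on `X_train` and only applied to `X_test`.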

Replace / remove missing data (see: Imputation - (missing data))
Remove / re-scale outliers (see: Scaling variables, Outlier)
Encode categorical features (see: Categorical features)
Discretization of continuous features
Normalize numeric features
Variable transformation
Stratify partitions based on target variable
Resample / rebalance partitions
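Several of the steps above (imputing missing values, scaling numeric features, encoding categorical features) can be chained so that they are fitted on the training partition only. The following is a sketch under assumptions, not a definitive recipe: it uses scikit-learn's `Pipeline` and `ColumnTransformer`, and the toy data and column names (`income`, `city`) are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical partitions that were split before preprocessing.
train = pd.DataFrame({
    "income": [30.0, 45.0, np.nan, 60.0, 52.0, 38.0],
    "city": ["A", "B", "A", "C", "C", "B"],
})
test = pd.DataFrame({
    "income": [np.nan, 48.0],
    "city": ["B", "D"],  # "D" never appears in the training data
})

# Numeric columns: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode; categories unseen during fit
# are encoded as all zeros instead of raising an error.
categorical = OneHotEncoder(handle_unknown="ignore")

preprocess = ColumnTransformer(
    [
        ("num", numeric, ["income"]),
        ("cat", categorical, ["city"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Fit on the training partition only, then reuse the fitted
# statistics (median, mean, std, category list) on the test partition.
X_train = preprocess.fit_transform(train)
X_test = preprocess.transform(test)

print(X_train.shape, X_test.shape)
```

Because the median, mean, and standard deviation come from `train` alone, no information from `test` leaks into the fitted preprocessing. Discretization, outlier handling, and resampling are not shown here and would be added as further steps.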

More in this tweet:

https://twitter.com/rasbt/status/1592969233713201152