Data Preprocessing & Feature Engineering

Common preprocessing steps


EDA ideas from this article:

<aside> 💡 You won't always check every one of these points, but cover as many as you need to understand your data before moving on to the next steps of the data science process.

</aside>

Distribution of Data:

Imbalanced Data

Transform skewed variables

Assessing the distribution of data (e.g., normal, skewed) using histograms, box plots, and summary statistics helps understand the central tendency and variability.
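
As a rough illustration of the checks listed above, the sketch below computes summary statistics and skewness for a numeric column, applies a log transform to a right-skewed variable, and reports class proportions to spot imbalance. It uses only the Python standard library; the helper names (`summarize`, `class_balance`) and the `incomes` and `labels` data are made up for this example, and `log1p` is just one common choice of transform, not the only one.

```python
import math
import statistics
from collections import Counter


def summarize(values):
    """Return summary statistics and population skewness for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # Skewness as the mean of cubed z-scores; positive means a long right tail.
    skew = sum(((v - mean) / std) ** 3 for v in values) / n if std > 0 else 0.0
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"mean": mean, "median": median, "std": std,
            "q1": q1, "q3": q3, "skew": skew}


def class_balance(labels):
    """Return the share of each class label, to reveal imbalanced targets."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical data: a right-skewed numeric feature and an imbalanced target.
incomes = [20, 22, 25, 27, 30, 32, 35, 40, 120, 400]
labels = ["no"] * 9 + ["yes"]

raw = summarize(incomes)
# log1p compresses large values, which usually reduces right skew.
logged = summarize([math.log1p(v) for v in incomes])
balance = class_balance(labels)

print("raw skew:", round(raw["skew"], 2))
print("log skew:", round(logged["skew"], 2))
print("class shares:", balance)
```

A mean well above the median, together with positive skewness, points to a long right tail. Comparing the skewness before and after the transform shows whether the transform actually helped for that variable.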

Missing Values: