Kaggle & Kaggle Tricks

Golden Rules of ML

Golden Rules: EDA

Useful baselines

Create a Baseline Model

A baseline is a reference point for measuring improvement, not just a number to report.

The best modelers know to use realistic baselines, such as human performance and intelligent heuristics.

The simplest model is the best place to start, so always begin there and see how much you can improve on it.
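As a rough illustration of the two kinds of baseline mentioned above, the sketch below scores a majority-class predictor and a hand-written heuristic on a tiny made-up spam dataset. The data, the "contains the word free" rule, and all function names are assumptions invented for this example, not part of any particular competition or library.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_class_predictor(train_labels):
    """Build a predictor that always outputs the most frequent training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda rows: [most_common_label for _ in rows]


def heuristic_predictor(rows):
    """Hypothetical domain rule: call a message spam (1) if it mentions 'free'."""
    return [1 if "free" in text.lower() else 0 for text in rows]


# Toy data, invented for illustration only (1 = spam, 0 = not spam).
train_labels = [0, 0, 0, 1, 0, 1]
test_texts = [
    "Free tickets inside",
    "Meeting at noon",
    "Lunch tomorrow?",
    "Claim your FREE prize",
    "Project update",
    "Feel free to call me",
]
test_labels = [1, 0, 0, 1, 0, 0]

predict_majority = majority_class_predictor(train_labels)
majority_acc = accuracy(test_labels, predict_majority(test_texts))
heuristic_acc = accuracy(test_labels, heuristic_predictor(test_texts))

print(f"majority-class baseline accuracy: {majority_acc:.3f}")
print(f"keyword heuristic accuracy:       {heuristic_acc:.3f}")
```

Any model you train afterwards should be judged by how far it moves past these two numbers, not by its raw score alone.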

Starting strong

Think of the potential solutions to an ML problem as a nearly-infinite tree.

The most successful modelers can prune all but 0.1% of that tree before they even start.

If you know something won't work, why waste your valuable time?

Rapid iteration

If we could know what techniques would work before attacking a problem, then we wouldn't need data scientists.

After pruning the bad ideas, the best modelers efficiently explore the remaining possibilities to land on the best solution.

Start small, work fast.
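One common way to start small is to run early experiments on a fixed random subsample of the data and scale up only once an idea looks promising. The sketch below is a minimal version of that, assuming the dataset fits in a Python list; the `subsample` helper, the 1% fraction, and the stand-in dataset are illustrative choices, not a prescribed recipe.

```python
import random


def subsample(rows, fraction, seed=0):
    """Return a reproducible random subset of rows for quick experiments."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so every run sees the same subset
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


# Stand-in for a real dataset: 100,000 row ids.
full_dataset = list(range(100_000))

# Iterate on 1% of the data first; rerun on everything once an idea survives.
dev_sample = subsample(full_dataset, fraction=0.01)
print(len(dev_sample))
```

Keeping the seed fixed means that differences between runs come from the change you made rather than from a different draw of rows.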