Kaggle & Kaggle Tricks
Golden Rules: Evaluation
Golden Rules: EDA
<aside>
💡 Every programming tip I have after over a decade of writing software:
Coding is the last step of the process.
</aside>
- The first priority in any data science project is agreeing on success criteria. There are a million ways to define value in the business world.
- Work on your coding skills. You're always going to have to change your pipelines, and spaghetti code makes this take 100x longer.
- If you work for a medium-sized or larger company, buy someone from every department lunch and ask them about their day-to-day problems. It's easy to get disconnected from reality in a DS role.
- Data collection and curation are the most underrated skills out there. Very rarely are you just handed a fully formed dataset.
- 5% of your job is actually training the models, and the other 95% is working with data and managing stakeholders. Plan accordingly.
- Successful academics think innovative methods are cool (nothing wrong with that). Successful applied data scientists think the simplest effective solution is cool.
Keep it simple: no code first, improve later
The best code is the code you don’t have to write.
The best tool is the one you already have.
The best solution is the simplest one.
Always make it work first. Make it better later.
Model evaluation is the best time-saver