Tips for Getting Started
When starting a machine learning project, a few practices help you make progress efficiently.
Getting Started on Modeling
- Conduct a literature search to understand existing approaches and possibilities.
- Focus on practical implementations instead of the latest algorithms.
- An open-source implementation can provide a baseline and a quick start.
- A reasonable algorithm with good data often outperforms a cutting-edge algorithm with poor data.
Deployment Constraints
Factor in deployment constraints, such as compute limitations, once you are confident the project will succeed.
- If the project is still at an early stage, it’s acceptable to prioritize establishing a baseline over deployment constraints.
Sanity Checks
- Before running the learning algorithm on all data, perform quick sanity checks.
- Overfit a small training dataset to ensure the algorithm is functioning correctly.
- Test the algorithm with one or a few training examples to catch bugs early.
- In an image task, for example, training on a small subset of images can reveal issues before scaling up.
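The overfitting check above can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny logistic regression, the helper names (train_logistic, accuracy, overfit_check), and the four-example toy dataset are all hypothetical stand-ins for whatever model and data the project actually uses. The idea is that a working training loop should be able to fit a handful of examples perfectly, so a failure here points to a bug rather than a modeling problem.

```python
import math
import random


def train_logistic(X, y, lr=0.5, epochs=2000, seed=0):
    """Fit a tiny logistic regression with batch gradient descent (illustrative only)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [rng.uniform(-0.01, 0.01) for _ in range(n_features)]
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log loss with respect to z
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of examples whose predicted class matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        correct += int((1 if z >= 0 else 0) == yi)
    return correct / len(X)


def overfit_check(X, y, n_small=4, threshold=1.0):
    """Train on only the first n_small examples and report whether they are fit."""
    X_small, y_small = X[:n_small], y[:n_small]
    w, b = train_logistic(X_small, y_small)
    acc = accuracy(w, b, X_small, y_small)
    return acc >= threshold, acc


# Hypothetical toy data: two well-separated classes.
X_toy = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.0]]
y_toy = [0, 0, 1, 1]

passed, train_acc = overfit_check(X_toy, y_toy)
print(f"overfit check passed: {passed} (training accuracy {train_acc:.2f})")
```

If the check fails on data this small, it is usually cheaper to debug the data pipeline or the training loop at this point than after a full training run.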
Error Analysis and Performance Auditing
- After training the initial model, conduct error analysis to identify areas for improvement.
- Analyze errors to understand patterns and make targeted adjustments.
- Carry out performance auditing to assess the model’s overall performance before deployment.
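One way to look for error patterns is to tag each example and compare error rates across tags. The sketch below assumes a hypothetical record format (dicts with "label", "prediction", and "tags" keys) and made-up tag names; it is not a prescribed method, just a minimal way to see which categories account for the most mistakes. The same per-tag breakdown can also serve as a simple audit of performance on slices of the data before deployment.

```python
from collections import Counter


def error_rates_by_tag(examples):
    """Return (tag, errors, total, error_rate) tuples, worst tag first.

    Each example is assumed to be a dict with "label", "prediction",
    and "tags" keys (a hypothetical format chosen for this sketch).
    """
    totals = Counter()
    errors = Counter()
    for ex in examples:
        is_error = ex["label"] != ex["prediction"]
        for tag in ex["tags"]:
            totals[tag] += 1
            if is_error:
                errors[tag] += 1
    summary = [
        (tag, errors[tag], totals[tag], errors[tag] / totals[tag])
        for tag in totals
    ]
    # Sort by error rate so the most problematic tags appear first.
    summary.sort(key=lambda row: row[3], reverse=True)
    return summary


# Hypothetical predictions with hand-assigned tags.
examples = [
    {"label": 1, "prediction": 1, "tags": ["clean"]},
    {"label": 1, "prediction": 0, "tags": ["noisy"]},
    {"label": 0, "prediction": 0, "tags": ["clean"]},
    {"label": 0, "prediction": 1, "tags": ["noisy", "low_light"]},
    {"label": 1, "prediction": 1, "tags": ["low_light"]},
    {"label": 0, "prediction": 0, "tags": ["noisy"]},
]

summary = error_rates_by_tag(examples)
for tag, n_err, n_total, rate in summary:
    print(f"{tag}: {n_err}/{n_total} errors ({rate:.0%})")
```

A tag with a high error rate and many examples is usually a better target for improvement than one that is rare, so it helps to read the counts alongside the rates.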