Bias-Variance Tradeoff


Note

$$MSE = Bias^2 + Variance$$

(image: bias-variance tradeoff illustration; source: https://www.codingninjas.com/codestudio/library/bias-variance-tradeoff)

Bias (Underfitting)

$$Bias[Y_{pred}|Y_{true}] = E[Y_{pred}] - Y_{true}$$

The **bias** error is an error from incorrect assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and the target output (**underfitting**): the predictions are systematically off the mark.
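A minimal numerical sketch of this definition (the toy sine data and the constant-prediction model below are illustrative assumptions, not from the source): repeatedly re-train a deliberately underfitting model on fresh noisy samples and compare the average prediction at one test point with the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(x)                   # true relationship (toy assumption)
x_test, y_true = 1.5, float(np.sin(1.5))  # fixed test point

preds = []
for _ in range(2000):
    # fresh noisy training set on every round
    x = rng.uniform(0.0, 3.0, 30)
    y = f(x) + rng.normal(0.0, 0.3, 30)
    # deliberately underfit: ignore x and always predict the training mean
    preds.append(y.mean())

preds = np.array(preds)
bias = preds.mean() - y_true              # Bias[Y_pred | Y_true] = E[Y_pred] - Y_true
print(f"estimated bias: {bias:.3f}")
```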

Variance (Overfitting)

$$ Var[Y_{pred}|Y_{true}] = E[(Y_{pred} - Y_{true})^2] - E[Y_{pred} - Y_{true}]^2$$

The **variance** error comes from sensitivity to small fluctuations in the training data. High variance can result from the algorithm modeling the noise in the training data rather than the underlying signal (**overfitting**).
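A companion sketch (same illustrative toy setup as above): an overly flexible model re-trained on fresh noisy samples spreads its predictions widely, and that spread is the variance term. Since $Y_{true}$ is a constant, the formula above reduces to the ordinary variance of $Y_{pred}$.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(x)
x_test, y_true = 1.5, float(np.sin(1.5))

preds = []
for _ in range(2000):
    x = rng.uniform(0.0, 3.0, 30)
    y = f(x) + rng.normal(0.0, 0.3, 30)
    # deliberately overfit: a high-degree polynomial chases the noise
    coefs = np.polyfit(x, y, deg=9)
    preds.append(np.polyval(coefs, x_test))

preds = np.array(preds)
err = preds - y_true
variance = (err ** 2).mean() - err.mean() ** 2   # E[(Y_pred - Y_true)^2] - E[Y_pred - Y_true]^2
print(f"estimated variance: {variance:.3f}")
print(np.isclose(variance, preds.var()))         # same value, because Y_true is constant
```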

Avoid Overfitting
  1. early stopping
  2. training with more (clean) data
  3. data augmentation: sometimes adding a little noise can stabilize the model
  4. feature selection
  5. regularization (see the sketch after this list)
  6. ensemble methods
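As a concrete illustration of regularization (item 5), here is a hedged scikit-learn sketch on a made-up noisy sine dataset; the polynomial degree, `alpha`, and the data are assumptions chosen only to show the mechanics of an L2 penalty, not a recipe from the source.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, size=(40, 1))
y = np.sin(X).ravel() + rng.normal(0.0, 0.3, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for name, reg in [("unregularized", LinearRegression()),
                  ("ridge, alpha=1.0", Ridge(alpha=1.0))]:
    # degree-12 polynomial features invite overfitting on so little data;
    # the L2 penalty shrinks the coefficients and damps the variance
    model = make_pipeline(PolynomialFeatures(degree=12, include_bias=False),
                          StandardScaler(), reg).fit(X_tr, y_tr)
    print(f"{name:16s} train MSE={mean_squared_error(y_tr, model.predict(X_tr)):.3f}  "
          f"test MSE={mean_squared_error(y_te, model.predict(X_te)):.3f}")
```

The quantity to watch is the gap between train and test MSE; early stopping, feature selection, or ensembling could be dropped into the same comparison.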

Recommended further reading: https://arxiv.org/pdf/1812.11118.pdf (Belkin et al., "Reconciling modern machine learning practice and the bias-variance trade-off").

MSE

$$MSE = E[(Y_{pred} - Y_{true})^2]$$


Adding and subtracting $E[Y_{pred} - Y_{true}]^2$:

$$MSE = \big(E[(Y_{pred} - Y_{true})^2] - E[Y_{pred} - Y_{true}]^2\big) + E[Y_{pred} - Y_{true}]^2$$

$$MSE = Var[Y_{pred}|Y_{true}] + E[Y_{pred} - Y_{true}]^2$$

Since $Y_{true}$ is constant, $E[Y_{pred} - Y_{true}] = E[Y_{pred}] - Y_{true}$, so

$$MSE = Var[Y_{pred}|Y_{true}] + (E[Y_{pred}] - Y_{true})^2$$

$$MSE = Var[Y_{pred}|Y_{true}] + Bias[Y_{pred}|Y_{true}]^2$$
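A quick numerical check of the decomposition (same illustrative toy setup as the earlier sketches): estimate all three quantities from many re-trained models and confirm that $MSE = Bias^2 + Var$ holds up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(x)
x_test, y_true = 1.5, float(np.sin(1.5))

preds = []
for _ in range(5000):
    x = rng.uniform(0.0, 3.0, 30)
    y = f(x) + rng.normal(0.0, 0.3, 30)
    coefs = np.polyfit(x, y, deg=3)          # some fixed model class
    preds.append(np.polyval(coefs, x_test))

preds = np.array(preds)
mse = ((preds - y_true) ** 2).mean()         # E[(Y_pred - Y_true)^2]
bias = preds.mean() - y_true                 # E[Y_pred] - Y_true
var = preds.var()                            # Var[Y_pred | Y_true]
print(f"MSE          = {mse:.5f}")
print(f"bias^2 + var = {bias**2 + var:.5f}") # identical up to rounding
```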


References

  1. https://medium.com/towards-data-science/why-is-mse-bias%C2%B2-variance-dbdeda6f0e70
  2. https://www.ibm.com/cloud/learn/overfitting#toc-overfittin-7bQGsQfX