Bias-Variance Tradeoff Calculator
Decompose the expected Mean Squared Error (MSE) of a model into its three fundamental components: Bias², Variance, and Irreducible Error (σ²). Enter either (A) direct component values, or (B) a set of model predictions vs. the true value.
Mode A — Direct Component Input
Mode B — Predictions vs. True Value
Enter multiple model predictions (comma-separated) and the true target value. Bias = mean(predictions) − true; Variance = population variance of the predictions (divide by M, matching the estimators under Formula).
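A minimal sketch of the Mode B computation using only the Python standard library; the prediction values and the true target below are hypothetical placeholders:

```python
from statistics import fmean, pvariance

# Hypothetical Mode B input: predictions from M models trained on
# different training sets, plus the known true target value.
predictions = [2.9, 3.4, 3.1, 2.7, 3.3]
true_value = 3.0

bias = fmean(predictions) - true_value      # Bias = mean(predictions) − true
variance = pvariance(predictions)           # population variance (divide by M)
mse = fmean((p - true_value) ** 2 for p in predictions)

# With a single fixed true value (no noise draw), the identity
# mse == bias**2 + variance holds exactly.
```

`pvariance` is the population variance (divide by M); swap in `statistics.variance` if you want the n−1 sample estimator discussed under Assumptions.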
Formula
The Bias-Variance Decomposition of the expected Mean Squared Error at a point x:
E[(y − f̂(x))²] = Bias²[f̂(x)] + Var[f̂(x)] + σ²
where:
Bias[f̂(x)] = E[f̂(x)] − f(x) (systematic error)
Var[f̂(x)] = E[(f̂(x) − E[f̂(x)])²] (sensitivity to training set)
σ² = Var[ε] (irreducible noise in y = f(x) + ε)
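The decomposition above can be checked numerically with a Monte Carlo sketch. The learner below is a stand-in, not a real training procedure: each "trained model" predicts f(x) plus a fixed offset b (producing bias) and a random draw across training sets (producing variance), while each test label y carries fresh noise ε with variance σ². All numbers are illustrative:

```python
import random
from statistics import fmean, pvariance

random.seed(42)
M = 200_000        # number of simulated training sets

f_x = 2.0          # true f(x) at the query point
b = 0.3            # systematic offset of the learner  -> Bias
v = 0.5            # variance of predictions across training sets -> Var
sigma = 0.4        # std of irreducible noise epsilon, so sigma**2 = 0.16

# One prediction per hypothetical training set, and one fresh test label each.
preds = [f_x + b + random.gauss(0.0, v ** 0.5) for _ in range(M)]
ys = [f_x + random.gauss(0.0, sigma) for _ in range(M)]

mse = fmean((y - p) ** 2 for y, p in zip(ys, preds))
bias2 = (fmean(preds) - f_x) ** 2
var = pvariance(preds)

# Up to Monte Carlo error: mse ≈ bias2 + var + sigma**2
```

The three right-hand terms recover the expected MSE to within sampling error, illustrating that the decomposition is exact in expectation.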
Mode B estimators (given M predictions {f̂₁, …, f̂_M}):
Ê[f̂] = (1/M) Σ f̂ᵢ
Bias = Ê[f̂] − y
Var = (1/M) Σ (f̂ᵢ − Ê[f̂])² (population variance)
Assumptions & References
- The decomposition assumes a squared-error loss function; it does not directly apply to classification or other loss functions without modification.
- The true data-generating process is y = f(x) + ε where ε is zero-mean noise with variance σ².
- Bias and variance are properties of the learning algorithm averaged over all possible training sets of a fixed size, not of a single trained model.
- Mode B uses the population variance (divide by M) to stay consistent with the theoretical expectation operator; use the sample variance (divide by M − 1) if you want an unbiased estimate from a small, finite set of trained models.
- Irreducible noise σ² is a property of the data and cannot be reduced by any model.
- References:
- Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1–58.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning (2nd ed.), §2.9. Springer.
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning, §3.2. Springer.