Bias–variance tradeoff - Wikipedia
Suppose that we have a training set consisting of a set of points $x_1, \dots, x_n$ and real values $y_i$ associated with each point $x_i$. We want to find a function $\hat{f}(x)$ that approximates the true function $f(x)$ as well as possible, by means of some learning algorithm. We make "as well as possible" precise by measuring the mean squared error between $y$ and $\hat{f}(x)$: we want $(y - \hat{f}(x))^2$ to be minimal, both for $x_1, \dots, x_n$ and for points outside of our sample. Of course, we cannot hope to do so perfectly, since the $y_i$ contain noise $\varepsilon$; this means we must be prepared to accept an irreducible error in any function we come up with. Finding an $\hat{f}$ that generalizes to points outside of the training set can be done with any of the countless algorithms used for supervised learning.
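The setup above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the "true" function `true_f` is a sine curve, the noise is additive Gaussian with standard deviation `noise_std`, and a closed-form least-squares line stands in for "some learning algorithm". The helper names (`make_data`, `fit_line`, `mse`) are hypothetical.

```python
import math
import random


def true_f(x):
    # Hypothetical "true" function f(x), chosen only for this illustration.
    return math.sin(2 * math.pi * x)


def make_data(n, noise_std, rng):
    # Assumption: x is uniform on [0, 1] and y = f(x) + Gaussian noise.
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    # Ordinary least squares for f_hat(x) = a + b * x, in closed form.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mse(f_hat, xs, ys):
    # Mean of (y - f_hat(x))^2 over the given points.
    return sum((y - f_hat(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
noise_std = 0.3
x_train, y_train = make_data(30, noise_std, rng)
x_test, y_test = make_data(2000, noise_std, rng)  # points outside the sample

f_hat = fit_line(x_train, y_train)
train_mse = mse(f_hat, x_train, y_train)
test_mse = mse(f_hat, x_test, y_test)

# Even the true function cannot predict the noisy y exactly: its error on
# fresh data is roughly noise_std ** 2, the irreducible part.
noise_floor = mse(true_f, x_test, y_test)

print("train MSE:", train_mse)
print("test MSE:", test_mse)
print("MSE of the true f on test data:", noise_floor)
```

In this sketch the error of `true_f` itself on held-out data stays near `noise_std ** 2`, which is the irreducible error the paragraph refers to; the fitted line cannot do better than that on new points, whatever its training error.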
Nov-23-2019, 22:48:23 GMT