A Gradient-Based Boosting Algorithm for Regression Problems
Neural Information Processing Systems
Adaptive boosting methods are simple, modular algorithms that operate as follows. Let g: X → Y be the function to be learned, where the label set Y is finite, typically binary-valued. The algorithm uses a learning procedure, which has access to n training examples, {(x_1, y_1), ..., (x_n, y_n)}, drawn randomly from X × Y according to distribution D; it outputs a hypothesis f: X → Y, whose error is the expected value of a loss function on f(x), g(x), where x is chosen according to D. Given ε, δ > 0 and access to random examples, a strong learning procedure outputs, with probability 1 − δ, a hypothesis with error at most ε, with running time polynomial in 1/ε, 1/δ and the number of examples. A weak learning procedure satisfies the same conditions, but its error ε need only be better than random guessing.

Schapire (1990) showed that any weak learning procedure, denoted WeakLearn, can be efficiently transformed ("boosted") into a strong learning procedure. The AdaBoost algorithm achieves this by calling WeakLearn multiple times, in a sequence of T stages, each time presenting it with a different distribution over a fixed training set and finally combining all of the hypotheses. The algorithm maintains a weight w_i^t for each training example i at stage t, and a distribution D_t is computed by normalizing these weights.
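The staged procedure described above can be sketched in code. This is a minimal illustration of the binary AdaBoost loop, not the algorithm proposed in this paper; the hypothetical `weak_learn(X, y, D)` callback stands in for WeakLearn and is assumed to return a hypothesis mapping inputs to labels in {−1, +1}, fit under the example distribution D.

```python
import numpy as np

def adaboost(X, y, weak_learn, T):
    """Sketch of the AdaBoost loop for labels y in {-1, +1}.

    `weak_learn(X, y, D)` is an assumed interface: it returns a callable
    hypothesis h with h(X) in {-1, +1}, trained under distribution D.
    """
    n = len(y)
    w = np.ones(n)                    # weights w_i^t, initially uniform
    hypotheses, alphas = [], []
    for t in range(T):
        D = w / w.sum()               # distribution D_t: normalized weights
        h = weak_learn(X, y, D)
        pred = h(X)
        eps = D[pred != y].sum()      # weighted error of h under D_t
        if eps >= 0.5:                # no better than random guessing: stop
            break
        eps = max(eps, 1e-10)         # guard against a perfect weak learner
        alpha = 0.5 * np.log((1 - eps) / eps)
        w = w * np.exp(-alpha * y * pred)  # up-weight misclassified examples
        hypotheses.append(h)
        alphas.append(alpha)

    def final(Xq):                    # combined hypothesis: weighted vote
        return np.sign(sum(a * h(Xq) for a, h in zip(alphas, hypotheses)))
    return final
```

The reweighting step concentrates D_t on the examples the previous hypotheses got wrong, which is what forces each call to WeakLearn to focus on a different part of the training set.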