Samory Kpotufe
PAC-Bayes Tree: weighted subtrees with guarantees
Tin D. Nguyen, Samory Kpotufe
We present a weighted-majority classification approach over subtrees of a fixed tree, which provably achieves excess risk of the same order as the best tree pruning. Furthermore, the computational efficiency of pruning is maintained at both training and testing time, despite having to aggregate over an exponential number of subtrees. We believe this is the first subtree aggregation approach with such guarantees. The guarantees are obtained via a simple combination of insights from PAC-Bayes theory, which we believe should be of independent interest, as it generically implies consistency of weighted-voting classifiers with respect to the Bayes classifier; in contrast, usual PAC-Bayes approaches only establish consistency of Gibbs classifiers.
On the Value of Target Data in Transfer Learning
Steve Hanneke, Samory Kpotufe
We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data. This is motivated by practical questions around minimizing sampling costs, where target data is usually harder or costlier to acquire than source data but can yield better accuracy. To this end, we establish the first minimax rates in terms of both source and target sample sizes, and show that performance limits are captured by new notions of discrepancy between source and target, which we refer to as transfer exponents. Interestingly, we find that attaining minimax performance is akin to ignoring one of the source or target samples, provided distributional parameters were known a priori. Moreover, we show that practical decisions - w.r.t.