

Differentially Private Uniformly Most Powerful Tests for Binomial Data

Neural Information Processing Systems

Furthermore, we obtain exact p-values, which are easily computed in terms of the Tulap random variable. We show that our results also apply to distribution-free hypothesis tests for continuous data.





Tight Margin-Based Generalization Bounds for Voting Classifiers over Finite Hypothesis Sets

arXiv.org Artificial Intelligence

Ensemble learning is a powerful machine learning tool: it enables us to transform weak learners, that is, hypothesis classes that are barely better than guessing, into learners with state-of-the-art performance. In essence, ensemble methods take a set of base classifiers, weight those classifiers according to their performance on the training set, and obtain the final prediction by aggregating according to those weights. An important historical example is AdaBoost (Freund and Schapire [1997]), a type of voting classifier that builds the ensemble sequentially: new base classifiers are added to the ensemble to correct the mistakes of the current ensemble. AdaBoost was the first efficient and practical implementation of a boosting algorithm, and hence the relevance of ensemble learners is often attributed to AdaBoost. Much theoretical research has been done to explain the impressive practical performance of AdaBoost and other ensemble methods.
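The weighted aggregation described in this abstract can be illustrated with a minimal sketch. It assumes binary base predictions in {-1, +1} and non-negative weights; the function names `vote` and `margin` are hypothetical and are not taken from the paper, and no boosting step (such as AdaBoost's reweighting) is implemented here.

```python
def vote(weights, predictions):
    """Normalized weighted vote in [-1, 1] for one example.

    weights: non-negative weight per base classifier (assumed, not learned here)
    predictions: each base classifier's output in {-1, +1}
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * p for w, p in zip(weights, predictions)) / total


def margin(weights, predictions, label):
    """Margin of the voting classifier: label times the normalized vote.

    A positive margin means the ensemble classifies the example correctly;
    a larger margin means more of the weight agrees with the true label.
    """
    return label * vote(weights, predictions)


def classify(weights, predictions):
    """Final ensemble prediction: the sign of the weighted vote (ties go to +1)."""
    return 1 if vote(weights, predictions) >= 0 else -1
```

For example, with weights 0.5, 0.3, 0.2 and base predictions +1, -1, +1, the normalized vote is 0.4, so the ensemble predicts +1 and the margin on a positively labeled example is 0.4.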



The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation

arXiv.org Artificial Intelligence

The application of Reinforcement Learning with Verifiable Rewards (RLVR) to mathematical and coding domains has demonstrated significant improvements in the reasoning and problem-solving abilities of Large Language Models. Despite its success in single-generation problem solving, the reinforcement learning fine-tuning process may harm the model's exploration ability, as reflected in decreased diversity of generations and a resulting degradation of performance during Best-of-N sampling for large N values. In this work, we focus on optimizing the max@k metric, a continuous generalization of pass@k. We derive an unbiased on-policy gradient estimate for direct optimization of this metric. Furthermore, we extend our derivations to off-policy updates, a common element in modern RLVR algorithms, which allow better sample efficiency. Empirically, we show that our objective effectively optimizes the max@k metric in off-policy scenarios, aligning the model with the Best-of-N inference strategy.
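To make the metric concrete, here is a small sketch of how max@k can be estimated from n sampled generations with real-valued rewards. It assumes max@k means the expected maximum reward among k generations, and it uses the standard combinatorial average over all size-k subsets of the n samples; this estimates the metric only, not the gradient derived in the paper, and the function name `max_at_k` is hypothetical.

```python
from math import comb


def max_at_k(rewards, k):
    """Average of the maximum reward over all size-k subsets of `rewards`.

    After sorting in ascending order, the reward at 0-based position j is
    the maximum of exactly comb(j, k - 1) subsets (choose the other k - 1
    members from the j smaller-or-equal rewards before it).
    """
    n = len(rewards)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(rewards)")
    ordered = sorted(rewards)
    weighted = sum(comb(j, k - 1) * r for j, r in enumerate(ordered))
    return weighted / comb(n, k)
```

With binary rewards this reduces to the familiar pass@k estimate: for rewards `[0, 0, 1, 1]` and k = 2, five of the six pairs contain a success, so the value is 5/6. With k = 1 it is the mean reward, and with k = n it is the maximum.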




Short Boolean Formulas as Explanations in Practice

arXiv.org Artificial Intelligence

We investigate explainability via short Boolean formulas in the data model based on unary relations. As an explanation of length k, we take a Boolean formula of length k that minimizes the error with respect to the target attribute to be explained. We first provide novel quantitative bounds for the expected error in this scenario. We then also demonstrate how the setting works in practice by studying three concrete data sets. In each case, we calculate explanation formulas of different lengths using an encoding in Answer Set Programming. The most accurate formulas we obtain achieve errors similar to other methods on the same data sets. However, due to overfitting, these formulas are not necessarily ideal explanations, so we use cross-validation to identify a suitable length for explanations. By limiting to shorter formulas, we obtain explanations that avoid overfitting but are still reasonably accurate and also, importantly, human-interpretable.
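The idea of picking a short formula that minimizes error can be sketched with a brute-force search. This is a deliberate simplification and not the paper's method: it searches only conjunctions of at most `max_literals` literals rather than arbitrary Boolean formulas, and it uses plain enumeration in Python instead of an Answer Set Programming encoding. The names `conjunction_error` and `best_conjunction` are hypothetical.

```python
from itertools import combinations, product


def conjunction_error(literals, rows, target):
    """Fraction of rows where the conjunction disagrees with the target.

    literals: tuple of (attribute_index, required_value) pairs;
              the empty conjunction is always true.
    rows: list of tuples of booleans (one unary attribute per position)
    target: list of booleans, the attribute to be explained
    """
    errors = 0
    for row, y in zip(rows, target):
        prediction = all(row[i] == value for i, value in literals)
        errors += prediction != y
    return errors / len(rows)


def best_conjunction(rows, target, max_literals):
    """Return (literals, error) with the lowest error among short conjunctions.

    Ties are resolved in favor of the conjunction found first, which is
    also the shortest because sizes are enumerated in increasing order.
    """
    n_attrs = len(rows[0])
    best = ((), conjunction_error((), rows, target))
    for size in range(1, max_literals + 1):
        for attrs in combinations(range(n_attrs), size):
            for values in product([True, False], repeat=size):
                literals = tuple(zip(attrs, values))
                error = conjunction_error(literals, rows, target)
                if error < best[1]:
                    best = (literals, error)
    return best
```

Running the search for increasing values of `max_literals` on training data and scoring each result on held-out data mirrors, in a very reduced form, the use of cross-validation to choose an explanation length.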