Risk-Controlling Model Selection via Guided Bayesian Optimization

Laufer-Goldshtein, Bracha; Fisch, Adam; Barzilay, Regina; Jaakkola, Tommi

Dec-4-2023–arXiv.org Machine Learning

Our goal in this paper is to find a configuration that adheres to user-specified limits on certain risks while remaining useful with respect to other, conflicting metrics. We solve this by combining Bayesian Optimization (BO) with rigorous risk-controlling procedures, where our core idea is to steer BO towards an efficient testing strategy. Our BO method identifies a set of Pareto optimal configurations residing in a designated region of interest. The resulting candidates are statistically verified, and the best-performing configuration is selected with guaranteed risk levels. We demonstrate the effectiveness of our approach on a range of tasks with multiple desiderata, including low error rates, equitable predictions, handling spurious correlations, managing rate and distortion in generative models, and reducing computational costs.

Deploying machine learning models in the real world requires balancing different performance aspects such as low error rate, equality in predictive decisions (Hardt et al., 2016; Pessach & Shmueli, 2022), robustness to spurious correlations (Sagawa et al., 2019; Yang et al., 2023), and model efficiency (Laskaridis et al., 2021; Menghani, 2023). In many cases, we can influence the model's behavior favorably via sets of hyperparameters that determine the model configuration. However, selecting a configuration that exactly meets user-defined requirements on test data is typically non-trivial, especially when there are many objectives and the configurations are costly to assess (e.g., when each new setting requires retraining a large neural network).

Bayesian Optimization (BO) is widely used for efficiently selecting configurations of functions that require expensive evaluation, such as hyperparameters that govern the model architecture or influence the training procedure (Shahriari et al., 2015; Wang et al., 2022; Bischl et al., 2023). The basic concept is to substitute the costly function of interest with a cheap and easily optimized probabilistic surrogate model. This surrogate is used to select promising candidate configurations while balancing exploration and exploitation.
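
To make the surrogate idea concrete, the following is a minimal sketch of a generic BO loop, not the guided method proposed in the paper. It assumes a single hyperparameter in [0, 1], a zero-mean Gaussian process surrogate with an RBF kernel, and a lower-confidence-bound acquisition rule. The function `expensive_objective`, the kernel length scale, and the `kappa` trade-off weight are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    v = np.linalg.solve(k_tt, k_qt.T)
    var = 1.0 - np.sum(k_qt * v.T, axis=1)
    # Clip tiny negative variances caused by round-off.
    return mean, np.sqrt(np.clip(var, 0.0, None))


def expensive_objective(x):
    """Hypothetical stand-in for a costly evaluation (e.g. retraining a model)."""
    return (x - 0.3) ** 2


def bayes_opt(n_iter=15, kappa=2.0, seed=0):
    """Minimize expensive_objective over [0, 1] with a GP surrogate."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)        # candidate configurations
    x_obs = rng.uniform(0.0, 1.0, size=3)    # small random initial design
    y_obs = expensive_objective(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        # Lower confidence bound: a low mean favors exploitation,
        # a high standard deviation favors exploration.
        acq = mean - kappa * std
        x_next = grid[np.argmin(acq)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, expensive_objective(x_next))
    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best]


if __name__ == "__main__":
    x_best, y_best = bayes_opt()
    print(f"best configuration: {x_best:.3f}, objective: {y_best:.5f}")
```

Each iteration fits the cheap surrogate to all evaluations so far and spends the next expensive evaluation where the acquisition value is lowest. Larger values of `kappa` shift the search towards unexplored regions, and smaller values concentrate it near the current best estimate.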


