Offline Model-Based Optimization: Comprehensive Review
Kim, Minsu, Gu, Jiayao, Yuan, Ye, Yun, Taeyoung, Liu, Zixuan, Bengio, Yoshua, Chen, Can
arXiv.org Artificial Intelligence
Offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. This setting is particularly relevant when querying the objective function is prohibitively expensive or infeasible, with applications spanning protein engineering, material discovery, neural architecture search, and beyond. The main difficulty lies in accurately estimating the objective landscape beyond the available data, where extrapolations are fraught with significant epistemic uncertainty. This uncertainty can lead to objective hacking (reward hacking), in which the optimizer exploits model inaccuracies in unseen regions, or to other spurious optimizations that yield misleadingly high performance estimates outside the training distribution. Recent advances in model-based optimization (MBO) have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Trained with carefully designed strategies, these models are more robust against out-of-distribution issues, facilitating the discovery of improved designs. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review. To bridge this gap, we present the first thorough review of offline MBO. We begin by formalizing the problem for both single-objective and multi-objective settings and by reviewing recent benchmarks and evaluation metrics. We then categorize existing approaches into two key areas: surrogate modeling, which emphasizes accurate function approximation in out-of-distribution regions, and generative modeling, which explores high-dimensional design spaces to identify high-performing designs. Finally, we examine the key challenges and propose promising directions for advancement in this rapidly evolving field, including safe control of superintelligent systems.
Mar-21-2025
- Country:
- South America > Chile
- Oceania
- Palau (0.04)
- Australia > New South Wales
- Sydney (0.04)
- North America
- United States
- Maryland > Baltimore (0.04)
- Nevada (0.04)
- Washington > King County
- Bellevue (0.04)
- New York
- New York County > New York City (0.14)
- Richmond County > New York City (0.04)
- Queens County > New York City (0.04)
- Kings County > New York City (0.04)
- Bronx County > New York City (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.05)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California > Los Angeles County
- Long Beach (0.14)
- Canada
- Quebec > Montreal (0.14)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- Europe
- Austria > Vienna (0.14)
- France (0.04)
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- Sweden > Stockholm
- Stockholm (0.04)
- Spain
- Valencian Community > Valencia Province
- Valencia (0.04)
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Andalusia > Cádiz Province
- Cadiz (0.04)
- Netherlands > North Brabant
- Eindhoven (0.04)
- Finland > Uusimaa
- Helsinki (0.04)
- Africa
- Rwanda > Kigali
- Kigali (0.04)
- Ethiopia > Addis Ababa
- Addis Ababa (0.04)
- Genre:
- Research Report (1.00)
- Overview (1.00)
- Industry:
- Energy (0.92)
- Education (0.67)
- Health & Medicine
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Natural Language (1.00)
- Cognitive Science (1.00)
- Representation & Reasoning
- Uncertainty > Bayesian Inference (1.00)
- Optimization (1.00)
- Machine Learning
- Statistical Learning (1.00)
- Neural Networks > Deep Learning (1.00)
- Evolutionary Systems (1.00)
- Learning Graphical Models > Directed Networks
- Bayesian Learning (1.00)