sequential policy
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
CANDID DAC: Leveraging Coupled Action Dimensions with Importance Differences in DAC
Bordne, Philipp, Hasan, M. Asif, Bergman, Eddie, Awad, Noor, Biedenkapp, André
High-dimensional action spaces remain a challenge for dynamic algorithm configuration (DAC). Interdependencies and varying importance between action dimensions are further known key characteristics of DAC problems. We argue that these Coupled Action Dimensions with Importance Differences (CANDID) represent aspects of the DAC problem that are not yet fully explored. To address this gap, we introduce a new white-box benchmark within the DACBench suite that simulates the properties of CANDID. Further, we propose sequential policies as an effective strategy for managing these properties. Such policies factorize the action space and mitigate exponential growth by learning a policy per action dimension. At the same time, these policies accommodate the interdependence of action dimensions by fostering implicit coordination. We show this in an experimental study of value-based policies on our new benchmark. This study demonstrates that sequential policies significantly outperform independent learning of factorized policies in CANDID action spaces. In addition, they overcome the scalability limitations associated with learning a single policy across all action dimensions. The code used for our experiments is available under https://github.com/PhilippBordne/candidDAC.
- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Batch Bayesian Optimization via Simulation Matching
Bayesian optimization methods are often used to optimize unknown functions that are costly to evaluate. Typically, these methods sequentially select inputs to be evaluated one at a time based on a posterior over the unknown function that is updated after each evaluation. There are a number of effective sequential policies for selecting the individual inputs. In many applications, however, it is desirable to perform multiple evaluations in parallel, which requires selecting batches of multiple inputs to evaluate at once. In this paper, we propose a novel approach to batch Bayesian optimization, providing a policy for selecting batches of inputs with the goal of optimizing the function as efficiently as possible.
Treatment recommendation with distributional targets
Kock, Anders Bredahl, Preinerstorfer, David, Veliyev, Bezirgen
We study the problem of a decision maker who must provide the best possible treatment recommendation based on an experiment. The desirability of the outcome distribution resulting from the policy recommendation is measured through a functional capturing the distributional characteristic that the decision maker is interested in optimizing. This could be, e.g., its inherent inequality, welfare, level of poverty or its distance to a desired outcome distribution. If the functional of interest is not quasi-convex or if there are constraints, the optimal recommendation may be a mixture of treatments. This vastly expands the set of recommendations that must be considered. We characterize the difficulty of the problem by obtaining maximal expected regret lower bounds. Furthermore, we propose two regret-optimal policies. The first policy is static and thus applicable irrespectively of the subjects arriving sequentially or not in the course of the experimental phase. The second policy can utilize that subjects arrive sequentially by successively eliminating inferior treatments and thus spends the sampling effort where it is most needed.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Batch Bayesian Optimization via Simulation Matching
Azimi, Javad, Fern, Alan, Fern, Xiaoli Z.
Bayesian optimization methods are often used to optimize unknown functions that are costly to evaluate. Typically, these methods sequentially select inputs to be evaluated one at a time based on a posterior over the unknown function that is updated after each evaluation. There are a number of effective sequential policies for selecting the individual inputs. In many applications, however, it is desirable to perform multiple evaluations in parallel, which requires selecting batches of multiple inputs to evaluate at once. In this paper, we propose a novel approach to batch Bayesian optimization, providing a policy for selecting batches of inputs with the goal of optimizing the function as efficiently as possible.
Efficient Object Detection in Large Images using Deep Reinforcement Learning
Reinforcement Learning for Efficient Detection Reinforcement Learning (RL) has been recently used to (1) replace classical detectors such as SSD and Faster-RCNN, (2) replace exhaustive box proposal techniques in two-stage detectors, and (3) find ROIs in very large images to run a detector on. Most of the methods proposed in this categories focus on learning sequential policies. Under category (1), [3, 29] proposed a top-down sequential object detection models trained with Q-learning algorithm. Most of the RL methods associated with object detection fall into category (2). For example, [16] recursively divides up an image in a top-down approach where the divisions are decided by the RL agent. The box proposals returned by the agent are then passed through Fast-RCNN.
Efficient nonmyopic batch active search
Jiang, Shali, Malkomes, Gustavo, Abbott, Matthew, Moseley, Benjamin, Garnett, Roman
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition. We bridge this gap, addressing batch active search from both the theoretical and practical perspective. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the ``cost of parallelization.'' We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation. We conduct thorough experiments on data from three application domains: a citation network, material science, and drug discovery, testing all proposed policies (14 total) with a wide range of batch sizes. Our results demonstrate that the empirical performance gap matches our theoretical bound, that nonmyopic policies usually significantly outperform myopic alternatives, and that diversity is an important consideration for batch policy design.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
Efficient nonmyopic batch active search
Jiang, Shali, Malkomes, Gustavo, Abbott, Matthew, Moseley, Benjamin, Garnett, Roman
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition. We bridge this gap, addressing batch active search from both the theoretical and practical perspective. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the ``cost of parallelization.'' We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation. We conduct thorough experiments on data from three application domains: a citation network, material science, and drug discovery, testing all proposed policies (14 total) with a wide range of batch sizes. Our results demonstrate that the empirical performance gap matches our theoretical bound, that nonmyopic policies usually significantly outperform myopic alternatives, and that diversity is an important consideration for batch policy design.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
Budgeted Optimization with Constrained Experiments
Azimi, Javad, Fern, Xiaoli, Fern, Alan
Motivated by a real-world problem, we study a novel budgeted optimization problem where the goal is to optimize an unknown function f(.) given a budget by requesting a sequence of samples from the function. In our setting, however, evaluating the function at precisely specified points is not practically possible due to prohibitive costs. Instead, we can only request constrained experiments. A constrained experiment, denoted by Q, specifies a subset of the input space for the experimenter to sample the function from. The outcome of Q includes a sampled experiment x, and its function output f(x). Importantly, as the constraints of Q become looser, the cost of fulfilling the request decreases, but the uncertainty about the location x increases. Our goal is to manage this trade-off by selecting a set of constrained experiments that best optimize f(.) within the budget. We study this problem in two different settings, the non-sequential (or batch) setting where a set of constrained experiments is selected at once, and the sequential setting where experiments are selected one at a time. We evaluate our proposed methods for both settings using synthetic and real functions. The experimental results demonstrate the efficacy of the proposed methods.
- North America > United States > Oregon (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)