batch active search
Efficient nonmyopic batch active search
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition.
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
Reviews: Efficient nonmyopic batch active search
This work investigates the different batch-mode extensions of an active search method called efficient nonmyopic policy (ENS) [12]. ENS achieves good performance efficiently because it assumes the sample selections are independent after a step [12]. This paper proposes two strategies: 1) converting the batch active search problem to sequential one by guessing the hidden labels of selected samples 2) try to enumerate all possible hidden labels of selected samples by Monte Carlo. Strength: Allowing to sample batches is important in practice. The work addresses several theoretical and practical challenges in many aspects of batch active search such as how difficult batch active search could be, why pessimistic oracle works well, and how to make the methods more efficient by pruning.
Efficient nonmyopic batch active search
Jiang, Shali, Malkomes, Gustavo, Abbott, Matthew, Moseley, Benjamin, Garnett, Roman
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the cost of parallelization.'' We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation.
Efficient nonmyopic batch active search
Jiang, Shali, Malkomes, Gustavo, Abbott, Matthew, Moseley, Benjamin, Garnett, Roman
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition. We bridge this gap, addressing batch active search from both the theoretical and practical perspective. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the ``cost of parallelization.'' We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation. We conduct thorough experiments on data from three application domains: a citation network, material science, and drug discovery, testing all proposed policies (14 total) with a wide range of batch sizes. Our results demonstrate that the empirical performance gap matches our theoretical bound, that nonmyopic policies usually significantly outperform myopic alternatives, and that diversity is an important consideration for batch policy design.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)
Efficient nonmyopic batch active search
Jiang, Shali, Malkomes, Gustavo, Abbott, Matthew, Moseley, Benjamin, Garnett, Roman
Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate \emph{multiple} points simultaneously; however, all existing work on active search focuses on sequential acquisition. We bridge this gap, addressing batch active search from both the theoretical and practical perspective. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the ``cost of parallelization.'' We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation. We conduct thorough experiments on data from three application domains: a citation network, material science, and drug discovery, testing all proposed policies (14 total) with a wide range of batch sizes. Our results demonstrate that the empirical performance gap matches our theoretical bound, that nonmyopic policies usually significantly outperform myopic alternatives, and that diversity is an important consideration for batch policy design.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Lebanon (0.04)