Bayesian Optimization with Resource Constraints and Production

AAAI Conferences

In this paper, we aim to take a step toward a tighter integration of automated planning and Bayesian Optimization (BO). BO is an approach for optimizing costly-to-evaluate functions by selecting a limited number of experiments that each evaluate the function at a specified input. Typical BO formulations assume that experiments are selected one at a time, or in fixed batches, and that experiments can be executed immediately upon request. This setup fails to capture many real-world domains where the execution of an experiment requires setup and preparation time. In this paper, we define a novel BO problem formulation that models the resources and activities needed to prepare and run experiments. We then present a planning approach, based on finite-horizon tree search, for scheduling the potentially concurrent experimental activities with the aim of best optimizing the function within a limited time horizon. A key element of the approach is a novel state evaluation function for evaluating leaves of the search tree, for which we prove approximate guarantees. We evaluate the approach on a number of diverse benchmark problems and show that it produces high-quality results compared to a number of natural baselines.


Search Portfolio with Sharing

AAAI Conferences

Over the years, a number of search algorithms have been proposed in AI literature, ranging from best-first to depth-first searches, from incomplete to optimal searches, from linear memory to unbounded memory searches, each with its own strengths and weaknesses. The variability in performance of these algorithms makes algorithm selection a hard problem, especially for performance-critical domains. Algorithm portfolios alleviate this problem by simultaneously running multiple algorithms to solve a given problem instance, exploiting their diversity. In general, the portfolio methods do not share information among candidate algorithms. Our work is based on the observation that if the algorithms within a portfolio can share information, it may significantly enhance the performance, as one algorithm can now utilize partial results computed by other algorithms. To this end, we introduce a new search framework, called Search Portfolio with Sharing (SP-S), which uses multiple algorithms to explore a given state-space in an integrated manner, seamlessly combining the partial solutions, while preserving the constraints/characteristics of the candidate algorithms. In addition, SP-S can be easily adapted to guarantee theoretical properties like completeness, bounded sub-optimality, and bounded re-expansions. We describe the basics of the SP-S framework and explain how different classes of search algorithms can be integrated in SP-S. We discuss its theoretical properties and present experimental results for multiple domains, demonstrating the utility of such a shared approach.


Monte Carlo tree search - Wikipedia, the free encyclopedia

#artificialintelligence

In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in game play. A leading example is recent computer Go programs, but it has also been used in other board games, as well as real-time video games and non-deterministic games such as poker. The focus of Monte Carlo tree search is on the analysis of the most promising moves, expanding the search tree based on random sampling of the search space. The application of Monte Carlo tree search in games is based on many playouts. In each playout, the game is played out to the very end by selecting moves at random.
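
The playout idea can be sketched on a toy game. The example below is an illustration only, not taken from the article: it assumes a simple take-away game (remove 1 to 3 stones, whoever takes the last stone wins), and the function names `legal_moves`, `random_playout`, and `evaluate_moves` are hypothetical. It performs flat Monte Carlo evaluation of each first move; a full MCTS would additionally grow a search tree and bias playouts toward promising moves.

```python
import random


def legal_moves(stones):
    """Moves in the assumed take-away game: remove 1, 2, or 3 stones."""
    return [k for k in (1, 2, 3) if k <= stones]


def random_playout(stones, player, rng):
    """Play uniformly random moves to the very end and return the winner (0 or 1).

    `player` is the side to move; `stones` must be at least 1.
    Whoever takes the last stone wins.
    """
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player
        player = 1 - player


def evaluate_moves(stones, playouts=2000, seed=0):
    """Estimate player 0's win rate for each legal first move via random playouts."""
    rng = random.Random(seed)
    rates = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            rates[move] = 1.0  # taking the last stone wins immediately
            continue
        # After player 0's move, player 1 is to move in the playout.
        wins = sum(random_playout(remaining, 1, rng) == 0 for _ in range(playouts))
        rates[move] = wins / playouts
    return rates


if __name__ == "__main__":
    rates = evaluate_moves(10)
    best = max(rates, key=rates.get)
    print(rates, "-> best estimated move:", best)
```

Because the playouts are purely random, the estimates reflect average outcomes against a random opponent, which is one reason practical MCTS concentrates its playouts on the most promising moves.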


Scaling Submodular Maximization via Pruned Submodularity Graphs

arXiv.org Machine Learning

We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization. The pruning is applied via a "submodularity graph" over the $n$ ground elements, where each directed edge is associated with a pairwise dependency defined by the submodular function. In each step, SS prunes a $1-1/\sqrt{c}$ (for $c>1$) fraction of the nodes using weights on edges computed based on only a small number ($O(\log n)$) of randomly sampled nodes. The algorithm requires $\log_{\sqrt{c}}n$ steps with a small and highly parallelizable per-step computation. An accuracy-speed tradeoff parameter $c$, set as $c = 8$, leads to a fast shrink rate $\sqrt{2}/4$ and small iteration complexity $\log_{2\sqrt{2}}n$. Analysis shows that w.h.p., the greedy algorithm on the pruned set of size $O(\log^2 n)$ can achieve a guarantee similar to that of processing the original dataset. In news and video summarization tasks, SS is able to substantially reduce both computational costs and memory usage, while maintaining (or even slightly exceeding) the quality of the original (and much more costly) greedy algorithm.
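
As a quick check of the arithmetic in the abstract, the sketch below computes the fraction of nodes kept per step, $1/\sqrt{c}$, and the stated step count $\log_{\sqrt{c}} n$. The function name `ss_schedule` is hypothetical, and the code only evaluates these two formulas; it does not implement the pruning algorithm itself.

```python
import math


def ss_schedule(n, c=8.0):
    """Return (fraction of nodes kept per step, number of steps log_{sqrt(c)} n).

    Each step prunes a 1 - 1/sqrt(c) fraction, so 1/sqrt(c) of the nodes survive.
    """
    keep = 1.0 / math.sqrt(c)
    steps = math.log(n) / math.log(math.sqrt(c))
    return keep, steps


if __name__ == "__main__":
    keep, steps = ss_schedule(n=1_000_000, c=8.0)
    # For c = 8, keep equals sqrt(2)/4, the shrink rate quoted in the abstract.
    print(f"keep fraction per step: {keep:.4f}")
    print(f"steps for n = 1e6: {steps:.2f}")
```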


KF: Escaping the Local Minimum

#artificialintelligence

This report is my final project for the MIT Media Lab Class "Integrative Theories of Mind and Cognition" (also known as Future of AI, and New Destinations in Artificial Intelligence) in Spring 2016. Artificial Intelligence performs gradient descent. The AI field discovers a path of success, and then travels that path until progress stops (when a local minimum is reached). Then, the field resets and chooses a new path, thus repeating the process. If this trend continues, AI should soon reach a local minimum, causing the next AI winter. However, recent methods provide an opportunity to escape the local minimum. To continue recent success, it is necessary to compare the current progress to all prior progress in AI. I begin this paper by pointing out a concerning pattern in the field of AI and describing how it can be useful to model the field's behavior. The paper is then divided into two main sections. In the first section of this paper, I argue that the field of artificial intelligence, itself, has been performing gradient descent. I catalog a repeating trend in the field: a string of successes, followed by a sudden crash, followed by a change in direction. In the second section, I describe steps that should be taken to prevent the current trends from falling into a local minimum. I present a number of tasks from past AI work that deep learning techniques are currently unable to accomplish. Finally, I summarize my findings and conclude by reiterating the use of the gradient descent model.


Artificial Intelligence Popular Search Algorithms

#artificialintelligence

Searching is the universal technique of problem solving in AI. Single-player games such as tile puzzles, Sudoku, and crosswords are typical examples. Search algorithms help you find a particular position in such games. Games such as the 3×3 eight-tile, 4×4 fifteen-tile, and 5×5 twenty-four-tile puzzles are single-agent pathfinding challenges. They consist of a matrix of tiles with a blank tile.
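
To make the tile-puzzle search concrete, here is a minimal sketch of breadth-first search on the 3×3 eight-tile puzzle. It is an illustration under stated assumptions rather than the article's own code: the goal layout `GOAL`, the use of `0` for the blank tile, and the function names `neighbors` and `bfs_solve` are all choices made for this example.

```python
from collections import deque

# Assumed goal layout, read row by row; 0 marks the blank tile.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every state reachable by sliding one adjacent tile into the blank."""
    i = state.index(0)
    row, col = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            j = r * 3 + c
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            yield tuple(nxt)


def bfs_solve(start, goal=GOAL):
    """Return the minimum number of moves from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, depth = queue.popleft()
        for nxt in neighbors(state):
            if nxt == goal:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


if __name__ == "__main__":
    start = (1, 2, 3, 4, 5, 6, 0, 7, 8)
    print("moves needed:", bfs_solve(start))
```

Breadth-first search explores positions in order of distance from the start, so the first time it reaches the goal the move count is minimal. For the larger 4×4 and 5×5 puzzles the state space grows too quickly for plain breadth-first search, which is where heuristic methods become relevant.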


Optimal Any-Angle Pathfinding In Practice

Journal of Artificial Intelligence Research

Any-angle pathfinding is a fundamental problem in robotics and computer games. The goal is to find a shortest path between a pair of points on a grid map such that the path is not artificially constrained to the points of the grid. Prior research has focused on approximate online solutions. A number of exact methods exist but they all require super-linear space and pre-processing time. In this study, we describe Anya: a new and optimal any-angle pathfinding algorithm. Where other works find approximate any-angle paths by searching over individual points from the grid, Anya finds optimal paths by searching over sets of states represented as intervals. Each interval is identified on-the-fly. From each interval Anya selects a single representative point that it uses to compute an admissible cost estimate for the entire set. Anya always returns an optimal path if one exists. Moreover it does so without any offline pre-processing or the introduction of additional memory overheads. In a range of empirical comparisons we show that Anya is competitive with several recent (sub-optimal) online and pre-processing based techniques and is up to an order of magnitude faster than the most common benchmark algorithm, a grid-based implementation of A*.


Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

arXiv.org Machine Learning

We consider the problem of estimating a function defined over $n$ locations on a $d$-dimensional grid (having all side lengths equal to $n^{1/d}$). When the function is constrained to have discrete total variation bounded by $C_n$, we derive the minimax optimal (squared) $\ell_2$ estimation error rate, parametrized by $n$ and $C_n$. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Several simpler estimators exist, such as Laplacian smoothing and Laplacian eigenmaps. A natural question is: can these simpler estimators perform just as well? We prove that these estimators, and more broadly all estimators given by linear transformations of the input data, are suboptimal over the class of functions with bounded variation. This extends fundamental findings of Donoho and Johnstone [1998] on 1-dimensional total variation spaces to higher dimensions. The implication is that the computationally simpler methods cannot be used for such sophisticated denoising tasks, without sacrificing statistical accuracy. We also derive minimax rates for discrete Sobolev spaces over $d$-dimensional grids, which are, in some sense, smaller than the total variation function spaces. Indeed, these are small enough spaces that linear estimators can be optimal---and a few well-known ones are, such as Laplacian smoothing and Laplacian eigenmaps, as we show. Lastly, we investigate the problem of adaptivity of the total variation denoiser to these smaller Sobolev function spaces.


Combinatorial Topic Models using Small-Variance Asymptotics

arXiv.org Machine Learning

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In contrast, we study topic modeling as a combinatorial optimization problem, and propose a new objective function derived from LDA by passing to the small-variance limit. We minimize the derived objective by using ideas from combinatorial optimization, which results in a new, fast, and high-quality topic modeling algorithm. In particular, we show that our results are competitive with popular LDA-based topic modeling approaches, and also discuss the (dis)similarities between our approach and its probabilistic counterparts.


Researchers create Rubik's cube-like touchscreen display

Engadget

In a case proposed by the research team, a device like a flat smartphone could be folded and reconfigured into the shape of a game controller. Or, in a less practical example, you could simply roll your phone out into a rectangular log with a postage-stamp sized display on one end. For users who were never very good at spatial reasoning or origami, an algorithm will help determine the best way to twist and fold the screen into the desired shape. While the device is still in the awkward prototype phase at this point, the research team will present it to a panel at the International Conference on Robotics and Automation in Stockholm later this week.