Goto

Collaborating Authors

 Search


A Phase Space Approach to Minimax Entropy Learning and the Minutemax Approximations

Neural Information Processing Systems

There has been much recent work on measuring image statistics and on learning probability distributions on images. We observe that the mapping from images to statistics is many-to-one and show it can be quantified by a phase space factor. This phase space approach throws light on the Minimax Entropy technique for learning Gibbs distributions on images with potentials derived from image statistics and elucidates the ambiguities that are inherent to determining the potentials. In addition, it shows that if the phase factor can be approximated by an analytic distribution then this approximation yields a swift "Minutemax" algorithm that vastly reduces the computation time for Minimax entropy learning. An illustration of this concept, using a Gaussian to approximate the phase factor, gives a good approximation to the results of Zhu and Mumford (1997) in just seconds of CPU time.


Learning Instance-Independent Value Functions to Enhance Local Search

Neural Information Processing Systems

Reinforcement learning methods can be used to improve the performance of local search algorithms for combinatorial optimization by learning an evaluation function that predicts the outcome of search. The eval(cid:173) uation function is therefore able to guide search to low-cost solutions better than can the original cost function. We describe a reinforcement learning method for enhancing local search that combines aspects of pre(cid:173) vious work by Zhang and Dietterich (1995) and Boyan and Moore (1997, Boyan 1998). In an off-line learning phase, a value function is learned that is useful for guiding search for multiple problem sizes and instances. We illustrate our technique by developing several such functions for the Dial-A-Ride Problem.


Minimax Probability Machine

Neural Information Processing Systems

When constructing a classifier, the probability of correct classifi(cid:173) cation of future data points should be maximized. In the current paper this desideratum is translated in a very direct way into an optimization problem, which is solved using methods from con(cid:173) vex optimization. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries. A worst-case bound on the probability of misclassification of future data is ob(cid:173) tained explicitly.


Incremental A*

Neural Information Processing Systems

Incremental search techniques find optimal solutions to series of similar search tasks much faster than is possible by solving each search task from scratch. While researchers have developed incremental versions of uninformed search methods, we develop an incremental version of A. The first search of Lifelong Planning A is the same as that of A* but all subsequent searches are much faster because it reuses those parts of the previous search tree that are identical to the new search tree. We then present experimental results that demonstrate the advantages of Lifelong Planning A* for simple route planning tasks. Most of the research on these methods has studied how to solve one-shot search problems. However, search is often a repetitive process, where one needs to solve a series of similar search tasks, for example, because the actual situation turns out to be slightly different from the one initially assumed or because the situation changes over time.


Constructing Distributed Representations Using Additive Clustering

Neural Information Processing Systems

Many cognitive models posit mental representations based on discrete substructures. Even connectionist models whose processing involves manipulation of real-valued activations typically represent objects as patterns of 0s and 1s across a set of units (Noelle, Cottrell, and Wilms, 1997). Often, individual units are taken to represent specific features of the objects and two representations will share features to the degree to which the two objects are similar. While this arrangement is intuitively appealing, it can be difficult to construct the features to be used in such a model. Using random feature assignments clouds the relationship between the model and the objects it is intended to represent, diminishing the model's value. As Clouse and Cottrell (1996) point out, hand-crafted representations are tedious to construct and it can be difficult to precisely justify (or even articulate) the principles that guided their design. These difficulties effectively limit the number of objects that can be encoded, constraining modeling efforts to small examples. In this paper, we investigate methods for automatically synthesizing feature-based representations directly from the pairwise object similarities that the model is intended to respect.


Minimax Differential Dynamic Programming: An Application to Robust Biped Walking

Neural Information Processing Systems

We developed a robust control policy design method in high-dimensional state space by using differential dynamic programming with a minimax criterion. As an example, we applied our method to a simulated five link biped robot. The results show lower joint torques from the optimal con- trol policy compared to a hand-tuned PD servo controller. Results also show that the simulated biped robot can successfully walk with unknown disturbances that cause controllers generated by standard differential dy- namic programming and the hand-tuned PD servo to fail. Learning to compensate for modeling error and previously unknown disturbances in conjunction with robust control design is also demonstrated.


Speeding up the Parti-Game Algorithm

Neural Information Processing Systems

In this paper, we introduce an efficient replanning algorithm for nonde- terministic domains, namely what we believe to be the first incremental heuristic minimax search algorithm. We apply it to the dynamic dis- cretization of continuous domains, resulting in an efficient implemen- tation of the parti-game reinforcement-learning algorithm for control in high-dimensional domains.


A Formulation for Minimax Probability Machine Regression

Neural Information Processing Systems

We formulate the regression problem as one of maximizing the mini- mum probability, symbolized by (cid:10), that future predicted outputs of the regression model will be within some (cid:6)" bound of the true regression function. Our formulation is unique in that we obtain a direct estimate of this lower probability bound (cid:10). The proposed framework, minimax probability machine regression (MPMR), is based on the recently de- scribed minimax probability machine classification algorithm [Lanckriet et al.] and uses Mercer Kernels to obtain nonlinear regression models. MPMR is tested on both toy and real world data, verifying the accuracy of the (cid:10) bound, and the efficacy of the regression models.


Near-Minimax Optimal Classification with Dyadic Classification Trees

Neural Information Processing Systems

The classifiers are based on dyadic classification trees (DCTs), which involve adaptively pruned partitions of the feature space. A key aspect of DCTs is their spatial adaptivity, which enables lo- cal (rather than global) fitting of the decision boundary. Our risk analysis involves a spatial decomposition of the usual concentration inequalities, leading to a spatially adaptive, data-dependent pruning criterion. For any distribution on (X, Y) whose Bayes decision boundary behaves locally like a Lipschitz smooth function, we show that the DCT error converges to the Bayes error at a rate within a logarithmic factor of the minimax optimal rate. We also study DCTs equipped with polynomial classifica- tion rules at each leaf, and show that as the smoothness of the boundary increases their errors converge to the Bayes error at a rate approaching n 1/2, the parametric rate.


ARA*: Anytime A* with Provable Bounds on Sub-Optimality

Neural Information Processing Systems

In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasi- ble solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution.