Strong Asymptotic Assertions for Discrete MDL in Regression and Classification

arXiv.org Artificial Intelligence

We study the properties of the MDL (or maximum penalized complexity) estimator for Regression and Classification, where the underlying model class is countable. In particular, we show a finite bound on the Hellinger losses under the sole assumption that a "true" model is contained in the class. This implies almost sure convergence of the predictive distribution to the true one at a fast rate. The result corresponds to Solomonoff's central theorem of universal induction, though with a bound that is exponentially larger.
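
As a rough illustration of the discrete MDL estimator discussed in this abstract, the sketch below selects, from a small finite stand-in for the countable model class, the model minimizing the two-part code length -log P(data | model) plus the model's own code length. All names (`mdl_select`, the Bernoulli grid, the code lengths) are illustrative assumptions, not taken from the paper.

```python
import math

def mdl_select(models, data):
    """Two-part MDL: minimise -log2 P(data | model) + codelength(model), in bits.

    `models` is a list of (name, codelength_bits, prob_fn) triples, where
    prob_fn(x) is the probability the model assigns to a single observation.
    A finite stand-in for the countable class considered in the paper.
    """
    best_name, best_cost = None, float("inf")
    for name, codelength_bits, prob_fn in models:
        nll_bits = -sum(math.log2(prob_fn(x)) for x in data)  # data code length
        cost = nll_bits + codelength_bits                      # total description length
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost

def bernoulli(p):
    return lambda x: p if x == 1 else 1.0 - p

# Bernoulli coins on a small parameter grid; later entries pay a larger code length.
models = [(f"Bernoulli({p:.2f})", 2 + k, bernoulli(p))
          for k, p in enumerate([0.5, 0.25, 0.75, 0.1, 0.9])]
data = [1, 1, 0, 1, 1, 1, 0, 1]
print(mdl_select(models, data))
```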


Reinforcement Learning for Agents with Many Sensors and Actuators Acting in Categorizable Environments

Journal of Artificial Intelligence Research

In this paper, we confront the problem of applying reinforcement learning to agents that perceive the environment through many sensors and that can perform parallel actions using many actuators, as is the case in complex autonomous robots. We argue that reinforcement learning can only be successfully applied to this case if strong assumptions are made on the characteristics of the environment in which the learning is performed, so that the relevant sensor readings and motor commands can be readily identified. The introduction of such assumptions leads to strongly biased learning systems that may lose the generality of traditional reinforcement-learning algorithms. Along these lines, we observe that, in realistic situations, the reward received by the robot depends only on a reduced subset of all the executed actions and that only a reduced subset of the sensor inputs (possibly different in each situation and for each action) is relevant to predict the reward. We formalize this property in the so-called 'categorizability assumption', and we present an algorithm that takes advantage of the categorizability of the environment, allowing a decrease in the learning time with respect to existing reinforcement-learning algorithms. Results of the application of the algorithm to two simulated, realistic robotic problems (landmark-based navigation and six-legged robot gait generation) are reported to validate our approach and to compare it to existing flat and generalization-based reinforcement-learning approaches.
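
A toy illustration of the categorizability assumption described above (not the authors' algorithm): the reward depends on only a small hidden subset of many sensor bits, so simple per-sensor statistics recover the relevant inputs far faster than a flat learner over the full joint state. All names and numbers below are hypothetical.

```python
import random

N_SENSORS = 20
RELEVANT = {3, 11}            # hidden: reward depends only on these sensor bits

def reward(obs):
    # reward is 1 exactly when both relevant sensor bits are on
    return 1.0 if all(obs[i] for i in RELEVANT) else 0.0

def mean(xs):
    return sum(xs) / len(xs) if xs else 0.0

def estimate_relevance(samples):
    """Score each sensor by how much conditioning on it shifts the mean reward."""
    scores = []
    for i in range(N_SENSORS):
        on = [r for obs, r in samples if obs[i]]
        off = [r for obs, r in samples if not obs[i]]
        scores.append(abs(mean(on) - mean(off)))
    return scores

samples = []
for _ in range(2000):
    obs = [random.random() < 0.5 for _ in range(N_SENSORS)]
    samples.append((obs, reward(obs)))

scores = estimate_relevance(samples)
top_two = sorted(range(N_SENSORS), key=lambda i: -scores[i])[:2]
print("estimated relevant sensors:", sorted(top_two), "true:", sorted(RELEVANT))
```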


Restricted Value Iteration: Theory and Algorithms

Journal of Artificial Intelligence Research

Value iteration is a popular algorithm for finding near-optimal policies for POMDPs. It is inefficient due to the need to account for the entire belief space, which necessitates the solution of large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-optimal policies and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.
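
To make the idea of value iteration restricted to a belief subset concrete, here is a minimal point-based stand-in: backups are performed only at beliefs in a fixed set rather than over the whole simplex. The tiny two-state POMDP and all names are illustrative assumptions; the paper additionally characterises which belief subsets preserve near-optimality, which this sketch does not address.

```python
import numpy as np

# A tiny illustrative POMDP (2 states, 2 actions, 2 observations); not from the paper.
S, A, O = 2, 2, 2
T = np.array([[[0.9, 0.1], [0.1, 0.9]],     # T[a, s, s'] transition probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.2, 0.8]],     # Z[a, s', o] observation probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                   # R[a, s] immediate reward
              [0.0, 1.0]])
gamma = 0.95

def restricted_backup(belief_set, alphas):
    """One value-iteration backup performed only at the beliefs in `belief_set`."""
    new_alphas = []
    for b in belief_set:
        best_val, best_alpha = -np.inf, None
        for a in range(A):
            alpha_a = R[a].copy()
            for o in range(O):
                # g[i, s] = sum_{s'} T[a, s, s'] * Z[a, s', o] * alphas[i][s']
                g = np.array([T[a] @ (Z[a, :, o] * alpha) for alpha in alphas])
                alpha_a += gamma * g[np.argmax(g @ b)]
            if alpha_a @ b > best_val:
                best_val, best_alpha = alpha_a @ b, alpha_a
        new_alphas.append(best_alpha)
    return new_alphas

# Back up only over a small belief subset instead of the whole belief simplex.
belief_set = [np.array([1.0, 0.0]), np.array([0.5, 0.5]), np.array([0.0, 1.0])]
alphas = [np.zeros(S)]
for _ in range(50):
    alphas = restricted_backup(belief_set, alphas)
print("value at [0.5, 0.5]:", max(float(a @ np.array([0.5, 0.5])) for a in alphas))
```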


Combining Spatial and Temporal Logics: Expressiveness vs. Complexity

Journal of Artificial Intelligence Research

In this paper, we construct and investigate a hierarchy of spatio-temporal formalisms that result from various combinations of propositional spatial and temporal logics such as the propositional temporal logic PTL, the spatial logics RCC-8, BRCC-8, S4u and their fragments. The obtained results give a clear picture of the trade-off between expressiveness and 'computational realisability' within the hierarchy. We demonstrate how different combining principles as well as spatial and temporal primitives can produce NP-, PSPACE-, EXPSPACE-, 2EXPSPACE-complete, and even undecidable spatio-temporal logics out of components that are at most NP- or PSPACE-complete.


Finding Approximate POMDP Solutions Through Belief Compression

Journal of Artificial Intelligence Research

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional subspace embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this subspace can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta & Schapire, 2002) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.
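
A heavily simplified sketch of the belief-compression idea: collect beliefs, learn a low-dimensional representation, and plan over the learned features. Ordinary PCA from scikit-learn is used here as a stand-in for the Exponential family PCA (E-PCA) the paper actually uses; the simulated beliefs and all names are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Simulate beliefs that live near a 1-D family inside a 50-state belief simplex:
# Gaussian-shaped bumps whose centre moves along the state space.
n_states, n_beliefs = 50, 200
centres = rng.uniform(5, 45, size=n_beliefs)
grid = np.arange(n_states)
beliefs = np.exp(-0.5 * ((grid[None, :] - centres[:, None]) / 3.0) ** 2)
beliefs /= beliefs.sum(axis=1, keepdims=True)

# Compress the 50-dimensional beliefs to a handful of learned features
# (ordinary PCA here; E-PCA instead reconstructs through an exponential-family
# link, which better respects the sparsity and non-negativity of beliefs).
pca = PCA(n_components=3).fit(beliefs)
features = pca.transform(beliefs)            # low-dimensional belief features
reconstructed = pca.inverse_transform(features)

err = np.abs(reconstructed - beliefs).sum(axis=1).mean()
print("mean L1 reconstruction error per belief:", round(float(err), 4))
# A planner would then operate on `features` instead of the full belief simplex.
```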


Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis

Neural Information Processing Systems

In applying Hidden Markov Models to the analysis of massive data streams, it is often necessary to use an artificially reduced set of states; this is due in large part to the fact that the basic HMM estimation algorithms have a quadratic dependence on the size of the state set. We present algorithms that reduce this computational bottleneck to linear or near-linear time, when the states can be embedded in an underlying grid of parameters. This type of state representation arises in many domains; in particular, we show an application to traffic analysis at a high-volume Web site.
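
One way a grid-embedded state set can remove the quadratic bottleneck mentioned above is when transition probabilities depend only on the displacement between grid positions: the O(K^2) sum in the forward recursion then collapses to a short convolution. The sketch below contrasts the two computations on synthetic numbers; it illustrates that structural idea, not the paper's specific algorithms.

```python
import numpy as np

K = 1000                               # number of HMM states embedded in a 1-D grid
rng = np.random.default_rng(1)

# Random-walk-style transition kernel over grid displacements -5..5.
offsets = np.arange(-5, 6)
kernel = np.exp(-0.5 * (offsets / 2.0) ** 2)
kernel /= kernel.sum()

alpha = rng.random(K)                  # previous (unnormalised) forward message
emission = rng.random(K)               # P(observation | state) for the current step

# Naive O(K^2) forward step: build the full transition matrix and multiply.
T = np.zeros((K, K))
for d, w in zip(offsets, kernel):
    T += w * np.eye(K, k=d)            # T[i, j] = kernel weight for displacement j - i
naive = emission * (T.T @ alpha)

# Structured forward step: the same sum as a length-11 convolution, O(K * |kernel|)
# (and near-linear with FFT-based convolution for long kernels).
fast = emission * np.convolve(alpha, kernel, mode="same")

print("max |naive - fast| =", float(np.abs(naive - fast).max()))
```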


Kernels for Structured Natural Language Data

Neural Information Processing Systems

In this paper, we focus on tasks in the application areas of NLP, such as Machine Translation, Text Summarization, Text Categorization and Question Answering.


An MCMC-Based Method of Comparing Connectionist Models in Cognitive Science

Neural Information Processing Systems

Despite the popularity of connectionist models in cognitive science, their performance can often be difficult to evaluate. Inspired by the geometric approach to statistical model selection, we introduce a conceptually similar method to examine the global behavior of a connectionist model, by counting the number and types of response patterns it can simulate. The Markov Chain Monte Carlo-based algorithm that we constructed finds these patterns efficiently. We demonstrate the approach using two localist network models of speech perception.
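
A rough sketch of the pattern-counting idea (not the authors' algorithm or their speech-perception models): sample the parameters of a tiny feedforward network with random-walk Metropolis under a broad prior, and record every distinct binary response pattern the network produces over a fixed stimulus set. The architecture, prior, and names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# A tiny feedforward "connectionist model": 2 inputs -> 3 hidden -> 1 output.
def response_pattern(weights, stimuli):
    W1 = weights[:6].reshape(3, 2)
    b1 = weights[6:9]
    W2 = weights[9:12]
    b2 = weights[12]
    hidden = np.tanh(stimuli @ W1.T + b1)
    out = hidden @ W2 + b2
    return tuple(int(v) for v in (out > 0))   # binary response over the stimulus set

stimuli = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

def log_prior(w, scale=3.0):
    return -0.5 * np.sum((w / scale) ** 2)

# Random-walk Metropolis over the 13 weights, collecting distinct patterns.
w = rng.normal(size=13)
patterns = set()
for _ in range(20000):
    proposal = w + 0.5 * rng.normal(size=13)
    if np.log(rng.random()) < log_prior(proposal) - log_prior(w):
        w = proposal
    patterns.add(response_pattern(w, stimuli))

print("distinct response patterns found:", len(patterns))
```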


Application of SVMs for Colour Classification and Collision Detection with AIBO Robots

Neural Information Processing Systems

This article addresses the issues of colour classification and collision detection as they occur in the legged league robot soccer environment of RoboCup. We show how the method of one-class classification with support vector machines (SVMs) can be applied to solve these tasks satisfactorily using the limited hardware capacity of the prescribed Sony AIBO quadruped robots. The experimental evaluation shows an improvement over our previous methods of ellipse fitting for colour classification and the statistical approach used for collision detection.
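
A minimal sketch of one-class colour classification with an SVM, assuming synthetic stand-in pixel values rather than AIBO camera data: fit scikit-learn's OneClassSVM on positive samples of the target colour only, then flag new pixels as inside or outside that colour class. Hyperparameters below are illustrative, not those used in the article.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)

# Positive pixel samples of one target colour (e.g. the ball), in a YUV-like space.
# Synthetic values for illustration only.
target_pixels = rng.normal(loc=[200.0, 110.0, 160.0], scale=8.0, size=(500, 3))

clf = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.05)
clf.fit(target_pixels)                 # learns the region occupied by the colour class

test_pixels = np.array([[201.0, 112.0, 158.0],   # close to the training colour
                        [ 60.0,  90.0, 120.0]])  # far from it
print(clf.predict(test_pixels))                  # +1 = target colour, -1 = not
```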


On the Dynamics of Boosting

Neural Information Processing Systems

In order to understand AdaBoost's dynamics, especially its ability to maximize margins, we derive an associated simplified nonlinear iterated map and analyze its behavior in low-dimensional cases. We find stable cycles for these cases, which can explicitly be used to solve for AdaBoost's output. By considering AdaBoost as a dynamical system, we are able to prove Rätsch and Warmuth's conjecture that AdaBoost may fail to converge to a maximal-margin combined classifier when given a 'non-optimal' weak learning algorithm.
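
To make the margin quantity concrete, the sketch below runs textbook AdaBoost with threshold stumps on a toy 1-D problem and tracks the minimum normalised margin each round. It is not the paper's simplified iterated map; the data and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy 1-D data with alternating label bands, so no single stump separates it.
X = rng.uniform(0.0, 1.0, size=40)
y = np.sign(np.sin(4.0 * np.pi * X))

def stump_predict(X, t, s):
    # threshold stump h(x) = s * sign(x - t), breaking ties towards +1
    return s * np.sign(X - t + 1e-12)

def best_stump(X, y, w):
    """Return (weighted error, threshold, sign) of the best stump on weights w."""
    best = None
    for t in np.unique(X):
        for s in (+1.0, -1.0):
            err = w[stump_predict(X, t, s) != y].sum()
            if best is None or err < best[0]:
                best = (err, t, s)
    return best

# Textbook AdaBoost with stumps, tracking the minimum normalised margin per round.
w = np.ones(len(X)) / len(X)
stumps, alphas = [], []
for round_ in range(1, 31):
    err, t, s = best_stump(X, y, w)
    err = min(max(err, 1e-10), 1.0 - 1e-10)
    a = 0.5 * np.log((1.0 - err) / err)
    w = w * np.exp(-a * y * stump_predict(X, t, s))
    w /= w.sum()
    stumps.append((t, s))
    alphas.append(a)

    F = sum(a_ * stump_predict(X, t_, s_) for a_, (t_, s_) in zip(alphas, stumps))
    min_margin = float((y * F / np.sum(alphas)).min())
    if round_ % 10 == 0:
        print(f"round {round_:2d}: minimum normalised margin = {min_margin:.3f}")
```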