Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning

AI Magazine

Sequential decision tasks present many opportunities for the study of transfer learning. A principal one among them is the existence of multiple domains that share the same underlying causal structure for actions. We describe an approach that exploits this shared causal structure to discover a hierarchical task structure in a source domain, which in turn speeds up learning of task execution knowledge in a new target domain. Our approach is theoretically justified and compares favorably to manually designed task hierarchies in learning efficiency in the target domain. We demonstrate that causally motivated task hierarchies transfer more robustly than other kinds of detailed knowledge that depend on the idiosyncrasies of the source domain and are hence less transferable.


Quantum Structure in Cognition: Fundamentals and Applications

arXiv.org Artificial Intelligence

Experiments in cognitive science and decision theory show that the ways in which people combine concepts and make decisions cannot be described by classical logic and probability theory. This has serious implications for applied disciplines such as information retrieval, artificial intelligence and robotics. Inspired by a mathematical formalism that generalizes quantum mechanics, the authors have constructed a contextual framework for both concept representation and decision making, together with quantum models that are in strong alignment with experimental data. The results can be interpreted by assuming the existence in human thought of a double-layered structure, a 'classical logical thought' and a 'quantum conceptual thought', the latter being responsible for the above paradoxes and nonclassical effects. The presence of a quantum structure in cognition is relevant, for it shows that quantum mechanics provides not only a useful modeling tool for experimental data but also a structural model for human and artificial thought processes. This approach has strong connections with theories formalizing meaning, such as semantic analysis, and also has a deep impact on computer science, information retrieval and artificial intelligence. More specifically, the links with information retrieval are discussed in this paper.
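To see what the nonclassical effects amount to: classical probability forces the membership weight of a conjunction to satisfy mu(A and B) <= min{mu(A), mu(B)}, a bound that subjects systematically violate in experiments such as Hampton's. A standard quantum model from this literature, sketched here as an illustration rather than as the authors' exact construction, represents concepts as unit vectors in a Hilbert space and derives the conjunction weight with an interference term:

    \mu(A \text{ and } B) \;=\; \tfrac{1}{2}\big(\mu(A) + \mu(B)\big) \;+\; \sqrt{\mu(A)\,\mu(B)}\,\cos\theta

where theta is an interference angle fitted to the data; a positive \cos\theta produces the observed overextension beyond the classical bound.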


Quantum Interaction Approach in Cognition, Artificial Intelligence and Robotics

arXiv.org Artificial Intelligence

The mathematical formalism of quantum mechanics has been successfully employed in recent years to model situations in which the use of classical structures is problematic, and where typically quantum effects, such as 'contextuality' and 'entanglement', have been recognized. This 'Quantum Interaction Approach' is briefly reviewed in this paper, focusing in particular on the quantum models that have been elaborated to describe how concepts combine in cognitive science, and on the ensuing identification of a quantum structure in human thought. We point out that these results provide interesting insights toward the development of a unified theory for the formalization and representation of meaning and knowledge. We then analyze the technological aspects and implications of our approach, devoting particular attention to the connections with symbolic artificial intelligence, quantum computation and robotics.


Adding noise to the input of a model trained with a regularized objective

arXiv.org Artificial Intelligence

Regularization is a well-studied problem in the context of neural networks. It is usually used to improve generalization performance when the number of input samples is relatively small or heavily contaminated with noise. The regularization of a parametric model can be achieved in different ways, such as early stopping (Morgan and Bourlard, 1990), weight decay, and output smoothing, all of which are used to avoid overfitting during training of the considered model. From a Bayesian point of view, many regularization techniques correspond to imposing certain prior distributions on model parameters (Krogh and Hertz, 1991). Using Bishop's approximation (Bishop, 1995) of the objective function when a restricted type of noise is added to the input of a parametric function, we derive the higher-order terms of the Taylor expansion and analyze the coefficients of the regularization terms induced by the noisy input. In particular, we study the effect of penalizing the Hessian of the mapping function with respect to the input in terms of generalization performance. We also show how this coefficient can be controlled independently by explicitly penalizing the Jacobian of the mapping function on corrupted inputs.
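To make the connection concrete, here is a second-order expansion in the spirit of Bishop's result, sketched under the assumption of squared error and isotropic Gaussian input noise \varepsilon \sim \mathcal{N}(0, \sigma^2 I) (the paper's own expansion may differ in its details):

    \mathbb{E}_{\varepsilon}\big[(f(x+\varepsilon)-y)^2\big] \;\approx\; (f(x)-y)^2 \;+\; \sigma^2\,\|\nabla_x f(x)\|^2 \;+\; \sigma^2\,(f(x)-y)\,\operatorname{tr}\big(\nabla_x^2 f(x)\big)

The \|\nabla_x f(x)\|^2 term is the classical Tikhonov (Jacobian) regularizer, while the trace-of-Hessian term is the kind of higher-order coefficient the abstract analyzes; penalizing the Jacobian explicitly on corrupted inputs provides an independent handle on such coefficients.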


Polyethism in a colony of artificial ants

arXiv.org Artificial Intelligence

We explore self-organizing strategies for role assignment in a foraging task carried out by a colony of artificial agents. Our strategies are inspired by various mechanisms of division of labor (polyethism) observed in eusocial insects such as ants, termites, and bees. Specifically, we instantiate models of caste polyethism and of age-based (temporal) polyethism to evaluate their benefits to foraging in a dynamic environment. Our experiment is directly related to the exploration/exploitation trade-off in machine learning.
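As a minimal illustration of the two mechanisms (a hypothetical sketch with invented thresholds and switch age, not the paper's model):

    import random

    def caste_role(threshold, food_stimulus):
        # Fixed-threshold (caste) polyethism: an agent forages once the
        # colony-level food stimulus exceeds its innate threshold.
        return "forager" if food_stimulus > threshold else "nurse"

    def temporal_role(age, switch_age=50):
        # Age-based (temporal) polyethism: young agents work inside the
        # nest and switch to foraging as they grow older.
        return "forager" if age >= switch_age else "nurse"

    # Heterogeneous thresholds give a graded response to changing demand,
    # while aging steadily replenishes the forager pool.
    colony = [{"threshold": random.random(), "age": random.randint(0, 100)}
              for _ in range(100)]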


Augmenting Tractable Fragments of Abstract Argumentation

arXiv.org Artificial Intelligence

We present a new and compelling approach to the efficient solution of important computational problems that arise in the context of abstract argumentation. Our approach makes known algorithms defined for restricted fragments generally applicable, at a computational cost that scales with the distance from the fragment. Thus, in a certain sense, we gradually augment tractable fragments. Surprisingly, it turns out that some tractable fragments admit such an augmentation and that others do not. More specifically, we show that the problems of credulous and skeptical acceptance are fixed-parameter tractable when parameterized by the distance from the fragment of acyclic argumentation frameworks. Other tractable fragments such as the fragments of symmetric and bipartite frameworks seem to prohibit an augmentation: the acceptance problems are already intractable for frameworks at distance 1 from the fragments. For our study we use a broad setting and consider several different semantics. For the algorithmic results we utilize recent advances in fixed-parameter tractability.
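To illustrate the parameter (a hypothetical sketch: we assume the distance is the number of arguments whose deletion makes the attack graph acyclic; the paper's formal notion may differ):

    from itertools import combinations

    def is_acyclic(arguments, attacks):
        # Kahn's algorithm on the attack digraph restricted to `arguments`.
        indegree = {a: 0 for a in arguments}
        for u, v in attacks:
            if u in indegree and v in indegree:
                indegree[v] += 1
        queue = [a for a, d in indegree.items() if d == 0]
        seen = 0
        while queue:
            u = queue.pop()
            seen += 1
            for x, v in attacks:
                if x == u and v in indegree:
                    indegree[v] -= 1
                    if indegree[v] == 0:
                        queue.append(v)
        return seen == len(indegree)

    def distance_from_acyclic(arguments, attacks):
        # Smallest number of arguments to delete so that the framework
        # becomes acyclic; brute force, exponential only in the answer k,
        # the parameter the fixed-parameter results are measured in.
        for k in range(len(arguments) + 1):
            for removed in combinations(arguments, k):
                rest = [a for a in arguments if a not in removed]
                if is_acyclic(rest, attacks):
                    return k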


Bayesian inference for queueing networks and modeling of internet services

arXiv.org Machine Learning

Modern Internet services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where each queue models one of the computers in the system. A key challenge is that the data are incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model parameters using Markov chain Monte Carlo. We evaluate our framework on data from a benchmark Web application. We also present a simple technique for selection among nested queueing models. We are unaware of any previous work that considers inference in networks of queues in the presence of missing data.
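A minimal sketch of that deterministic transformation for a single FIFO queue (the paper treats networks of queues; this single-queue recursion is only an illustration): with arrival times a_i and service times s_i, departures satisfy d_i = max(a_i, d_{i-1}) + s_i, and inverting the recursion reads the service times off from arrivals and departures. This correspondence is what lets an MCMC sampler propose latent arrival and service times and score them against the observed data.

    import numpy as np

    def departures_fifo(arrivals, services):
        # Deterministic map from (arrival, service) times to departure
        # times for one single-server FIFO queue.
        d = np.empty(len(arrivals))
        busy_until = 0.0
        for i, (a, s) in enumerate(zip(arrivals, services)):
            busy_until = max(a, busy_until) + s  # wait for job and server
            d[i] = busy_until
        return d

    # The second job arrives during the first job's service and queues:
    print(departures_fifo(np.array([0.0, 0.5]), np.array([1.0, 1.0])))
    # -> [1. 2.]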


Slicing: Nonsingular Estimation of High Dimensional Covariance Matrices Using Multiway Kronecker Delta Covariance Structures

arXiv.org Machine Learning

Nonsingular estimation of high dimensional covariance matrices is an important step in many statistical procedures, such as classification, clustering, variable selection, and feature extraction. After a review of the essential background material, this paper introduces a technique we call slicing for obtaining a nonsingular covariance matrix estimate from high dimensional data. Slicing essentially assumes that the data have a Kronecker delta covariance structure. Finally, we discuss the implications of the results in this paper and provide an example of classification for high dimensional gene expression data.
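One plausible reading of the construction, as a hedged sketch (the pooling scheme below is our illustration, not necessarily the paper's estimator): if the covariance across slices is Kronecker delta, i.e. distinct slices of an observation are independent with a common within-slice covariance, the slices can be pooled, and the pooled estimate of the much smaller within-slice covariance is nonsingular even when the full sample covariance is not.

    import numpy as np

    def slice_covariance(X, n_slices):
        # X: n x p data matrix. Each row is split into n_slices consecutive
        # blocks of length q = p // n_slices, which are pooled as if i.i.d.
        n, p = X.shape
        assert p % n_slices == 0
        q = p // n_slices
        blocks = X.reshape(n * n_slices, q)
        blocks = blocks - blocks.mean(axis=0)
        # n * n_slices pooled samples estimate a q x q covariance, which
        # is nonsingular as soon as n * n_slices > q, even when n << p.
        return blocks.T @ blocks / (n * n_slices - 1)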


Foundations for Uniform Interpolation and Forgetting in Expressive Description Logics

arXiv.org Artificial Intelligence

We study uniform interpolation and forgetting in the description logic ALC. Our main results are model-theoretic characterizations of uniform interpolants and their existence in terms of bisimulations, tight complexity bounds for deciding the existence of uniform interpolants, an approach to computing interpolants when they exist, and tight bounds on their size. We use a mix of model-theoretic and automata-theoretic methods that, as a by-product, also provides characterizations of and decision procedures for conservative extensions.
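A tiny worked example of forgetting (ours, not taken from the paper): forgetting the concept name B from the TBox {A ⊑ B, B ⊑ C} yields {A ⊑ C}, the uniform interpolant over the signature {A, C}; it entails exactly those consequences of the original TBox that mention only A and C.

    \mathcal{T} \;=\; \{\, A \sqsubseteq B,\; B \sqsubseteq C \,\} \qquad\leadsto\qquad \mathcal{T}|_{\{A,C\}} \;=\; \{\, A \sqsubseteq C \,\}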


Kernels for Global Constraints

arXiv.org Artificial Intelligence

Bessiere et al. (AAAI'08) showed that several intractable global constraints can be efficiently propagated when certain natural problem parameters are small. In particular, the complete propagation of a global constraint is fixed-parameter tractable in k, the number of holes in domains, whenever bound consistency can be enforced in polynomial time; this applies to the global constraints AtMost-NValue and Extended Global Cardinality (EGC). In this paper we extend this line of research and introduce the concept of reduction to a problem kernel, a key concept of parameterized complexity, to the field of global constraints. In particular, we show that the consistency problem for AtMost-NValue constraints admits a linear time reduction to an equivalent instance on O(k^2) variables and domain values. This small kernel can be used to speed up the complete propagation of NValue constraints. We contrast this result by showing that the consistency problem for EGC constraints does not admit a reduction to a polynomial problem kernel unless the polynomial hierarchy collapses.
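The parameter itself is straightforward to compute (a sketch assuming the usual reading of a hole: a value missing strictly between the minimum and maximum of a variable's domain):

    def count_holes(domains):
        # k = total number of values missing from the integer interval
        # spanned by each variable's domain.
        k = 0
        for dom in domains:
            vals = sorted(dom)
            k += (vals[-1] - vals[0] + 1) - len(vals)
        return k

    # Example: {1, 2, 4, 7} spans 1..7 but omits 3, 5 and 6, so k = 3.
    print(count_holes([{1, 2, 4, 7}]))  # -> 3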