Europe
Sparsity-Promoting Bayesian Dynamic Linear Models
Caron, François, Bornn, Luke, Doucet, Arnaud
Sparsity-promoting priors have become increasingly popular over recent years due to an increased number of regression and classification applications involving a large number of predictors. In time series applications where observations are collected over time, it is often unrealistic to assume that the underlying sparsity pattern is fixed. We propose here an original class of flexible Bayesian linear models for dynamic sparsity modelling. The proposed class of models expands upon the existing Bayesian literature on sparse regression using generalized multivariate hyperbolic distributions. The properties of the models are explored through both analytic results and simulation studies. We demonstrate the model on a financial application where it is shown that it accurately represents the patterns seen in the analysis of stock and derivative data, and is able to detect major events by filtering an artificial portfolio of assets.
mlpy: Machine Learning Python
Albanese, Davide, Visintainer, Roberto, Merler, Stefano, Riccadonna, Samantha, Jurman, Giuseppe, Furlanello, Cesare
We introduce here mlpy, a library providing access to a wide spectrum of machine learning methods implemented in Python, which has proven to be an effective environment for building scientific oriented tools (Pérez et al., 2011). Although planned for general purpose applications, mlpy has the computational biology in general, and the functional genomics modeling in particular, as the elective application fields. As a major applications example, we use mlpy methods to implement molecular profiling experiments that need to warrant study reproducibility(Ioannidis et al., 2009) and flawless results(Ambroise and McLachlan, 2002). This task requires the availability of highly modular tools allowing the practioners to build an adequate workflow for the task at hand following authoritative guidelines (The MicroArray Quality Control (MAQC) Consortium, 2010). Such workflow involves a complex sequence of steps, both in the development and in the validation phases, starting from the upstream preprocessing algorithms to the downstream predictive analysis, repeated several times to accommodate the resampling schema. The dimension of highthroughtput data involved (thousands of samples described by millions of features) and the large number of replicates needed to control bias effects make also efficiency an essential requirement.
Risk Bounds for CART Classifiers under a Margin Condition
Risk bounds for Classification and Regression Trees (CART, Breiman et. al. 1984) classifiers are obtained under a margin condition in the binary supervised classification framework. These risk bounds are obtained conditionally on the construction of the maximal deep binary tree and permit to prove that the linear penalty used in the CART pruning algorithm is valid under a margin condition. It is also shown that, conditionally on the construction of the maximal tree, the final selection by test sample does not alter dramatically the estimation accuracy of the Bayes classifier. In the two-class classification framework, the risk bounds that are proved, obtained by using penalized model selection, validate the CART algorithm which is used in many data mining applications such as Biology, Medicine or Image Coding.
A Bi-Directional Refinement Algorithm for the Calculus of (Co)Inductive Constructions
Asperti, Andrea, Ricciotti, Wilmer, Coen, Claudio Sacerdoti, Tassi, Enrico
The paper describes the refinement algorithm for the Calculus of (Co)Inductive Constructions (CIC) implemented in the interactive theorem prover Matita. The refinement algorithm is in charge of giving a meaning to the terms, types and proof terms directly written by the user or generated by using tactics, decision procedures or general automation. The terms are written in an "external syntax" meant to be user friendly that allows omission of information, untyped binders and a certain liberal use of user defined sub-typing. The refiner modifies the terms to obtain related well typed terms in the internal syntax understood by the kernel of the ITP. In particular, it acts as a type inference algorithm when all the binders are untyped. The proposed algorithm is bi-directional: given a term in external syntax and a type expected for the term, it propagates as much typing information as possible towards the leaves of the term. Traditional mono-directional algorithms, instead, proceed in a bottom-up way by inferring the type of a sub-term and comparing (unifying) it with the type expected by its context only at the end. We propose some novel bi-directional rules for CIC that are particularly effective. Among the benefits of bi-directionality we have better error message reporting and better inference of dependent types. Moreover, thanks to bi-directionality, the coercion system for sub-typing is more effective and type inference generates simpler unification problems that are more likely to be solved by the inherently incomplete higher order unification algorithms implemented. Finally we introduce in the external syntax the notion of vector of placeholders that enables to omit at once an arbitrary number of arguments. Vectors of placeholders allow a trivial implementation of implicit arguments and greatly simplify the implementation of primitive and simple tactics.
Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
Dulac-Arnold, Gabriel, Denoyer, Ludovic, Preux, Philippe, Gallinari, Patrick
The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollouts-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A2) to O(Alog(A)). We then propose a novel method that profits from the ECOC's coding dictionary to split the initial MDP into O(log(A)) seperate two-action MDPs. This second method reduces learning complexity even further, from O(A2) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance.
Consistency Techniques for Flow-Based Projection-Safe Global Cost Functions in Weighted Constraint Satisfaction
Many combinatorial problems deal with preferences and violations, the goal of which is to find solutions with the minimum cost. Weighted constraint satisfaction is a framework for modeling such problems, which consists of a set of cost functions to measure the degree of violation or preferences of different combinations of variable assignments. Typical solution methods for weighted constraint satisfaction problems (WCSPs) are based on branch-and-bound search, which are made practical through the use of powerful consistency techniques such as AC*, FDAC*, EDAC* to deduce hidden cost information and value pruning during search. These techniques, however, are designed to be efficient only on binary and ternary cost functions which are represented in table form. In tackling many real-life problems, high arity (or global) cost functions are required. We investigate efficient representation scheme and algorithms to bring the benefits of the consistency techniques to also high arity cost functions, which are often derived from hard global constraints from classical constraint satisfaction. The literature suggests some global cost functions can be represented as flow networks, and the minimum cost flow algorithm can be used to compute the minimum costs of such networks in polynomial time. We show that naive adoption of this flow-based algorithmic method for global cost functions can result in a stronger form of null-inverse consistency. We further show how the method can be modified to handle cost projections and extensions to maintain generalized versions of AC* and FDAC* for cost functions with more than two variables. Similar generalization for the stronger EDAC* is less straightforward. We reveal the oscillation problem when enforcing EDAC* on cost functions sharing more than one variable. To avoid oscillation, we propose a weak version of EDAC* and generalize it to weak EDGAC* for non-binary cost functions. Using various benchmarks involving the soft variants of hard global constraints ALLDIFFERENT, GCC, SAME, and REGULAR, empirical results demonstrate that our proposal gives improvements of up to an order of magnitude when compared with the traditional constraint optimization approach, both in terms of time and pruning.
One Decade of Universal Artificial Intelligence
The first decade of this century has seen the nascency of the first mathematical theory of general artificial intelligence. This theory of Universal Artificial Intelligence (UAI) has made significant contributions to many theoretical, philosophical, and practical AI questions. In a series of papers culminating in book (Hutter, 2005), an exciting sound and complete mathematical model for a super intelligent agent (AIXI) has been developed and rigorously analyzed. While nowadays most AI researchers avoid discussing intelligence, the award-winning PhD thesis (Legg, 2008) provided the philosophical embedding and investigated the UAI-based universal measure of rational intelligence, which is formal, objective and non-anthropocentric. Recently, effective approximations of AIXI have been derived and experimentally investigated in JAIR paper (Veness et al. 2011). This practical breakthrough has resulted in some impressive applications, finally muting earlier critique that UAI is only a theory. For the first time, without providing any domain knowledge, the same agent is able to self-adapt to a diverse range of interactive environments. For instance, AIXI is able to learn from scratch to play TicTacToe, Pacman, Kuhn Poker, and other games by trial and error, without even providing the rules of the games. These achievements give new hope that the grand goal of Artificial General Intelligence is not elusive. This article provides an informal overview of UAI in context. It attempts to gently introduce a very theoretical, formal, and mathematical subject, and discusses philosophical and technical ingredients, traits of intelligence, some social questions, and the past and future of UAI.
Exploiting Model Equivalences for Solving Interactive Dynamic Influence Diagrams
We focus on the problem of sequential decision making in partially observable environments shared with other agents of uncertain types having similar or conflicting objectives. This problem has been previously formalized by multiple frameworks one of which is the interactive dynamic influence diagram (I-DID), which generalizes the well-known influence diagram to the multiagent setting. I-DIDs are graphical models and may be used to compute the policy of an agent given its belief over the physical state and others' models, which changes as the agent acts and observes in the multiagent setting. As we may expect, solving I-DIDs is computationally hard. This is predominantly due to the large space of candidate models ascribed to the other agents and its exponential growth over time. We present two methods for reducing the size of the model space and stemming its exponential growth. Both these methods involve aggregating individual models into equivalence classes. Our first method groups together behaviorally equivalent models and selects only those models for updating which will result in predictive behaviors that are distinct from others in the updated model space. The second method further compacts the model space by focusing on portions of the behavioral predictions. Specifically, we cluster actionally equivalent models that prescribe identical actions at a single time step. Exactly identifying the equivalences would require us to solve all models in the initial set. We avoid this by selectively solving some of the models, thereby introducing an approximation. We discuss the error introduced by the approximation, and empirically demonstrate the improved efficiency in solving I-DIDs due to the equivalences.
Counting-Based Search: Branching Heuristics for Constraint Satisfaction Problems
Pesant, G., Quimper, C., Zanarini, A.
Designing a search heuristic for constraint programming that is reliable across problem domains has been an important research topic in recent years. This paper concentrates on one family of candidates: counting-based search. Such heuristics seek to make branching decisions that preserve most of the solutions by determining what proportion of solutions to each individual constraint agree with that decision. Whereas most generic search heuristics in constraint programming rely on local information at the level of the individual variable, our search heuristics are based on more global information at the constraint level. We design several algorithms that are used to count the number of solutions to specific families of constraints and propose some search heuristics exploiting such information. The experimental part of the paper considers eight problem domains ranging from well-established benchmark puzzles to rostering and sport scheduling. An initial empirical analysis identifies heuristic maxSD as a robust candidate among our proposals. We then evaluate the latter against the state of the art, including the latest generic search heuristics, restarts, and discrepancy-based tree traversals. Experimental results show that counting-based search generally outperforms other generic heuristics.
Type-elimination-based reasoning for the description logic SHIQbs using decision diagrams and disjunctive datalog
Rudolph, Sebastian, Krötzsch, Markus, Hitzler, Pascal
We propose a novel, type-elimination-based method for reasoning in the description logic SHIQbs including DL-safe rules. To this end, we first establish a knowledge compilation method converting the terminological part of an ALCIb knowledge base into an ordered binary decision diagram (OBDD) which represents a canonical model. This OBDD can in turn be transformed into disjunctive Datalog and merged with the assertional part of the knowledge base in order to perform combined reasoning. In order to leverage our technique for full SHIQbs, we provide a stepwise reduction from SHIQbs to ALCIb that preserves satisfiability and entailment of positive and negative ground facts. The proposed technique is shown to be worst case optimal w.r.t. combined and data complexity and easily admits extensions with ground conjunctive queries.