Classifier Chains: A Review and Perspectives

arXiv.org Artificial Intelligence

The family of methods collectively known as classifier chains has become a popular approach to multi-label learning problems. This approach involves linking together off-the-shelf binary classifiers in a chain structure, such that class label predictions become features for other classifiers. Such methods have proved flexible and effective and have obtained state-of-the-art empirical performance across many datasets and multi-label evaluation metrics. This performance prompted further study of how exactly the approach works and how it could be improved. Over the past decade, numerous studies have explored the mechanisms of classifier chains on a theoretical level, and many improvements have been made to the training and inference procedures, such that this method remains among the state-of-the-art options for multi-label learning. Given this past and ongoing interest, which covers a broad range of applications and research themes, the goal of this work is to provide a review of classifier chains, a survey of the techniques and extensions provided in the literature, as well as perspectives for this approach in the domain of multi-label classification in the future. We conclude positively, with a number of recommendations for researchers and practitioners, and outline several areas for future research.
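The chaining idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular paper in the review: the `Perceptron` base learner, the class names, and the toy data are all assumptions made for the example. Each classifier in the chain is trained on the input features augmented with the labels of earlier chain positions, and at prediction time the earlier predictions are fed forward.

```python
from typing import List, Sequence


class Perceptron:
    """Tiny binary perceptron; a stand-in for any off-the-shelf base classifier."""

    def __init__(self, epochs: int = 200) -> None:
        self.epochs = epochs
        self.w: List[float] = []
        self.b = 0.0

    def predict_one(self, x: Sequence[float]) -> int:
        return int(sum(w * v for w, v in zip(self.w, x)) + self.b > 0)

    def fit(self, X, y) -> "Perceptron":
        self.w, self.b = [0.0] * len(X[0]), 0.0
        for _ in range(self.epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                err = yi - self.predict_one(xi)  # -1, 0, or +1
                if err:
                    self.w = [w + err * v for w, v in zip(self.w, xi)]
                    self.b += err
                    mistakes += 1
            if mistakes == 0:  # training data fully separated
                break
        return self


class ClassifierChain:
    """One binary classifier per label, linked in a fixed label order."""

    def __init__(self, n_labels: int) -> None:
        self.models = [Perceptron() for _ in range(n_labels)]

    def fit(self, X, Y) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Features for label j = inputs + true labels of positions 0..j-1.
            Xj = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(Xj, [y[j] for y in Y])
        return self

    def predict(self, X) -> List[List[int]]:
        out = []
        for x in X:
            labels: List[int] = []
            for model in self.models:
                # Earlier predictions become features for later classifiers.
                labels.append(model.predict_one(list(x) + labels))
            out.append(labels)
        return out


# Toy data (hypothetical): label 0 is "x0 OR x1", label 1 is "x0 AND x1".
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [1, 0], [1, 0], [1, 1]]
chain = ClassifierChain(n_labels=2).fit(X, Y)
predictions = chain.predict(X)
```

The sketch uses greedy inference along a single fixed label order; the training and inference variants surveyed in the literature differ in exactly these choices.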


Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem

arXiv.org Artificial Intelligence

Model-free reinforcement learning attempts to find an optimal control action for an unknown dynamical system by directly searching over the parameter space of controllers. The convergence behavior and statistical properties of these approaches are often poorly understood because of the nonconvex nature of the underlying optimization problems as well as the lack of exact gradient computation. In this paper, we take a step towards demystifying the performance and efficiency of such methods by focusing on the standard infinite-horizon linear quadratic regulator problem for continuous-time systems with unknown state-space parameters. We establish exponential stability for the ordinary differential equation (ODE) that governs the gradient-flow dynamics over the set of stabilizing feedback gains and show that a similar result holds for the gradient descent method that arises from the forward Euler discretization of the corresponding ODE. We also provide theoretical bounds on the convergence rate and sample complexity of a random search method. Our results demonstrate that the required simulation time for achieving $\epsilon$-accuracy in a model-free setup and the total number of function evaluations both scale as $\log \, (1/\epsilon)$.
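As a rough illustration of the random search idea, the sketch below runs a two-point gradient estimate on a scalar continuous-time LQR problem. Everything specific here is an assumption made for the example: the system dx/dt = a x + b u with a = b = q = r = 1, the step size, the smoothing radius, and the use of the closed-form cost as a stand-in for the simulated cost that a truly model-free method would query.

```python
import math
import random


def lqr_cost(k, a=1.0, b=1.0, q=1.0, r=1.0, x0=1.0):
    """Infinite-horizon cost of the feedback u = -k x for dx/dt = a x + b u.

    With x(t) = x0 * exp((a - b k) t), the integral of q x^2 + r u^2 equals
    (q + r k^2) x0^2 / (2 (b k - a)) whenever the gain is stabilizing.
    """
    closed_loop = a - b * k
    if closed_loop >= 0:
        return math.inf  # gain does not stabilize the system
    return (q + r * k * k) * x0 * x0 / (-2.0 * closed_loop)


def random_search(k0, step=0.5, radius=1e-3, iters=200, seed=0):
    """Gradient descent driven only by cost evaluations (two-point estimates)."""
    rng = random.Random(seed)
    k = k0
    for _ in range(iters):
        u = rng.choice((-1.0, 1.0))  # random direction on the 1-D unit sphere
        diff = lqr_cost(k + radius * u) - lqr_cost(k - radius * u)
        grad_est = diff / (2.0 * radius) * u
        k -= step * grad_est
    return k


# For a = b = q = r = 1 the Riccati equation 2p - p^2 + 1 = 0 gives
# p = 1 + sqrt(2), so the optimal gain is k* = b p / r = 1 + sqrt(2).
k_star = 1.0 + math.sqrt(2.0)
k_found = random_search(k0=3.0)
```

Starting from the stabilizing gain k0 = 3, the iterates stay inside the stabilizing set and approach k*; the initial gain must be stabilizing for the cost to be finite.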


Approximating Weighted and Priced Bribery in Scoring Rules

Journal of Artificial Intelligence Research

The classic Bribery problem is to find a minimal subset of voters who need to change their vote to make some preferred candidate win. Its important generalizations consider voters who are weighted and also have different prices. We provide an approximate solution for these problems for a broad family of scoring rules (which includes Borda, t-approval, and Dowdall), in the following sense: for constant weights and prices, if there exists a strategy which costs Ψ, we efficiently find a strategy which costs at most Ψ + Õ(√Ψ). An extension for non-constant weights and prices is also given. Our algorithm is based on a randomized reduction from these Bribery generalizations to weighted coalitional manipulation (WCM). To solve this WCM instance, we apply the Birkhoff-von Neumann (BvN) decomposition to a fractional manipulation matrix. This allows us to limit the size of the possible ballot search space, reducing it from exponential to polynomial, while still obtaining good approximation guarantees. Finding a solution in the truncated search space yields a new algorithm for WCM, which is of independent interest.
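To make the setting concrete, the sketch below tallies weighted ballots under the three scoring rules named in the abstract and shows how changing one voter's ballot can change the winner. It does not implement the paper's approximation algorithm; the function names, the example election, and the chosen bribe are all hypothetical.

```python
from typing import Dict, List, Sequence


def borda(m: int) -> List[float]:
    """Borda: position i (0-based) among m candidates earns m - 1 - i points."""
    return [float(m - 1 - i) for i in range(m)]


def t_approval(m: int, t: int) -> List[float]:
    """t-approval: the top t positions earn one point each."""
    return [1.0 if i < t else 0.0 for i in range(m)]


def dowdall(m: int) -> List[float]:
    """Dowdall: position i (0-based) earns 1 / (i + 1) points."""
    return [1.0 / (i + 1) for i in range(m)]


def tally(ballots: Sequence[Sequence[str]], weights: Sequence[float],
          vector: Sequence[float]) -> Dict[str, float]:
    """Weighted score of every candidate under the given scoring vector."""
    totals: Dict[str, float] = {}
    for ballot, weight in zip(ballots, weights):
        for position, candidate in enumerate(ballot):
            totals[candidate] = totals.get(candidate, 0.0) + weight * vector[position]
    return totals


def winners(totals: Dict[str, float]) -> List[str]:
    """All candidates tied for the highest score, in alphabetical order."""
    best = max(totals.values())
    return sorted(c for c, s in totals.items() if s == best)


# Hypothetical election: three voters, the first with weight 2.
ballots = [["a", "b", "c"], ["b", "c", "a"], ["c", "b", "a"]]
weights = [2.0, 1.0, 1.0]

borda_winner = winners(tally(ballots, weights, borda(3)))
plurality_winner = winners(tally(ballots, weights, t_approval(3, 1)))
dowdall_winner = winners(tally(ballots, weights, dowdall(3)))

# Bribing the second voter (weight 1) to rank "a" first changes the Borda winner.
bribed = [["a", "b", "c"], ["a", "b", "c"], ["c", "b", "a"]]
borda_winner_after_bribe = winners(tally(bribed, weights, borda(3)))
```

In this toy instance the same ballots elect different candidates under different rules, which is why the bribery cost depends on the scoring rule in use.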


PILS: Exploring high-order neighborhoods by pattern mining and injection

arXiv.org Artificial Intelligence

We introduce pattern injection local search (PILS), an optimization strategy that uses pattern mining to explore high-order local-search neighborhoods, and illustrate its application on the vehicle routing problem. PILS operates by storing a limited number of frequent patterns from elite solutions. During the local search, each pattern is used to define one move in which 1) incompatible edges are disconnected, 2) the edges defined by the pattern are reconnected, and 3) the remaining solution fragments are optimally reconnected. Each such move is accepted only in case of solution improvement. As visible in our experiments, this strategy results in a new paradigm of local search, which complements and enhances classical search approaches in a controllable amount of computational time. We demonstrate that PILS identifies useful high-order moves (e.g., 9-opt and 10-opt) which would otherwise not be found by enumeration, and that it significantly improves the performance of state-of-the-art population-based and neighborhood-centered metaheuristics.


Monte-Carlo Tree Search for Policy Optimization

arXiv.org Artificial Intelligence

Gradient-based methods are often used for policy optimization in deep reinforcement learning, despite being vulnerable to local optima and saddle points. Although gradient-free methods (e.g., genetic algorithms or evolution strategies) help mitigate these issues, poor initialization and local optima are still concerns in highly nonconvex spaces. This paper presents a method for policy optimization based on Monte-Carlo tree search and gradient-free optimization. Our method, called Monte-Carlo tree search for policy optimization (MCTSPO), provides a better exploration-exploitation trade-off through the use of the upper confidence bound heuristic. We demonstrate improved performance on reinforcement learning tasks with deceptive or sparse reward functions compared to popular gradient-based and deep genetic algorithm baselines.
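The upper confidence bound heuristic mentioned above can be sketched as follows. This shows only the standard UCB1 selection rule commonly used in Monte-Carlo tree search, not the MCTSPO method itself; the exploration constant, the `(total_reward, visits)` child representation, and the function names are assumptions made for the example.

```python
import math
from typing import List, Tuple


def ucb1(mean_value: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2.0)) -> float:
    """UCB1 score: exploitation term plus an exploration bonus."""
    if visits == 0:
        return math.inf  # always try unvisited children first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children: List[Tuple[float, int]]) -> int:
    """Return the index of the child with the highest UCB1 score.

    Each child is a (total_reward, visits) pair; the parent visit count is
    taken to be the sum of the children's visits (an assumption of this sketch).
    """
    parent_visits = sum(visits for _, visits in children)

    def score(i: int) -> float:
        total, visits = children[i]
        mean = total / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits)

    return max(range(len(children)), key=score)
```

With few visits the exploration bonus dominates, so a rarely visited child with a lower average reward can still be selected; as visit counts grow, the child with the higher average wins.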


Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices

arXiv.org Machine Learning

The first part of this paper is devoted to the decision-theoretic analysis of random-design linear prediction with square loss. It is known that, under boundedness constraints on the response (and thus regression coefficients), the minimax excess risk scales as $C\sigma^2d/n$ up to constants, where $d$ is the model dimension, $n$ the sample size, and $\sigma^2$ the noise parameter. Here, we study the expected excess risk with respect to the full linear class. We show that the ordinary least squares (OLS) estimator is minimax optimal in the well-specified case, for every distribution of covariates and noise level. Further, we express the minimax risk in terms of the distribution of statistical leverage scores of individual samples. We deduce a precise minimax lower bound of $\sigma^2d/(n-d+1)$, valid for any distribution of covariates, which nearly matches the risk of OLS for Gaussian covariates. We then obtain nonasymptotic upper bounds on the minimax risk for covariates that satisfy a "small ball"-type regularity condition, which scale as $(1+o(1))\sigma^2d/n$ as $d=o(n)$, both in the well-specified and misspecified cases. Our main technical contribution is the study of the lower tail of the smallest singular value of empirical covariance matrices around $0$. We establish a general lower bound on this lower tail, together with a matching upper bound under a necessary regularity condition. Our proof relies on the PAC-Bayesian technique for controlling empirical processes, and extends an analysis of Oliveira (2016) devoted to a different part of the lower tail. Equivalently, our upper bound shows that the operator norm of the inverse sample covariance matrix has bounded $L^q$ norm up to $q\asymp n$, and this exponent is unimprovable. Finally, we show that the regularity condition on the design naturally holds for independent coordinates.


TextNAS: A Neural Architecture Search Space tailored for Text Representation

arXiv.org Machine Learning

Learning text representation is crucial for text classification and other language-related tasks. There is a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most of the existing works on NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior to the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.


Learning Variable Ordering Heuristics for Solving Constraint Satisfaction Problems

arXiv.org Artificial Intelligence

Backtracking search algorithms are often used to solve the Constraint Satisfaction Problem (CSP). The efficiency of backtracking search depends greatly on the variable ordering heuristics. Currently, the most commonly used heuristics are handcrafted based on expert knowledge. In this paper, we propose a deep reinforcement learning based approach to automatically discover new variable ordering heuristics that are better adapted for a given class of CSP instances. We show that directly optimizing the search cost is hard for bootstrapping, and propose to optimize the expected cost of reaching a leaf node in the search tree. To capture the complex relations among the variables and constraints, we design a representation scheme based on a Graph Neural Network that can process CSP instances with different sizes and constraint arities. Experimental results on random CSP instances show that the learned policies outperform classical handcrafted heuristics in terms of minimizing the search tree size, and can effectively generalize to instances that are larger than those used in training. The Constraint Satisfaction Problem (CSP) is one of the most widely studied problems in computer science and artificial intelligence. It provides a common framework for modeling and solving combinatorial problems in many application domains, such as planning and scheduling [1], [2], vehicle routing [3], [4], graph problems [5], [6], and computational biology [7], [8]. A CSP instance involves a set of variables and constraints. To solve it, one needs to find a value assignment for all variables such that all constraints are satisfied, or prove that no such assignment exists. Despite its ubiquitous applications, unfortunately, CSP is well known to be NP-complete in general [9]. To solve CSP efficiently, backtracking search algorithms are often employed, which are exact algorithms with the guarantee that a solution will be found if one exists.
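The role of the variable ordering heuristic in backtracking search can be sketched as below. The example uses a classical handcrafted rule (pick the variable with the fewest remaining consistent values) on a graph-coloring CSP; it is not the learned policy from the paper, and the function names and the toy instances are assumptions. The `order` parameter marks the place where a different heuristic, handcrafted or learned, would be plugged in.

```python
from typing import Callable, Dict, List, Optional

Assignment = Dict[str, int]


def consistent(var: str, value: int, assignment: Assignment,
               neighbors: Dict[str, List[str]]) -> bool:
    """Inequality constraints: no assigned neighbor may share the value."""
    return all(assignment.get(n) != value for n in neighbors[var])


def min_domain(unassigned: List[str], assignment: Assignment,
               domains: Dict[str, List[int]],
               neighbors: Dict[str, List[str]]) -> str:
    """Handcrafted heuristic: choose the variable with the fewest legal values."""
    return min(
        unassigned,
        key=lambda v: sum(consistent(v, x, assignment, neighbors) for x in domains[v]),
    )


def backtrack(domains: Dict[str, List[int]], neighbors: Dict[str, List[str]],
              order: Callable = min_domain,
              assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Exact search: returns a solution if one exists, otherwise None."""
    assignment = {} if assignment is None else assignment
    unassigned = [v for v in domains if v not in assignment]
    if not unassigned:
        return dict(assignment)
    var = order(unassigned, assignment, domains, neighbors)
    for value in domains[var]:
        if consistent(var, value, assignment, neighbors):
            assignment[var] = value
            result = backtrack(domains, neighbors, order, assignment)
            if result is not None:
                return result
            del assignment[var]  # undo and try the next value
    return None


# Hypothetical instance: a triangle A-B-C plus a pendant vertex D attached to C.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
three_colors = {v: [0, 1, 2] for v in neighbors}
two_colors = {v: [0, 1] for v in neighbors}

solution = backtrack(three_colors, neighbors)
no_solution = backtrack(two_colors, neighbors)
```

The heuristic changes only the order in which variables are expanded, so the search remains exact; what changes is the size of the explored search tree.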


A* Search

Communications of the ACM

Originally published in 1968 by Hart, Nilsson, and Raphael [2], the well-known A* search algorithm is a foundational pathfinding algorithm in computer science and artificial intelligence (AI) for traversing trees and graphs. The method provides the optimal path from the initial state to the target goal state, given the use of an admissible heuristic (one that never overestimates the remaining distance to the goal). The A* algorithm is included in nearly all AI textbooks and courses worldwide. Despite its widespread fame, there is no reliably documented evidence as to the origin of the name "A*": What does it really stand for and what does it mean? This Communications Viewpoint answers the question.
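For readers who want to see the algorithm itself, here is a minimal sketch of A* on a 4-connected grid with unit step costs. The grid encoding (0 for free, 1 for blocked), the function name, and the choice of Manhattan distance as the heuristic are assumptions made for the example; Manhattan distance never overestimates the remaining cost on such a grid, so it is admissible.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def a_star(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Shortest path from start to goal on a grid, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell: Cell) -> int:
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_g: Dict[Cell, int] = {start: 0}
    parent: Dict[Cell, Optional[Cell]] = {start: None}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            node: Optional[Cell] = cell
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None


# Hypothetical grid: a wall forces a detour around the right-hand side.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))
blocked = a_star([[0, 1], [1, 0]], (0, 0), (1, 1))
```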


Uncertainty-sensitive Learning and Planning with Ensembles

arXiv.org Artificial Intelligence

We propose a reinforcement learning framework for discrete environments in which an agent makes both strategic and tactical decisions. The former manifests itself through the use of a value function, while the latter is powered by a tree search planner. These tools complement each other. The planning module performs a local what-if analysis, which makes it possible to avoid tactical pitfalls and boost backups of the value function. The value function, being global in nature, compensates for the inherent locality of the planner. In order to further solidify this synergy, we introduce an exploration mechanism with two distinctive components: uncertainty modelling and risk measurement. To model the uncertainty we use value function ensembles, and to reflect risk we propose several functionals that summarize the uncertainty implied by the ensemble. We show that our method performs well on hard exploration environments: Deep-sea, toy Montezuma's Revenge, and Sokoban. In all the cases, we obtain a speed-up in learning and a boost in performance.