Goto

Collaborating Authors

 Europe


Threshold Choice Methods: the Missing Link

arXiv.org Artificial Intelligence

Many performance metrics have been introduced for the evaluation of classification performance, with different origins and niches of application: accuracy, macro-accuracy, area under the ROC curve, the ROC convex hull, the absolute error, and the Brier score (with its decomposition into refinement and calibration). One way of understanding the relation among some of these metrics is the use of variable operating conditions (either in the form of misclassification costs or class proportions). Thus, a metric may correspond to some expected loss over a range of operating conditions. One dimension for the analysis has been precisely the distribution we take for this range of operating conditions, leading to some important connections in the area of proper scoring rules. However, we show that there is another dimension which has not received attention in the analysis of performance metrics. This new dimension is given by the decision rule, which is typically implemented as a threshold choice method when using scoring models. In this paper, we explore many old and new threshold choice methods: fixed, score-uniform, score-driven, rate-driven and optimal, among others. By calculating the loss of these methods for a uniform range of operating conditions we get the 0-1 loss, the absolute error, the Brier score (mean squared error), the AUC and the refinement loss respectively. This provides a comprehensive view of performance metrics as well as a systematic approach to loss minimisation, namely: take a model, apply several threshold choice methods consistent with the information which is (and will be) available about the operating condition, and compare their expected losses. In order to assist in this procedure we also derive several connections between the aforementioned performance metrics, and we highlight the role of calibration in choosing the threshold choice method.


Dynamic trees for streaming and massive data contexts

arXiv.org Machine Learning

Data collection at a massive scale is becoming ubiquitous in a wide variety of settings, from vast offline databases to streaming real-time information. Learning algorithms deployed in such contexts must rely on single-pass inference, where the data history is never revisited. In streaming contexts, learning must also be temporally adaptive to remain up-to-date against unforeseen changes in the data generating mechanism. Although rapidly growing, the online Bayesian inference literature remains challenged by massive data and transient, evolving data streams. Non-parametric modelling techniques can prove particularly ill-suited, as the complexity of the model is allowed to increase with the sample size. In this work, we take steps to overcome these challenges by porting standard streaming techniques, like data discarding and downweighting, into a fully Bayesian framework via the use of informative priors and active learning heuristics. We showcase our methods by augmenting a modern non-parametric modelling framework, dynamic trees, and illustrate its performance on a number of practical examples. The end product is a powerful streaming regression and classification tool, whose performance compares favourably to the state-of-the-art.


Robust Local Search for Solving RCPSP/max with Durational Uncertainty

Journal of Artificial Intelligence Research

Scheduling problems in manufacturing, logistics and project management have frequently been modeled using the framework of Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max). Due to the importance of these problems, providing scalable solution schedules for RCPSP/max problems is a topic of extensive research. However, all existing methods for solving RCPSP/max assume that durations of activities are known with certainty, an assumption that does not hold in real world scheduling problems where unexpected external events such as manpower availability, weather changes, etc. lead to delays or advances in completion of activities. Thus, in this paper, our focus is on providing a scalable method for solving RCPSP/max problems with durational uncertainty. To that end, we introduce the robust local search method consisting of three key ideas: (a) Introducing and studying the properties of two decision rule approximations used to compute start times of activities with respect to dynamic realizations of the durational uncertainty; (b) Deriving the expression for robust makespan of an execution strategy based on decision rule approximations; and (c) A robust local search mechanism to efficiently compute activity execution strategies that are robust against durational uncertainty. Furthermore, we also provide enhancements to local search that exploit temporal dependencies between activities. Our experimental results illustrate that robust local search is able to provide robust execution strategies efficiently.


A metric learning perspective of SVM: on the relation of SVM and LMNN

arXiv.org Machine Learning

Support Vector Machines, SVMs, and the Large Margin Nearest Neighbor algorithm, LMNN, are two very popular learning algorithms with quite different learning biases. In this paper we bring them into a unified view and show that they have a much stronger relation than what is commonly thought. We analyze SVMs from a metric learning perspective and cast them as a metric learning problem, a view which helps us uncover the relations of the two algorithms. We show that LMNN can be seen as learning a set of local SVM-like models in a quadratic space. Along the way and inspired by the metric-based interpretation of SVM s we derive a novel variant of SVMs, epsilon-SVM, to which LMNN is even more similar. We give a unified view of LMNN and the different SVM variants. Finally we provide some preliminary experiments on a number of benchmark datasets in which show that epsilon-SVM compares favorably both with respect to LMNN and SVM.


Adaptive and Optimal Online Linear Regression on L1-balls

arXiv.org Machine Learning

We consider the problem of online linear regression on individual sequences. The goal in this paper is for the forecaster to output sequential predictions which are, after T time rounds, almost as good as the ones output by the best linear predictor in a given L1-ball in R^d. We consider both the cases where the dimension d is small and large relative to the time horizon T. We first present regret bounds with optimal dependencies on the sizes U, X and Y of the L1-ball, the input data and the observations. The minimax regret is shown to exhibit a regime transition around the point d = sqrt(T) U X / (2 Y). Furthermore, we present efficient algorithms that are adaptive, i.e., they do not require the knowledge of U, X, and Y, but still achieve nearly optimal regret bounds.


Learning and Reasoning with Action-Related Places for Robust Mobile Manipulation

Journal of Artificial Intelligence Research

We propose the concept of Action-Related Place (ARPlace) as a powerful and flexible representation of task-related place in the context of mobile manipulation. ARPlace represents robot base locations not as a single position, but rather as a collection of positions, each with an associated probability that the manipulation action will succeed when located there. ARPlaces are generated using a predictive model that is acquired through experience-based learning, and take into account the uncertainty the robot has about its own location and the location of the object to be manipulated. When executing the task, rather than choosing one specific goal position based only on the initial knowledge about the task context, the robot instantiates an ARPlace, and bases its decisions on this ARPlace, which is updated as new information about the task becomes available. To show the advantages of this least-commitment approach, we present a transformational planner that reasons about ARPlaces in order to optimize symbolic plans. Our empirical evaluation demonstrates that using ARPlaces leads to more robust and efficient mobile manipulation in the face of state estimation uncertainty on our simulated robot.


Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis

arXiv.org Artificial Intelligence

We present a technique for the animation of a 3D kinematic tongue model, one component of the talking head of an acoustic-visual (AV) speech synthesizer. The skeletal animation approach is adapted to make use of a deformable rig controlled by tongue motion capture data obtained with electromagnetic articulography (EMA), while the tongue surface is extracted from volumetric magnetic resonance imaging (MRI) data. Initial results are shown and future work outlined.


Dynamic Shared Context Processing in an E-Collaborative Learning Environment

arXiv.org Artificial Intelligence

In this paper, we propose a dynamic shared context processing method based on DSC (Dynamic Shared Context) model, applied in an e-collaborative learning environment. Firstly, we present the model. This is a way to measure the relevance between events and roles in collaborative environments. With this method, we can share the most appropriate event information for each role instead of sharing all information to all roles in a collaborative work environment. Then, we apply and verify this method in our project with Google App supported e-learning collaborative environment. During this experiment, we compared DSC method measured relevance of events and roles to manual measured relevance. And we describe the favorable points from this comparison and our finding. Finally, we discuss our future research of a hybrid DSC method to make dynamical information shared more effective in a collaborative work environment.


Modelling and simulation of complex systems: an approach based on multi-level agents

arXiv.org Artificial Intelligence

A complex system is made up of many components with many interactions. So the design of systems such as simulation systems, cooperative systems or assistance systems includes a very accurate modelling of interactional and communicational levels. The agent-based approach provides an adapted abstraction level for this problem. After having studied the organizational context and communicative capacities of agentbased systems, to simulate the reorganization of a flexible manufacturing, to regulate an urban transport system, and to simulate an epidemic detection system, our thoughts on the interactional level were inspired by human-machine interface models, especially those in "cognitive engineering". To provide a general framework for agent-based complex systems modelling, we then proposed a scale of four behaviours that agents may adopt in their complex systems (reactive, routine, cognitive, and collective). To complete the description of multi-level agent models, which is the focus of this paper, we illustrate our modelling and discuss our ongoing work on each level.


Learning Functions of Few Arbitrary Linear Parameters in High Dimensions

arXiv.org Machine Learning

Let us assume that $f$ is a continuous function defined on the unit ball of $\mathbb R^d$, of the form $f(x) = g (A x)$, where $A$ is a $k \times d$ matrix and $g$ is a function of $k$ variables for $k \ll d$. We are given a budget $m \in \mathbb N$ of possible point evaluations $f(x_i)$, $i=1,...,m$, of $f$, which we are allowed to query in order to construct a uniform approximating function. Under certain smoothness and variation assumptions on the function $g$, and an {\it arbitrary} choice of the matrix $A$, we present in this paper 1. a sampling choice of the points $\{x_i\}$ drawn at random for each function approximation; 2. algorithms (Algorithm 1 and Algorithm 2) for computing the approximating function, whose complexity is at most polynomial in the dimension $d$ and in the number $m$ of points. Due to the arbitrariness of $A$, the choice of the sampling points will be according to suitable random distributions and our results hold with overwhelming probability. Our approach uses tools taken from the {\it compressed sensing} framework, recent Chernoff bounds for sums of positive-semidefinite matrices, and classical stability bounds for invariant subspaces of singular value decompositions.