Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

arXiv.org Machine Learning

We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x,y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate that energy based training of the joint distribution improves calibration, robustness, and out-of-distribution detection while also enabling our models to generate samples rivaling the quality of recent GAN approaches. We improve upon recently proposed techniques for scaling up the training of energy based models and present an approach which adds little overhead compared to standard classification training. Our approach is the first to achieve performance rivaling the state-of-the-art in both generative and discriminative learning within one hybrid model.
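A minimal sketch of this reinterpretation, assuming the classifier outputs a vector of logits f(x) with one entry per class: the softmax of the logits gives p(y|x) as usual, the logit for class y acts as an unnormalized log p(x,y), and the log-sum-exp over all logits acts as an unnormalized log p(x). The function names and example logits are illustrative and are not taken from the authors' code.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def class_probabilities(logits):
    # Standard softmax: p(y|x) = exp(f(x)[y]) / sum_y' exp(f(x)[y']).
    lse = logsumexp(logits)
    return [math.exp(v - lse) for v in logits]

def unnormalized_log_pxy(logits, y):
    # log p(x, y) up to an additive constant: the logit for class y.
    return logits[y]

def unnormalized_log_px(logits):
    # log p(x) up to the same additive constant: marginalize y out of the joint.
    return logsumexp(logits)

# Hypothetical logits for a three-class problem.
logits = [2.0, 0.5, -1.0]
probs = class_probabilities(logits)
```

The unknown normalizing constant cancels in the ratio p(x,y)/p(x), which is why the class probabilities remain exactly computable while p(x) is only known up to a constant.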


Scalable Bayesian Preference Learning for Crowds

arXiv.org Machine Learning

We propose a scalable Bayesian preference learning method for jointly predicting the preferences of individuals as well as the consensus of a crowd from pairwise labels. People's opinions often differ greatly, making it difficult to predict their preferences from small amounts of personal data. Individual biases also make it harder to infer the consensus of a crowd when there are few labels per item. We address these challenges by combining matrix factorisation with Gaussian processes, using a Bayesian approach to account for uncertainty arising from noisy and sparse data. Our method exploits input features, such as text embeddings and user metadata, to predict preferences for new items and users that are not in the training set. As previous solutions based on Gaussian processes do not scale to large numbers of users, items or pairwise labels, we propose a stochastic variational inference approach that limits computational and memory costs. Our experiments on a recommendation task show that our method is competitive with previous approaches despite our scalable inference approximation. We demonstrate the method's scalability on a natural language processing task with thousands of users and items, and show improvements over the state of the art on this task. We make our software publicly available for future work.


IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

arXiv.org Machine Learning

The practical use of reinforcement learning agents is often bottlenecked by training time. To accelerate training, practitioners often turn to distributed reinforcement learning architectures that parallelize the training process. However, modern methods for scalable reinforcement learning (RL) often trade off the throughput of samples that an RL agent can learn from (sample throughput) against the quality of learning from each sample (sample efficiency). In these scalable RL architectures, as one increases sample throughput (i.e., increasing parallelization in IMPALA), sample efficiency drops significantly. To address this, we propose a new distributed reinforcement learning algorithm, IMPACT. IMPACT extends IMPALA with three changes: a target network for stabilizing the surrogate objective, a circular buffer, and truncated importance sampling. In discrete action-space environments, we show that IMPACT attains higher reward and, simultaneously, achieves up to a 30% decrease in training wall-time compared with IMPALA. For continuous control environments, IMPACT trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO.
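Two of the ingredients named above can be illustrated in a few lines. This is a rough sketch under stated assumptions, not the IMPACT implementation: the truncation threshold `rho_max`, the buffer capacity, and the function name are hypothetical, and the abstract does not specify which pair of policies enters the importance ratio.

```python
import math
from collections import deque

def truncated_importance_weight(logp_new, logp_behavior, rho_max=1.0):
    # Importance ratio between the policy being evaluated and the policy that
    # generated the sample, truncated from above at rho_max (assumed value).
    return min(rho_max, math.exp(logp_new - logp_behavior))

# A circular buffer: once full, appending a new batch evicts the oldest one.
buffer = deque(maxlen=3)  # capacity of 3 is an arbitrary illustrative choice
for batch_id in range(5):
    buffer.append(batch_id)

w_clipped = truncated_importance_weight(0.0, -2.0)   # raw ratio exp(2) is truncated
w_unclipped = truncated_importance_weight(-1.0, 0.0)  # raw ratio exp(-1) is kept
```

Truncation bounds the weight of samples that the behavior policy considered unlikely, which limits the variance of the off-policy correction at the cost of some bias.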


Kernel-estimated Nonparametric Overlap-Based Syncytial Clustering

arXiv.org Machine Learning

Commonly-used clustering algorithms usually find ellipsoidal, spherical or other regular-structured clusters, but are more challenged when the underlying groups lack formal structure or definition. Syncytial clustering is the name that we introduce for methods that merge groups obtained from standard clustering algorithms in order to reveal complex group structure in the data. Here, we develop a distribution-free fully-automated syncytial clustering algorithm that can be used with $k$-means and other algorithms. Our approach computes the cumulative distribution function of the normed residuals from an appropriately fit $k$-groups model and calculates the nonparametric overlap between each pair of clusters. Groups with high pairwise overlap are merged as long as the generalized overlap decreases. Our methodology is always a top performer in identifying groups with regular and irregular structures in several datasets and can be applied to datasets with scatter or incomplete records. The approach is also used to identify the distinct kinds of gamma ray bursts in the Burst and Transient Source Experiment 4Br catalog and the distinct kinds of activation in a functional Magnetic Resonance Imaging study.


Classification des Séries Temporelles Incertaines par Transformation Shapelet (Uncertain Time Series Classification by Shapelet Transformation)

arXiv.org Artificial Intelligence

Time series classification is used in a diverse range of domains such as meteorology, medicine and physics. It aims to classify chronological data. Many accurate approaches have been built during the last decade, and shapelet transformation is one of them. However, none of these approaches takes data uncertainty into account. Using uncertainty propagation techniques, we propose a new dissimilarity measure based on Euclidean distance. We also show how to use this new measure to adapt shapelet transformation to uncertain time series classification. An experimental assessment of our contribution is carried out on some state-of-the-art datasets.
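To illustrate what propagating uncertainty through a Euclidean-style distance can look like, here is a small sketch using standard first-order error propagation. It is an assumption-laden illustration, not the dissimilarity measure defined in the paper: each observation is assumed to be a (value, uncertainty) pair, errors are assumed independent, and the squared distance is used to keep the derivative simple.

```python
import math

def uncertain_squared_distance(a, b):
    # a and b are equal-length lists of (value, uncertainty) pairs.
    # Returns the squared Euclidean distance D and its propagated uncertainty.
    # First-order propagation with independent errors:
    #   D = sum (a_i - b_i)^2
    #   var(D) = sum (2 * (a_i - b_i))^2 * (da_i^2 + db_i^2)
    distance = 0.0
    variance = 0.0
    for (av, da), (bv, db) in zip(a, b):
        diff = av - bv
        distance += diff ** 2
        variance += (2.0 * diff) ** 2 * (da ** 2 + db ** 2)
    return distance, math.sqrt(variance)

# Hypothetical two-point series with an uncertainty of 0.1 on every value.
series_a = [(1.0, 0.1), (2.0, 0.1)]
series_b = [(0.0, 0.1), (0.0, 0.1)]
d, d_err = uncertain_squared_distance(series_a, series_b)
```

Returning the uncertainty alongside the distance lets a downstream step, such as ranking shapelet candidates, compare distances while accounting for how reliable each one is.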


Graph Input Representations for Machine Learning Applications in Urban Network Analysis

arXiv.org Artificial Intelligence

Understanding and learning the characteristics of network paths has been of particular interest for decades and has led to several successful applications. Such analysis becomes challenging for urban networks, as their size and complexity are significantly higher than those of other networks. State-of-the-art machine learning (ML) techniques allow us to detect hidden patterns and, thus, infer the features associated with them. However, very little is known about how the choice of input representation affects the performance of such predictive models. In this paper, we design and evaluate six different graph input representations (i.e., representations of the network paths), considering the network's topological and temporal characteristics, for use as inputs to machine learning models that learn the behavior of urban network paths. The representations are validated and then tested on a real-world dataset of taxi journeys, predicting tips using a road network of New York. Our results demonstrate that the input representations that use temporal information help the model to achieve the highest accuracy (RMSE of $1.42).

Keywords: Urban Networks, Graph Learning, Path Representation

1 Introduction

Numerous important problems can be studied using the conceptual and theoretical framework of network science. Several structural and topological properties of networks have been widely studied in recent years ([12, 14, 5, 9]). One of the most basic concepts in network science is the definition of a network path ([3, 2]), i.e., a sequence of edges that joins a sequence of vertices.


Exploration and Coordination of Complementary Multi-Robot Teams In a Hunter and Gatherer Scenario

arXiv.org Artificial Intelligence

This paper considers the problem of dynamic task allocation, where tasks are unknowingly distributed over an environment. We aim to address the multi-robot exploration aspect of the problem while solving the task-allocation aspect. To that end, we first propose a novel nature-inspired approach called "hunter and gatherer". We consider each task to comprise two sequential subtasks, detection and completion, where each subtask can only be carried out by a certain type of agent. Thus, this approach employs two complementary teams of agents: one agile in detecting the tasks (hunters) and another dexterous in completing them (gatherers). Then, we propose a multi-robot exploration algorithm for hunters and a multi-robot task allocation algorithm for gatherers, both distributed and based on innovative notions of "certainty and uncertainty profit margins". Statistical analysis of simulation results confirms the efficacy of the proposed algorithms. In addition, it is statistically proven that the proposed solutions function fairly, i.e., for each type of agent, the overall workload is distributed equally.

I. Introduction

Multi-robot systems are expected to complete tasks that are unfeasible, laborious or inefficient for a single agent to accomplish [1]. Employing multi-robot systems entails addressing various problems in task allocation [2], exploration [3], coordination [4], learning [5], and heterogeneity [6]. Among all these problems, multi-robot task allocation (MRTA), i.e., assigning a group of tasks to individual robots, is one of the most deep-seated problems of multi-robot systems, and its complexity increases considerably with a wide variety of factors. In this regard, an MRTA problem where tasks are unknowingly distributed over an environment needs to be addressed from both the MRTA and the multi-robot exploration perspectives. This problem can become even more complicated if each task is divided into two sequential subtasks and each subtask can only be carried out by a certain type of agent.


Founding The Domain of AI Forensics

arXiv.org Artificial Intelligence

With the widespread integration of AI in everyday and critical technologies, it seems inevitable that we will witness increasing instances of failure in AI systems. In such cases, there arises a need for technical investigations that produce legally acceptable and scientifically indisputable findings and conclusions on the causes of such failures. Inspired by the domain of cyber forensics, this paper introduces the need for the establishment of AI Forensics as a new discipline under AI safety. Furthermore, we propose a taxonomy of the subfields under this discipline, and present a discussion on the foundational challenges that lie ahead of this new research area.

Introduction

Recent advances in Artificial Intelligence (AI) have given rise to the rapidly growing adoption of such techniques by a vast array of industries and technologies.


AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

arXiv.org Artificial Intelligence

With the rise of knowledge graphs (KGs), question answering over knowledge bases (KBQA) has attracted increasing attention in recent years. Although much research has been conducted on this topic, it is still challenging to apply KBQA technology in industry because business knowledge and real-world questions can be rather complicated. In this paper, we present AliMe-KBQA, a bold attempt to apply KBQA in the E-commerce customer service field. To handle real knowledge and questions, we extend the classic "subject-predicate-object (SPO)" structure with property hierarchy, key-value structure and compound value type (CVT), and enhance traditional KBQA with constraint recognition and reasoning ability. We launched AliMe-KBQA in the Marketing Promotion scenario for merchants during the "Double 11" period in 2018 and in other such promotional events afterwards. Online results suggest that AliMe-KBQA not only achieves better resolution and improves customer satisfaction, but has also become the knowledge management method preferred by business knowledge staff, since it offers a more convenient and efficient management experience.


End-to-End Learning of Geometrical Shaping Maximizing Generalized Mutual Information

arXiv.org Artificial Intelligence

GMI-based end-to-end learning is shown to be highly nonconvex. We apply gradient descent initialized with Gray-labeled APSK constellations directly to the constellation coordinates. State-of-the-art constellations in 2D and 4D are found providing reach increases up to 26% w.r.t. QAM.

I. INTRODUCTION

Signal shaping has recently received considerable attention in the literature and is now regarded as a key technique to improve throughput in high-speed fiber-optic systems. Shaping methods can be broadly categorized into probabilistic shaping (PS) and geometric shaping (GS), both having distinct advantages and disadvantages [1]-[3].