A User Study of Perceived Carbon Footprint

arXiv.org Machine Learning

We propose a statistical model to understand people's perception of their carbon footprint. Driven by the observation that few people think of CO2 impact in absolute terms, we design a system to probe people's perception from simple pairwise comparisons of the relative carbon footprint of their actions. The formulation of the model enables us to take an active-learning approach to selecting the pairs of actions that are maximally informative about the model parameters. We define a set of 18 actions and collect a dataset of 2183 comparisons from 176 users on a university campus. The early results reveal promising directions to improve climate communication and enhance climate mitigation.


Adaptive Estimation of Multivariate Piecewise Polynomials and Bounded Variation Functions by Optimal Decision Trees

arXiv.org Machine Learning

Proposed by Donoho (1997), Dyadic CART is a nonparametric regression method which computes a globally optimal dyadic decision tree and fits piecewise constant functions. In this article we define and study Dyadic CART and a closely related estimator, namely Optimal Regression Tree (ORT), in the context of estimating piecewise smooth functions in general dimensions. More precisely, these optimal decision tree estimators fit piecewise polynomials of any given degree. Like Dyadic CART in two dimensions, we reason that these estimators can also be computed in polynomial time in the sample size via dynamic programming. We prove oracle inequalities for the finite sample risk of Dyadic CART and ORT which imply tight risk bounds for several function classes of interest. Firstly, they imply that the finite sample risk of ORT of order $r \geq 0$ is always bounded by $C k \frac{\log N}{N}$ ($N$ is the sample size) whenever the regression function is piecewise polynomial of degree $r$ on some reasonably regular axis aligned rectangular partition of the domain with at most $k$ rectangles. Beyond the univariate case, such guarantees are scarcely available in the literature for computationally efficient estimators. Secondly, our oracle inequalities uncover optimality and adaptivity of the Dyadic CART estimator for function spaces with bounded variation. We consider two function spaces of recent interest where multivariate total variation denoising and univariate trend filtering are the state of the art methods. We show that Dyadic CART enjoys certain advantages over these estimators while still maintaining all their known guarantees.


Discrete and Continuous Deep Residual Learning Over Graphs

arXiv.org Machine Learning

Pedro H.C. Avelar, Anderson R. Tavares, Marco Gori †, Luis C. Lamb

Abstract

In this paper we propose the use of continuous residual modules for graph kernels in Graph Neural Networks. We show how both discrete and continuous residual layers allow for more robust training, where continuous residual layers are those applied by integrating through an Ordinary Differential Equation (ODE) solver to produce their output. We experimentally show that these residual modules achieve better results than non-residual modules when multiple layers are used, mitigating the low-pass filtering effect of GCN-based models. Finally, we apply and analyse the behaviour of these techniques and give pointers to how they can be useful in other domains by allowing more predictable behaviour under dynamic times of computation.

1 Introduction

Graph Neural Networks (GNNs) are a promising framework to combine deep learning models and symbolic reasoning. Whereas conventional deep learning models, such as Convolutional Neural Networks (CNNs), effectively handle data represented in Euclidean space, such as images, GNNs generalise their capabilities to handle non-Euclidean data, such as relational data with complex relationships and interdependencies between entities. Recently, deep learning techniques such as pooling, dynamic times of computation, attention, and adversarial training, which advanced the state of the art in conventional deep learning (e.g. in CNNs), have been investigated in GNNs as well [1, 15, 26, 30]. Discrete residual modules, whose learned kernels are discrete derivatives over their inputs, have proven effective at improving convergence and reducing the parameter space of CNNs, surpassing the state of the art in image classification and other applications [11]. Given their effectiveness, the technique has been applied in many different areas and meta-models of deep learning to improve convergence and reduce the parameter space.
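To make the distinction between discrete and continuous residual layers concrete, the following is a minimal sketch and not the authors' implementation. It assumes scalar node features, a hypothetical `propagate` update (tanh of a weighted neighbourhood mean) standing in for a graph kernel, and a fixed-step Euler integrator standing in for the ODE solver.

```python
import math
from typing import List

def propagate(adj: List[List[int]], h: List[float], weight: float) -> List[float]:
    """Hypothetical graph kernel: tanh of the weighted mean over a node and its neighbours."""
    out = []
    for i, neighbours in enumerate(adj):
        group = [h[i]] + [h[j] for j in neighbours]
        out.append(math.tanh(weight * sum(group) / len(group)))
    return out

def discrete_residual(adj: List[List[int]], h: List[float], weight: float, layers: int) -> List[float]:
    """Stack of residual layers: h <- h + f(h), applied `layers` times."""
    for _ in range(layers):
        f = propagate(adj, h, weight)
        h = [hi + fi for hi, fi in zip(h, f)]
    return h

def continuous_residual(adj: List[List[int]], h: List[float], weight: float,
                        t_end: float, steps: int) -> List[float]:
    """Integrate dh/dt = f(h) from 0 to t_end with fixed-step Euler (an assumed solver)."""
    dt = t_end / steps
    for _ in range(steps):
        f = propagate(adj, h, weight)
        h = [hi + dt * fi for hi, fi in zip(h, f)]
    return h

# Path graph 0 - 1 - 2, given as adjacency lists.
adj = [[1], [0, 2], [1]]
h0 = [1.0, 0.0, -1.0]
coarse = continuous_residual(adj, h0, 0.5, t_end=3.0, steps=3)
fine = continuous_residual(adj, h0, 0.5, t_end=3.0, steps=300)
```

With a unit step size the Euler update coincides with the discrete residual stack, which is one way to read a residual layer as a single integration step; a finer step approximates the same dynamics over the same time horizon.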


Independence Promoted Graph Disentangled Networks

arXiv.org Machine Learning

We address the problem of disentangled representation learning with independent latent factors in graph convolutional networks (GCNs). Current methods usually learn a node representation by describing its neighborhood as a perceptual whole, in a holistic manner, while ignoring the entanglement of the latent factors. However, a real-world graph is formed by the complex interaction of many latent factors (e.g., the same hobby, education or work in a social network), and little effort has been made toward exploring disentangled representation in GCNs. In this paper, we propose a novel Independence Promoted Graph Disentangled Network (IPGDN) to learn disentangled node representations while enhancing the independence among node representations. In particular, we first present disentangled representation learning by a neighborhood routing mechanism, and then employ the Hilbert-Schmidt Independence Criterion (HSIC) to enforce independence between the latent representations, which is effectively integrated into a graph convolutional framework as a regularizer at the output layer. Experimental studies on real-world graphs validate our model and demonstrate that our algorithms outperform the state of the art by a wide margin in different network applications, including semi-supervised graph classification, graph clustering and graph visualization.
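As an illustration of the independence measure named above, here is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH) / (n - 1)^2, where K and L are kernel matrices and H is the centering matrix. The Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions made for this example; the abstract does not specify how IPGDN computes the term.

```python
import math
from typing import List, Sequence

def gaussian_gram(points: Sequence[Sequence[float]], sigma: float = 1.0) -> List[List[float]]:
    """Gram matrix of a Gaussian (RBF) kernel over a list of vectors."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram

def center(gram: List[List[float]]) -> List[List[float]]:
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(xs: Sequence[Sequence[float]], ys: Sequence[Sequence[float]], sigma: float = 1.0) -> float:
    """Biased empirical HSIC between paired samples xs[i] and ys[i]."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")
    kc = center(gaussian_gram(xs, sigma))
    lc = center(gaussian_gram(ys, sigma))
    # tr(K H L H) equals the elementwise product sum of the two centered matrices.
    trace = sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

xs = [[float(i)] for i in range(6)]
constant = [[1.0] for _ in range(6)]
dependent_score = hsic(xs, xs)          # strictly positive
independent_score = hsic(xs, constant)  # zero: a constant carries no information
```

Used as a regularizer, a term like this would be added to the training loss so that lower values (weaker statistical dependence between latent factors) are rewarded.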


Full Characterization of Parikh's Relevance-Sensitive Axiom for Belief Revision

Journal of Artificial Intelligence Research

In this article, the epistemic-entrenchment and partial-meet characterizations of Parikh's relevance-sensitive axiom for belief revision, known as axiom (P), are provided. In short, axiom (P) states that, if a belief set $K$ can be divided into two disjoint compartments, and the new information $\varphi$ relates only to the first compartment, then the revision of $K$ by $\varphi$ should not affect the second compartment. Accordingly, we identify the subclass of epistemic-entrenchment and that of selection-function preorders, inducing AGM revision functions that satisfy axiom (P). Hence, together with the faithful-preorders characterization of (P) that has already been provided, Parikh's axiom is fully characterized in terms of all popular constructive models of Belief Revision. Since the notions of relevance and local change are inherent in almost all intellectual activity, the completion of the constructive view of (P) has a significant impact on many theoretical, as well as applied, domains of Artificial Intelligence.


Fully Bayesian Recurrent Neural Networks for Safe Reinforcement Learning

arXiv.org Machine Learning

Reinforcement Learning (RL) has demonstrated state-of-the-art results in a number of autonomous system applications; however, many of the underlying algorithms rely on black-box predictions. This results in poor explainability of the behaviour of these systems, raising concerns as to their use in safety-critical applications. Recent work has demonstrated that uncertainty-aware models exhibit more cautious behaviours through the incorporation of model uncertainty estimates. In this work, we build on Probabilistic Backpropagation to introduce a fully Bayesian Recurrent Neural Network architecture. We apply this within a Safe RL scenario, and demonstrate that the proposed method significantly outperforms a popular approach for obtaining model uncertainties in collision avoidance tasks. Furthermore, we demonstrate that the proposed approach requires less training and is far more efficient than the current leading method, both in terms of compute resource and memory footprint.


Network Intrusion Detection based on LSTM and Feature Embedding

arXiv.org Machine Learning

The growing number of network devices and services has led to increasing demand for protective measures, as hackers launch attacks to paralyze or steal information from victim systems. An Intrusion Detection System (IDS) is one of the essential elements of network perimeter security; it detects attacks by inspecting network traffic packets or operating system logs. While existing works have demonstrated the effectiveness of various machine learning techniques, only a few of them have utilized the time-series information of network traffic data. Also, categorical information has not been included in neural network based approaches. In this paper, we propose network intrusion detection models based on sequential information, using a long short-term memory (LSTM) network, and on categorical information, using the embedding technique. We evaluated the models on UNSW-NB15, which is a comprehensive network traffic dataset. The experimental results confirm that the proposed method improves performance, achieving a binary classification accuracy of 99.72%.


Federated Learning for Ranking Browser History Suggestions

arXiv.org Machine Learning

Federated Learning is a new subfield of machine learning that allows fitting models without collecting the training data itself. Instead of sharing data, users collaboratively train a model by only sending weight updates to a server. To improve the ranking of suggestions in the Firefox URL bar, we make use of Federated Learning to train a model on user interactions in a privacy-preserving way. This trained model replaces a handcrafted heuristic, and our results show that users now type over half a character less to find what they are looking for. To be able to deploy our system to real users without degrading their experience during training, we design the optimization process to be robust. To this end, we use a variant of Rprop for optimization, and implement additional safeguards. By using a numerical gradient approximation technique, our system is able to optimize anything in Firefox that is currently based on handcrafted heuristics. Our paper shows that Federated Learning can be used successfully to train models in privacy-respecting ways.
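The two optimization ingredients mentioned above, a sign-based Rprop update and a numerical gradient approximation, can be sketched as follows. This is the textbook Rprop without weight backtracking combined with central finite differences, not the specific variant or safeguards used in the paper; the function names, hyperparameter values, and the toy quadratic loss are all assumptions for illustration, and the federated aggregation of client updates is left out.

```python
from typing import Callable, List, Sequence

def numerical_gradient(loss: Callable[[List[float]], float],
                       w: Sequence[float], eps: float = 1e-4) -> List[float]:
    """Central finite-difference estimate of the gradient of `loss` at `w`."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        down = list(w)
        up[i] += eps
        down[i] -= eps
        grad.append((loss(up) - loss(down)) / (2.0 * eps))
    return grad

def rprop_minimize(loss: Callable[[List[float]], float], w0: Sequence[float],
                   iterations: int = 200, step_init: float = 0.1,
                   step_min: float = 1e-6, step_max: float = 1.0,
                   grow: float = 1.2, shrink: float = 0.5) -> List[float]:
    """Rprop: per-weight step sizes adapted from gradient signs only."""
    w = list(w0)
    steps = [step_init] * len(w)
    prev_grad = [0.0] * len(w)
    for _ in range(iterations):
        grad = numerical_gradient(loss, w)
        for i, g in enumerate(grad):
            agreement = g * prev_grad[i]
            if agreement > 0:      # same sign as last time: accelerate
                steps[i] = min(steps[i] * grow, step_max)
            elif agreement < 0:    # sign flipped: we overshot, slow down
                steps[i] = max(steps[i] * shrink, step_min)
            # Move against the gradient sign; the magnitude of g is ignored.
            if g > 0:
                w[i] -= steps[i]
            elif g < 0:
                w[i] += steps[i]
            prev_grad[i] = g
    return w

def toy_loss(w: List[float]) -> float:
    """Hypothetical stand-in for a ranking loss, minimized at (3, -1)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

weights = rprop_minimize(toy_loss, [0.0, 0.0])
```

Because the update depends only on the sign of each gradient component, with step sizes clamped to `[step_min, step_max]`, a single noisy or extreme update cannot move the weights arbitrarily far, which is one plausible reason such a scheme suits training on live users.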


Comprehensive decision-strategy space exploration for efficient territorial planning strategies

arXiv.org Artificial Intelligence

Olivier Billaud 1, Maxence Soubeyrand 1, Sandra Luque 1 and Maxime Lenormand 1, †

1 TETIS, Univ Montpellier, AgroParisTech, Cirad, CNRS, Irstea, Montpellier, France

Multi-Criteria Decision Analysis (MCDA) is a well-known decision support tool that can be used in a wide variety of contexts. It is particularly useful for territorial planning in situations where several actors with different, and sometimes contradictory, points of view have to take a decision regarding land use development. While the impact of the weights used to represent the relative importance of criteria has been widely studied in the recent literature, the impact of order weights determination has rarely been investigated. This paper presents a spatial sensitivity analysis to assess the impact of order weights determination in Multi-Criteria Analysis by Ordered Weighted Averaging. We propose a methodology based on an efficient exploration of the decision-strategy space defined by the level of risk and tradeoff in the decision process. We illustrate our approach with a land use planning process in the South of France. The objective is to find suitable areas for urban development while preserving green areas and their associated ecosystem services. The ecosystem service approach has the potential to widen the scope of traditional landscape-ecological planning by including ecosystem-based benefits, including social and economic benefits, green infrastructures and biophysical parameters, in urban and territorial planning. We show that in this particular case the decision-strategy space can be divided into four clusters. Each of them is associated with a map summarizing the average spatial suitability distribution used to identify potential areas for urban development.
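For readers unfamiliar with Ordered Weighted Averaging, the sketch below shows the standard operator: order weights are attached to rank positions of the sorted criterion scores rather than to the criteria themselves. It also computes Yager's orness, a common summary of where a weight vector sits between a pure minimum and a pure maximum; treating it as a proxy for the "level of risk" mentioned above is an assumption, as are the function names and the example scores.

```python
from typing import List, Sequence

def owa(values: Sequence[float], order_weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights apply to ranks, highest value first."""
    if len(values) != len(order_weights):
        raise ValueError("values and order_weights must have the same length")
    if abs(sum(order_weights) - 1.0) > 1e-9:
        raise ValueError("order_weights must sum to 1")
    ranked = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(order_weights, ranked))

def orness(order_weights: Sequence[float]) -> float:
    """Yager's orness: 1.0 for a pure max (OR-like), 0.0 for a pure min (AND-like)."""
    n = len(order_weights)
    if n < 2:
        raise ValueError("need at least two order weights")
    return sum((n - 1 - i) * w for i, w in enumerate(order_weights)) / (n - 1)

# Hypothetical suitability scores of one map cell under three criteria.
scores: List[float] = [0.2, 0.9, 0.5]

optimistic = owa(scores, [1.0, 0.0, 0.0])        # best criterion only
pessimistic = owa(scores, [0.0, 0.0, 1.0])       # worst criterion only
balanced = owa(scores, [1 / 3, 1 / 3, 1 / 3])    # plain average
```

Sweeping the order weights between these extremes and recomputing the suitability of every cell is one way to explore a decision-strategy space of the kind the abstract describes.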


Autoencoding undirected molecular graphs with neural networks

arXiv.org Machine Learning

We propose a machine learning model, inspired by language modeling from natural language processing, which can automatically correct molecules in discrete representations using a structure rule learned from a collection of undirected molecular graphs. Using discrete representations of molecules allows cheap, fast, and coarse-grained insights. We introduce an adaptation of a modern neural network architecture, the Transformer, which can learn relationships between atoms and bonds. The algorithm thereby solves the unsupervised task of recovering partially observed molecules represented as undirected graphs. This is, to our knowledge, the first work that can automatically learn any discrete molecular structure rule with input exclusively consisting of a training set of molecules. In this work the neural network successfully approximates the octet rule and relations in hypervalent molecules and ions when trained on the ZINC and QM9 datasets. These results provide encouraging evidence that neural networks can learn advanced molecular structure rules and dataset-specific properties, as the Transformer surpasses a strong octet-rule baseline.