Kusters, Remy
Neural-based classification rule learning for sequential data
Collery, Marine, Bonnard, Philippe, Fages, François, Kusters, Remy
Discovering interpretable patterns for classification of sequential data is of key importance for a variety of fields, ranging from genomics to fraud detection or, more generally, interpretable decision-making. In this paper, we propose a novel differentiable, fully interpretable method to discover both local and global patterns. It consists of a convolutional binary neural network with an interpretable neural filter and a training strategy based on dynamically enforced sparsity. We demonstrate the validity and usefulness of the approach on synthetic datasets and on an open-source peptides dataset. Key to this end-to-end differentiable method is that the expressive patterns used in the rules are learned alongside the rules themselves. During the last decades, machine learning, and in particular neural networks, have made tremendous progress on classification tasks in a variety of fields such as healthcare, fraud detection or entertainment. They are able to learn from various data types ranging from images to time series and achieve impressive classification accuracy. However, their decisions are difficult or impossible for a human to understand.
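The distinction between local and global sequential patterns can be made concrete with a toy sketch. The code below is illustrative only (it is not the paper's architecture): a binary "filter" slides over a one-hot-encoded sequence, a match at any position counts as a global pattern, and a match at one fixed position counts as a local pattern. The alphabet, sequence, and function names are all hypothetical.

```python
import numpy as np

def one_hot(seq, alphabet="ACGT"):
    """One-hot encode a sequence over a small alphabet."""
    idx = {c: i for i, c in enumerate(alphabet)}
    x = np.zeros((len(seq), len(alphabet)))
    for i, c in enumerate(seq):
        x[i, idx[c]] = 1.0
    return x

def pattern_matches(x, pattern):
    """Slide a binary filter over the sequence; a window matches when
    every position agrees with the pattern (convolution-style check)."""
    k = pattern.shape[0]
    scores = np.array([np.sum(x[i:i + k] * pattern)
                       for i in range(len(x) - k + 1)])
    return scores == pattern.sum()

x = one_hot("ACGTAC")
p = one_hot("GTA")           # the pattern "GTA" as a binary filter
m = pattern_matches(x, p)
global_hit = bool(m.any())   # global: pattern occurs anywhere
local_hit = bool(m[2])       # local: pattern occurs at position 2
```

A differentiable version would replace the exact-match test with a relaxed (e.g. sigmoid-thresholded) score so the filter weights can be learned by gradient descent.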
Differentiable Rule Induction with Learned Relational Features
Kusters, Remy, Kim, Yusik, Collery, Marine, Marie, Christian de Sainte, Gupta, Shubham
Rule-based decision models are attractive due to their interpretability. However, existing rule induction methods often result in long and consequently less interpretable sets of rules. This problem can, in many cases, be attributed to the rule learner's lack of appropriately expressive vocabulary, i.e., relevant predicates. Most existing rule induction algorithms presume the availability of the predicates used to represent the rules, naturally decoupling the predicate definition and rule learning phases. In contrast, we propose the Relational Rule Network (RRN), a neural architecture that learns relational predicates representing a linear relationship among attributes, along with the rules that use them. This approach opens the door to increasing the expressiveness of induced decision models by coupling predicate learning directly with rule learning in an end-to-end differentiable fashion. On benchmark tasks, we show that these relational predicates are simple enough to retain interpretability, yet improve prediction accuracy and yield sets of rules that are more concise than those of state-of-the-art rule induction algorithms.
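The core idea of a learnable relational predicate can be sketched in a few lines. This is a minimal illustration under assumed form, not the RRN code: a predicate is a linear threshold over attributes, p(x) = step(w·x − b), relaxed to a sigmoid so that w and b can be learned by gradient descent; here plain logistic-loss gradient descent recovers a hypothetical ground-truth threshold x₁ + 2x₂ > 5.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true, b_true = np.array([1.0, 2.0]), 5.0
X = rng.uniform(0, 5, size=(500, 2))
y = (X @ w_true > b_true).astype(float)   # labels from the true predicate

# Learn the relaxed predicate sigmoid(w.x - b) by gradient descent.
w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(2000):
    z = X @ w - b
    p = 1.0 / (1.0 + np.exp(-z))          # sigmoid relaxation of step()
    g = p - y                              # logistic-loss gradient signal
    w -= lr * X.T @ g / len(X)
    b += lr * g.mean()

pred = (X @ w - b > 0)                     # hard predicate at test time
acc = (pred == (y > 0.5)).mean()
```

After training, the hard threshold recovers the labelling of the true predicate with high accuracy, while (w, b) remain directly readable as a linear relation among attributes.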
Discovering PDEs from Multiple Experiments
Tod, Georges, Both, Gert-Jan, Kusters, Remy
Automated model discovery of partial differential equations (PDEs) usually considers a single experiment or dataset to infer the underlying governing equations. In practice, experiments have inherent natural variability in parameters, initial and boundary conditions that cannot simply be averaged out. We introduce a randomised adaptive group Lasso sparsity estimator to promote grouped sparsity and implement it in a deep-learning-based PDE discovery framework. This creates a learning bias encoding the a priori assumption that all experiments can be explained by the same underlying PDE terms with potentially different coefficients. Our experimental results show that this grouped sparsity promotion, rather than independent model discovery per dataset, finds more generalizable PDEs from multiple highly noisy datasets.
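The grouped-sparsity mechanism can be illustrated with the group-Lasso proximal step (an assumed, simplified form, not the paper's full randomised adaptive estimator). Stack the coefficient vectors so that row j holds the coefficient of PDE term j across all experiments; block soft-thresholding then zeroes an entire row, i.e. removes a term from every experiment at once, when its joint norm is small.

```python
import numpy as np

def group_soft_threshold(Xi, lam):
    """Block soft-thresholding: shrink each row (one PDE term across
    all experiments) by its l2 norm; rows with norm < lam vanish."""
    norms = np.linalg.norm(Xi, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return Xi * scale

# 3 candidate PDE terms x 4 experiments (made-up numbers): term 1 is
# weak in every experiment and should be dropped as a group.
Xi = np.array([[1.00,  1.20,  0.90,  1.10],
               [0.05, -0.03,  0.02,  0.04],
               [-0.80, -0.70, -0.90, -0.75]])
Xi_sparse = group_soft_threshold(Xi, 0.2)
```

Terms 0 and 2 survive (with slightly shrunken, experiment-specific coefficients), while term 1 is eliminated jointly — the "same terms, different coefficients" bias the abstract describes.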
Sparsistent Model Discovery
Tod, Georges, Both, Gert-Jan, Kusters, Remy
Discovering the partial differential equations underlying spatio-temporal datasets from very limited observations is of paramount interest in many scientific fields. However, it remains an open question when model discovery algorithms based on sparse regression can actually recover the underlying physical processes. We trace the poor performance of Lasso-based model discovery algorithms back to their potential variable-selection inconsistency: even if the true model is present in the library, it might not be selected. By first revisiting the irrepresentability condition (IRC) of the Lasso, we gain insight into when this might occur. We then show that the adaptive Lasso is more likely to satisfy the IRC than the Lasso, and propose to integrate it within a deep learning model discovery framework with stability selection and error control. Experimental results show that we can recover several nonlinear and chaotic canonical PDEs with a single set of hyperparameters from a very limited number of samples at high noise levels.
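A minimal adaptive-Lasso sketch shows the reweighting idea (illustrative only, not the paper's full framework): each coefficient's penalty is weighted by the inverse of an initial least-squares estimate, so terms that already look large are penalised less, which improves the chances of satisfying the IRC. The solver below is plain ISTA on a synthetic sparse-recovery problem; problem sizes and the regularisation strength are arbitrary choices.

```python
import numpy as np

def adaptive_lasso(A, b, lam=0.1, n_iter=5000):
    """Adaptive Lasso via ISTA: penalty weights 1/|beta_OLS|."""
    beta_ols, *_ = np.linalg.lstsq(A, b, rcond=None)
    w = 1.0 / (np.abs(beta_ols) + 1e-8)      # adaptive weights
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ beta - b)
        z = beta - grad / L
        # weighted soft-thresholding step
        beta = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    return beta

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))                 # library of candidate terms
beta_true = np.array([1.5, 0.0, -2.0, 0.0, 0.0])
b = A @ beta_true + 0.01 * rng.normal(size=200)
beta = adaptive_lasso(A, b)
support = np.flatnonzero(np.abs(beta) > 1e-3)
```

The near-zero OLS estimates of the inactive terms give them very large penalty weights, so they are driven exactly to zero while the two active terms are recovered with negligible shrinkage bias.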
Fully differentiable model discovery
Both, Gert-Jan, Kusters, Remy
Model discovery aims at autonomously discovering differential equations underlying a dataset. Approaches based on Physics Informed Neural Networks (PINNs) have shown great promise, but a fully-differentiable model which explicitly learns the equation has remained elusive. In this paper we propose such an approach by combining neural network based surrogates with Sparse Bayesian Learning (SBL). We start by reinterpreting PINNs as multitask models, applying multitask learning using uncertainty, and show that this leads to a natural framework for including Bayesian regression techniques. We then construct a robust model discovery algorithm by using SBL, which we showcase on various datasets. Concurrently, the multitask approach allows the use of probabilistic approximators, and we show a proof of concept using normalizing flows to directly learn a density model from single particle data. Our work expands PINNs to various types of neural network architectures, and connects neural network-based surrogates to the rich field of Bayesian parameter inference.
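The "multitask learning using uncertainty" step can be sketched with the standard uncertainty-weighted loss (assumed here as the mechanism the abstract alludes to; the paper's exact formulation may differ): each task loss L_i is scaled by a learned log-variance s_i, so noisier tasks are automatically down-weighted while the s_i term prevents all weights from collapsing to zero.

```python
import math

def multitask_loss(losses, log_vars):
    """Uncertainty-weighted multitask loss:
    L = sum_i exp(-s_i) * L_i + s_i, with s_i a learned log-variance."""
    return sum(math.exp(-s) * L + s for L, s in zip(losses, log_vars))

# Two tasks (e.g. data fit and physics residual, both hypothetical):
# the second task's higher log-variance down-weights its raw loss.
L_total = multitask_loss([0.5, 2.0], [0.0, 1.0])
```

In a PINN read as a multitask model, the data-fitting loss and the physics-residual loss would each get their own s_i, providing the natural opening for Bayesian regression techniques that the abstract mentions.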
Model discovery in the sparse sampling regime
Both, Gert-Jan, Tod, Georges, Kusters, Remy
To improve the physical understanding and predictions of complex dynamical systems, such as ocean dynamics and weather, it is of paramount interest to identify interpretable models from coarsely and off-grid sampled observations. In this work we investigate how deep learning can improve model discovery of partial differential equations when the spacing between sensors is large and the samples are not placed on a grid. We show that leveraging physics-informed neural network interpolation and automatic differentiation gives a better fit of the data and its spatiotemporal derivatives than more classic spline interpolation and numerical differentiation techniques. As a result, deep-learning-based model discovery can recover the underlying equations even when sensors are placed further apart than the data's characteristic length scale and in the presence of high noise levels. We illustrate our claims on both synthetic and experimental datasets where combinations of physical processes such as (non)linear advection, reaction and diffusion are correctly identified. Mathematical models are central to modelling complex dynamical processes such as climate change, the spread of an epidemic or the design of aircraft.
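The surrogate-plus-differentiation idea can be shown on a toy scale (hand-rolled here, not the authors' PINN code): a tiny tanh network u(x) with arbitrary fixed weights is differentiated exactly via the chain rule, as automatic differentiation would do, instead of numerically from sparse samples; a finite-difference probe confirms the analytic derivative.

```python
import numpy as np

rng = np.random.default_rng(0)
# One-hidden-layer tanh "surrogate" with 8 units (weights arbitrary).
W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=(8, 1))
W2 = rng.normal(size=(1, 8))

def u(x):
    """Network output u(x) for scalar x."""
    return (W2 @ np.tanh(W1 * x + b1)).item()

def du_dx(x):
    """Exact derivative via the chain rule: d tanh(z)/dz = 1 - tanh^2(z)."""
    h = np.tanh(W1 * x + b1)
    return (W2 @ ((1 - h ** 2) * W1)).item()

x0 = 0.3
fd = (u(x0 + 1e-6) - u(x0 - 1e-6)) / 2e-6   # central finite difference
```

Because the surrogate is smooth and globally defined, its derivatives are available at any point, including between widely spaced sensors, which is what makes off-grid model discovery feasible.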
DeepMoD: Deep learning for Model Discovery in noisy data
Both, Gert-Jan, Choudhury, Subham, Sens, Pierre, Kusters, Remy
Institut Curie, PSL Research University, CNRS UMR168, Paris, France
We introduce DeepMoD, a deep-learning-based model discovery algorithm which seeks the partial differential equation underlying a spatiotemporal data set. DeepMoD employs sparse regression on a library of basis functions and their corresponding spatial derivatives. A feed-forward neural network approximates the data set, and automatic differentiation is used to construct this function library and perform regression within the neural network. This construction makes it extremely robust to noise and applicable to small data sets and, contrary to other deep learning methods, does not require a training set and is impervious to overfitting. We illustrate this approach on several physical problems, such as the Burgers', Korteweg-de Vries, advection-diffusion and Keller-Segel equations. This resilience to noise and high performance at very few samples highlights the potential of the method to be applied to experimental data. The increasing ability to generate large amounts of data from complex physical, biological, chemical and social systems is beginning to transform quantitative science. Model discovery then turns into finding a sparse representation of the coefficient vector ξ.
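The sparse-regression step over a candidate library can be sketched with sequentially-thresholded least squares (a simplified stand-in for the pipeline described above, not the DeepMoD implementation): given samples of u_t and a library Θ of candidate terms, repeatedly fit, zero out small coefficients, and refit on the surviving terms to recover a sparse coefficient vector ξ. The library columns and coefficients below are synthetic stand-ins.

```python
import numpy as np

def stlsq(Theta, ut, threshold=0.05, n_iter=10):
    """Sequentially-thresholded least squares for sparse regression."""
    xi, *_ = np.linalg.lstsq(Theta, ut, rcond=None)
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():  # refit only on the surviving library terms
            xi[big], *_ = np.linalg.lstsq(Theta[:, big], ut, rcond=None)
    return xi

rng = np.random.default_rng(2)
# Columns stand in for [1, u, u_x, u_xx, u*u_x] evaluated at 300 points.
Theta = rng.normal(size=(300, 5))
Theta[:, 0] = 1.0
xi_true = np.array([0.0, 0.0, 0.0, 0.1, -1.0])  # u_t = 0.1*u_xx - u*u_x
ut = Theta @ xi_true + 0.001 * rng.normal(size=300)
xi = stlsq(Theta, ut)
```

The fit recovers the two active terms and their coefficients while zeroing the rest — the sparse representation of ξ that the model discovery problem reduces to.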