Lemaire, Vincent
Deep Reinforcement Learning based Triggering Function for Early Classifiers of Time Series
Renault, Aurélien, Bondu, Alexis, Cornuéjols, Antoine, Lemaire, Vincent
Early Classification of Time Series (ECTS) has been recognized as an important problem in many areas where decisions have to be taken as soon as possible, before the full data availability, while time pressure increases. Numerous ECTS approaches have been proposed, based on different triggering functions, each taking into account various pieces of information related to the incoming time series and/or the output of a classifier. Although their performances have been empirically compared in the literature, no studies have been carried out on the optimality of these triggering functions that involve ``man-tailored'' decision rules. Based on the same information, could there be better triggering functions? This paper presents one way to investigate this question by showing first how to translate ECTS problems into Reinforcement Learning (RL) ones, where the very same information is used in the state space. A thorough comparison of the performance obtained by ``handmade'' approaches and their ``RL-based'' counterparts has been carried out. A second question investigated in this paper is whether a different combination of information, defining the state space in RL systems, can achieve even better performance. Experiments show that the system we describe, called \textsc{Alert}, significantly outperforms its state-of-the-art competitors on a large number of datasets.
A new Input Convex Neural Network with application to options pricing
Lemaire, Vincent, Pagès, Gilles, Yeo, Christian
We introduce a new class of neural networks designed to be convex functions of their inputs, leveraging the principle that any convex function can be represented as the supremum of the affine functions it dominates. These neural networks, inherently convex with respect to their inputs, are particularly well-suited for approximating the prices of options with convex payoffs. We detail the architecture of this, and establish theoretical convergence bounds that validate its approximation capabilities. We also introduce a \emph{scrambling} phase to improve the training of these networks. Finally, we demonstrate numerically the effectiveness of these networks in estimating prices for three types of options with convex payoffs: Basket, Bermudan, and Swing options.
Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark
George, Thomas, Nodet, Pierre, Bondu, Alexis, Lemaire, Vincent
Mislabeled examples are ubiquitous in real-world machine learning datasets, advocating the development of techniques for automatic detection. We show that most mislabeled detection methods can be viewed as probing trained machine learning models using a few core principles. We formalize a modular framework that encompasses these methods, parameterized by only 4 building blocks, as well as a Python library that demonstrates that these principles can actually be implemented. The focus is on classifier-agnostic concepts, with an emphasis on adapting methods developed for deep learning models to non-deep classifiers for tabular data. We benchmark existing methods on (artificial) Completely At Random (NCAR) as well as (realistic) Not At Random (NNAR) labeling noise from a variety of tasks with imperfect labeling rules. This benchmark provides new insights as well as limitations of existing methods in this setup.
Constructing Variables Using Classifiers as an Aid to Regression: An Empirical Assessment
Troisemaine, Colin, Lemaire, Vincent
This paper proposes a method for the automatic creation of variables (in the case of regression) that complement the information contained in the initial input vector. The method works as a pre-processing step in which the continuous values of the variable to be regressed are discretized into a set of intervals which are then used to define value thresholds. Then classifiers are trained to predict whether the value to be regressed is less than or equal to each of these thresholds. The different outputs of the classifiers are then concatenated in the form of an additional vector of variables that enriches the initial vector of the regression problem. The implemented system can thus be considered as a generic pre-processing tool. We tested the proposed enrichment method with 5 types of regressors and evaluated it in 33 regression datasets. Our experimental results confirm the interest of the approach.
An analysis of the noise schedule for score-based generative models
Strasman, Stanislas, Ocello, Antonio, Boyer, Claire, Corff, Sylvain Le, Lemaire, Vincent
Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. All existing results have been obtained so far for time-homogeneous speed of the noise schedule. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Assuming that the score is Lipschitz continuous, we provide an improved error bound in Wasserstein distance, taking advantage of favourable underlying contraction mechanisms. We also propose an algorithm to automatically tune the noise schedule using the proposed upper bound. We illustrate empirically the performance of the noise schedule optimization in comparison to standard choices in the literature.
Viewing the process of generating counterfactuals as a source of knowledge
Lemaire, Vincent, Boudec, Nathan Le, Guyomard, Victor, Fessant, Françoise
There are now many explainable AI methods for understanding the decisions of a machine learning model. Among these are those based on counterfactual reasoning, which involve simulating features changes and observing the impact on the prediction. This article proposes to view this simulation process as a source of creating a certain amount of knowledge that can be stored to be used, later, in different ways. This process is illustrated in the additive model and, more specifically, in the case of the naive Bayes classifier, whose interesting properties for this purpose are shown.
A Practical Approach to Novel Class Discovery in Tabular Data
Troisemaine, Colin, Reiffers-Masson, Alexandre, Gosselin, Stéphane, Lemaire, Vincent, Vaton, Sandrine
The problem of Novel Class Discovery (NCD) consists in extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes. While NCD has recently received a lot of attention from the community, it is often solved on computer vision problems and under unrealistic conditions. In particular, the number of novel classes is usually assumed to be known in advance, and their labels are sometimes used to tune hyperparameters. Methods that rely on these assumptions are not applicable in real-world scenarios. In this work, we focus on solving NCD in tabular data when no prior knowledge of the novel classes is available. To this end, we propose to tune the hyperparameters of NCD methods by adapting the $k$-fold cross-validation process and hiding some of the known classes in each fold. Since we have found that methods with too many hyperparameters are likely to overfit these hidden classes, we define a simple deep NCD model. This method is composed of only the essential elements necessary for the NCD problem and performs impressively well under realistic conditions. Furthermore, we find that the latent space of this method can be used to reliably estimate the number of novel classes. Additionally, we adapt two unsupervised clustering algorithms ($k$-means and Spectral Clustering) to leverage the knowledge of the known classes. Extensive experiments are conducted on 7 tabular datasets and demonstrate the effectiveness of the proposed method and hyperparameter tuning process, and show that the NCD problem can be solved without relying on knowledge from the novel classes.
Evidential uncertainties on rich labels for active learning
Hoarau, Arthur, Lemaire, Vincent, Martin, Arnaud, Dubois, Jean-Christophe, Gall, Yolande Le
Recent research in active learning, and more precisely in uncertainty sampling, has focused on the decomposition of model uncertainty into reducible and irreducible uncertainties. In this paper, we propose to simplify the computational phase and remove the dependence on observations, but more importantly to take into account the uncertainty already present in the labels, \emph{i.e.} the uncertainty of the oracles. Two strategies are proposed, sampling by Klir uncertainty, which addresses the exploration-exploitation problem, and sampling by evidential epistemic uncertainty, which extends the reducible uncertainty to the evidential framework, both using the theory of belief functions.
biquality-learn: a Python library for Biquality Learning
Nodet, Pierre, Lemaire, Vincent, Bondu, Alexis, Cornuéjols, Antoine
The democratization of Data Mining has been widely successful thanks in part to powerful and easy-to-use Machine Learning libraries. These libraries have been particularly tailored to tackle Supervised Learning. However, strong supervision signals are scarce in practice, and practitioners must resort to weak supervision. In addition to weaknesses of supervision, dataset shifts are another kind of phenomenon that occurs when deploying machine learning models in the real world. That is why Biquality Learning has been proposed as a machine learning framework to design algorithms capable of handling multiple weaknesses of supervision and dataset shifts without assumptions on their nature and level by relying on the availability of a small trusted dataset composed of cleanly labeled and representative samples. Thus we propose biquality-learn: a Python library for Biquality Learning with an intuitive and consistent API to learn machine learning models from biquality data, with well-proven algorithms, accessible and easy to use for everyone, and enabling researchers to experiment in a reproducible way on biquality data.
Automatic Feature Engineering for Time Series Classification: Evaluation and Discussion
Renault, Aurélien, Bondu, Alexis, Lemaire, Vincent, Gay, Dominique
Time Series Classification (TSC) has received much attention in the past two decades and is still a crucial and challenging problem in data science and knowledge engineering. Indeed, along with the increasing availability of time series data, many TSC algorithms have been suggested by the research community in the literature. Besides state-of-the-art methods based on similarity measures, intervals, shapelets, dictionaries, deep learning methods or hybrid ensemble methods, several tools for extracting unsupervised informative summary statistics, aka features, from time series have been designed in the recent years. Originally designed for descriptive analysis and visualization of time series with informative and interpretable features, very few of these feature engineering tools have been benchmarked for TSC problems and compared with state-of-the-art TSC algorithms in terms of predictive performance. In this article, we aim at filling this gap and propose a simple TSC process to evaluate the potential predictive performance of the feature sets obtained with existing feature engineering tools. Thus, we present an empirical study of 11 feature engineering tools branched with 9 supervised classifiers over 112 time series data sets. The analysis of the results of more than 10000 learning experiments indicate that feature-based methods perform as accurately as current state-of-the-art TSC algorithms, and thus should rightfully be considered further in the TSC literature.