Noiry, Nathan
Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials
Perrin, Valentine, Noiry, Nathan, Loiseau, Nicolas, Nowak, Alex
Non-significant randomized controlled trials can hide subgroups of good responders to experimental drugs, thus hindering subsequent development. Identifying such heterogeneous treatment effects is key for precision medicine, and many post-hoc analysis methods have been developed for that purpose. While several benchmarks have been carried out to identify the strengths and weaknesses of these methods, notably for binary and continuous endpoints, a similar systematic empirical evaluation of subgroup analysis for time-to-event endpoints is lacking. This work aims to fill this gap by evaluating several subgroup analysis algorithms in the context of time-to-event outcomes, by means of three research questions: Is there heterogeneity? Which biomarkers are responsible for such heterogeneity? Who are the good responders to treatment? In this context, we propose a new synthetic and semi-synthetic data generation process that allows one to explore a wide range of heterogeneity scenarios with precise control over the level of heterogeneity. We provide an open-source Python package, available on GitHub, containing our generation process and our comprehensive benchmark framework. We hope this package will be useful to the research community for future investigations of heterogeneity of treatment effects and for the benchmarking of subgroup analysis methods.
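As a rough illustration of this kind of generation process (a minimal sketch under simplifying assumptions, not the package's actual API), one can draw exponential event times whose hazard depends on a treatment-by-biomarker interaction, so that only biomarker-positive patients respond to treatment:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Covariates: one predictive biomarker and one noise feature.
biomarker = rng.binomial(1, 0.5, size=n)   # subgroup indicator
noise = rng.normal(size=n)
treatment = rng.binomial(1, 0.5, size=n)   # 1:1 randomization

# Hazard: the treatment only benefits biomarker-positive patients,
# with a log hazard ratio of -1 in that subgroup (illustrative values).
baseline_hazard = 0.1
log_hr = -1.0 * treatment * biomarker      # heterogeneous effect
hazard = baseline_hazard * np.exp(log_hr)

# Exponential event times plus independent uniform censoring.
event_time = rng.exponential(1.0 / hazard)
censor_time = rng.uniform(0, 30, size=n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)
```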
A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification
Colombo, Pierre, Noiry, Nathan, Staerman, Guillaume, Piantanida, Pablo
One of the pursued objectives of deep learning is to provide tools that learn abstract representations of reality from the observation of multiple contextual situations. More precisely, one wishes to extract disentangled representations which are (i) low dimensional and (ii) whose components are independent and correspond to concepts capturing the essence of the objects under consideration (Locatello et al., 2019b). One step towards this ambitious project consists in learning disentangled representations with respect to a predefined (sensitive) attribute, e.g., the gender or age of the writer. Perhaps one of the main applications of such disentangled representations is fair classification. Existing methods extract the last layer of a neural network trained with a loss composed of a cross-entropy objective and a disentanglement regularizer. In this work, we adopt an information-theoretic view of this problem, which motivates a novel family of regularizers minimizing the mutual information between the latent representation and the sensitive attribute conditional on the target. The resulting set of losses, called CLINIC, is parameter-free and is thus easier and faster to train. CLINIC losses are studied through extensive numerical experiments involving the training of over 2k neural networks. We demonstrate that our methods offer a better disentanglement/accuracy trade-off than previous techniques, and generalize better than training with the cross-entropy loss alone, provided that the disentanglement task is not too constraining.
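Schematically, and in our own notation rather than the paper's, the training objective combines a classification term with a conditional mutual information penalty (λ denotes a trade-off weight; the chain rule for mutual information gives a convenient decomposition of the regularizer):

```latex
\mathcal{L}(\theta) \;=\; \mathrm{CE}\big(p_\theta(Y \mid Z),\, Y\big) \;+\; \lambda\, I(Z; S \mid Y),
\qquad
I(Z; S \mid Y) \;=\; I(Z; S, Y) \;-\; I(Z; Y).
```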
Toward Stronger Textual Attack Detectors
Colombo, Pierre, Picot, Marine, Noiry, Nathan, Staerman, Guillaume, Piantanida, Pablo
The landscape of available textual adversarial attacks keeps growing, posing severe threats and raising concerns regarding the integrity of deep NLP systems. However, the crucial problem of defending against malicious attacks has so far drawn comparatively little attention from the NLP community, even though it is instrumental in developing robust and trustworthy systems. This paper makes two important contributions in this line of research: (i) we introduce LAROUSSE, a new framework to detect textual adversarial attacks, and (ii) we introduce STAKEOUT, a new benchmark composed of nine popular attack methods, three datasets, and two pre-trained models. LAROUSSE is ready to use in production as it is unsupervised, hyperparameter-free, and non-differentiable, protecting it against gradient-based methods. Our new benchmark STAKEOUT allows for a robust evaluation framework: we conduct extensive numerical experiments which demonstrate that LAROUSSE outperforms previous methods and which allow us to identify interesting factors behind detection rate variations.
A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection
Dadalto, Eduardo, Colombo, Pierre, Staerman, Guillaume, Noiry, Nathan, Piantanida, Pablo
A key feature of out-of-distribution (OOD) detection is to exploit a trained neural network by extracting statistical patterns and relationships through the multi-layer classifier to detect shifts in the expected input data distribution. Despite achieving solid results, several state-of-the-art methods rely on the penultimate or last layer outputs only, leaving behind valuable information for OOD detection. Methods that explore the multiple layers either require a special architecture or a supervised objective to do so. This work adopts an original approach based on a functional view of the network that exploits samples' trajectories through the various layers and their statistical dependencies. It goes beyond multivariate feature aggregation and introduces a baseline rooted in functional anomaly detection. In this new framework, OOD detection translates into detecting samples whose trajectories differ from the typical behavior characterized by the training set. We validate our method and empirically demonstrate its effectiveness in OOD detection compared to strong state-of-the-art baselines on computer vision benchmarks.
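A toy rendering of this functional view (our simplification, not the paper's estimator): reduce each layer's activation to a scalar so that every sample traces a curve across depth, then flag samples whose curves deviate from the typical training curve:

```python
import numpy as np

def layer_trajectory(activations):
    """Reduce each layer's activation to a scalar (here its L2 norm),
    yielding a 1-D 'curve' indexed by layer depth."""
    return np.array([np.linalg.norm(a) for a in activations])

def fit_reference(train_trajectories):
    """Summarize training trajectories by their pointwise mean and std."""
    T = np.stack(train_trajectories)
    return T.mean(axis=0), T.std(axis=0) + 1e-8

def ood_score(trajectory, mean, std):
    """Score how far a sample's curve deviates from typical behavior;
    larger values suggest the sample is out-of-distribution."""
    return np.abs((trajectory - mean) / std).mean()
```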
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Himmi, Anas, Irurozki, Ekhine, Noiry, Nathan, Clemencon, Stephan, Colombo, Pierre
The evaluation of natural language processing (NLP) systems is crucial for advancing the field, but current benchmarking approaches often assume that all systems have scores available for all tasks, which is not always practical. In reality, several factors such as the cost of running baselines, private systems, computational limitations, or incomplete data may prevent some systems from being evaluated on entire tasks. This paper formalizes an existing problem in NLP research: benchmarking when some systems' scores are missing on a task, and proposes a novel approach to address it. Our method uses a compatible partial ranking approach to impute the missing data, which is then aggregated using the Borda count method. It includes two refinements designed specifically for scenarios where either task-level or instance-level scores are available. We also introduce an extended benchmark containing over 131 million scores, an order of magnitude larger than existing benchmarks. We validate our methods and demonstrate their effectiveness in addressing the challenge of systems missing evaluations on entire tasks. This work highlights the need for more comprehensive benchmarking approaches that can handle real-world scenarios where not all systems are evaluated on every task.
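For intuition, a plain Borda count over complete per-task rankings looks as follows (the paper's partial-ranking imputation step, which handles the missing entries, is omitted from this sketch):

```python
def borda_aggregate(score_table):
    """Aggregate per-task scores into one ranking via Borda count.

    score_table: dict task -> {system: score}; every task must be
    complete here (the partial-ranking imputation is omitted)."""
    systems = sorted({s for scores in score_table.values() for s in scores})
    points = {s: 0 for s in systems}
    for scores in score_table.values():
        # Rank systems on this task; the best system earns the most points.
        ranked = sorted(scores, key=scores.get)
        for pts, system in enumerate(ranked):
            points[system] += pts
    return sorted(systems, key=points.get, reverse=True)

table = {
    "task_a": {"sys1": 0.91, "sys2": 0.88, "sys3": 0.70},
    "task_b": {"sys1": 0.40, "sys2": 0.55, "sys3": 0.52},
}
print(borda_aggregate(table))  # -> ['sys2', 'sys1', 'sys3']
```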
Beyond Mahalanobis-Based Scores for Textual OOD Detection
Colombo, Pierre, Gomes, Eduardo D. C., Staerman, Guillaume, Noiry, Nathan, Piantanida, Pablo
Deep learning methods have boosted the adoption of NLP systems in real-life applications. However, they turn out to be vulnerable to distribution shifts over time, which may cause severe dysfunctions in production systems, urging practitioners to develop tools to detect out-of-distribution (OOD) samples through the lens of the neural network. In this paper, we introduce TRUSTED, a new OOD detector for classifiers based on Transformer architectures that meets operational requirements: it is unsupervised and fast to compute. The efficiency of TRUSTED relies on the fruitful idea that all hidden layers carry relevant information to detect OOD examples. Based on this, for a given input, TRUSTED consists in (i) aggregating this information and (ii) computing a similarity score by exploiting the training distribution, leveraging the powerful concept of data depth. Our extensive numerical experiments involve 51k model configurations, including various checkpoints, seeds, and datasets, and demonstrate that TRUSTED achieves state-of-the-art performance. In particular, it improves on previous AUROC scores by over 3 points.
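As a rough sketch of the depth-based scoring step (assuming an integrated rank-weighted depth approximated by random projections; an illustration only, not the paper's exact implementation), applied to features aggregated across hidden layers:

```python
import numpy as np

def irw_depth(x, train_feats, n_proj=100, rng=None):
    """Approximate the integrated rank-weighted depth of x w.r.t.
    train_feats: average, over random directions u, how central the
    projection <x, u> is among the projected training points."""
    rng = rng or np.random.default_rng(0)
    d = train_feats.shape[1]
    depths = []
    for _ in range(n_proj):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        proj_train = train_feats @ u
        proj_x = x @ u
        frac = (proj_train <= proj_x).mean()
        depths.append(min(frac, 1 - frac))
    return float(np.mean(depths))  # low depth => likely OOD
```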
What are the best systems? New perspectives on NLP Benchmarking
Colombo, Pierre, Noiry, Nathan, Irurozki, Ekhine, Clemencon, Stephan
In Machine Learning, a benchmark refers to an ensemble of datasets associated with one or multiple metrics, together with a way to aggregate different systems' performances. Benchmarks are instrumental in (i) assessing the progress of new methods along different axes and (ii) selecting the best systems for practical use. This is particularly the case for NLP with the development of large pre-trained models (e.g. GPT, BERT) that are expected to generalize well on a variety of tasks. While the community has mainly focused on developing new datasets and metrics, there has been little interest in the aggregation procedure, which is often reduced to a simple average over various performance measures. However, this procedure can be problematic when the metrics are on different scales, which may lead to spurious conclusions. This paper proposes a new procedure to rank systems based on their performance across different tasks. Motivated by social choice theory, the final system ordering is obtained by aggregating the rankings induced by each task and is theoretically grounded. We conduct extensive numerical experiments (on over 270k scores) to assess the soundness of our approach on both synthetic and real scores (e.g. GLUE, EXTREM, SEVAL, TAC, FLICKR). In particular, we show that our method yields different conclusions on state-of-the-art systems than the mean-aggregation procedure, while being both more reliable and robust.
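The pitfall of mean aggregation, and the contrast with rank-based aggregation, can be seen on a toy example (hypothetical scores; Borda count is used here as a stand-in for the aggregation rule):

```python
import numpy as np

# Three tasks; task2's metric lives on a much larger scale.
scores = {
    "sys_a": {"task1": 0.92, "task2": 10.0, "task3": 0.85},
    "sys_b": {"task1": 0.90, "task2": 30.0, "task3": 0.80},
}
tasks = ["task1", "task2", "task3"]

# Mean aggregation: the large-scale metric dominates the average.
mean_agg = {s: np.mean([t[k] for k in tasks]) for s, t in scores.items()}
print(max(mean_agg, key=mean_agg.get))   # -> sys_b

# Rank aggregation: each task only contributes an ordering.
borda = {s: 0 for s in scores}
for k in tasks:
    for pts, s in enumerate(sorted(scores, key=lambda s: scores[s][k])):
        borda[s] += pts
print(max(borda, key=borda.get))         # -> sys_a (wins 2 of 3 tasks)
```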
Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics
Limnios, Myrto, Noiry, Nathan, Clémençon, Stéphan
The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations exhibit the same behavior, while a presumably small proportion of them are abnormal. Detecting the latter, defined as outliers, is one of the major challenges for machine learning applications (e.g. in fraud detection or in predictive maintenance). In this paper, we propose a methodology addressing the problem of outlier detection by learning a data-driven scoring function, defined on the feature space, which reflects the degree of abnormality of the observations. This scoring function is learnt through a well-designed binary classification problem whose empirical criterion takes the form of a two-sample linear rank statistic, for which theoretical results are available. We illustrate our methodology with preliminary, encouraging numerical experiments.
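For concreteness, here is a toy version of such an empirical criterion with the identity score-generating function, which yields a Mann-Whitney-type statistic; learning then amounts to maximizing this quantity over a class of scoring functions:

```python
import numpy as np
from scipy.stats import rankdata

def two_sample_rank_stat(scores_normal, scores_abnormal, phi=lambda u: u):
    """Empirical two-sample linear rank statistic: the normalized ranks
    of the abnormal sample's scores within the pooled sample, transformed
    by a score-generating function phi (identity gives a Mann-Whitney /
    AUC-like criterion)."""
    pooled = np.concatenate([scores_normal, scores_abnormal])
    ranks = rankdata(pooled) / len(pooled)          # normalized ranks
    return phi(ranks[len(scores_normal):]).mean()   # abnormal part only
```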
Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm
Noiry, Nathan, Sentenac, Flore, Perchet, Vianney
Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d. but instead follow fixed degree distributions -- the so-called configuration model. We estimate the competitive ratio of the simplest algorithm, GREEDY, by approximating the relevant stochastic discrete processes by their continuous counterparts, which are solutions of an explicit system of partial differential equations. This technique gives precise bounds on the estimation errors, holding with arbitrarily high probability as the problem size increases. In particular, it allows a formal comparison between different configuration models. We also prove that, quite surprisingly, GREEDY can have better performance guarantees than RANKING, another celebrated online matching algorithm that usually outperforms the former.
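For reference, GREEDY in a generic online bipartite setting can be sketched as follows (an illustration only; the paper's contribution concerns its analysis on the configuration model):

```python
import random

def greedy_online_matching(online_neighbors, rng=None):
    """GREEDY: each arriving online vertex is matched to a uniformly
    random currently-unmatched offline neighbor, if any.

    online_neighbors: list of lists; entry t holds the offline
    neighbors of the t-th arriving vertex."""
    rng = rng or random.Random(0)
    matched = set()
    matching = []
    for t, neighbors in enumerate(online_neighbors):
        free = [v for v in neighbors if v not in matched]
        if free:
            v = rng.choice(free)
            matched.add(v)
            matching.append((t, v))
    return matching

# Three arriving vertices with offline neighborhoods {0,1}, {0}, {1,2}.
print(greedy_online_matching([[0, 1], [0], [1, 2]]))
```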