AITopics | oob

Collaborating Authors

oob

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimistic optimization of a Brownian

Jean-Bastien Grill, Michal Valko, Remi Munos

Neural Information Processing SystemsFeb-14-2026, 04:59:00 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, brownian motion, sample complexity, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Optimistic optimization of a Brownian

Jean-Bastien Grill, Michal Valko, Remi Munos

Neural Information Processing SystemsNov-20-2025, 19:21:41 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, brownian motion, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Add feedback

Bootstrapped Control Limits for Score-Based Concept Drift Control Charts

Wu, Jiezhong, Apley, Daniel W.

arXiv.org Machine LearningJul-23-2025

Monitoring for changes in a predictive relationship represented by a fitted supervised learning model (aka concept drift detection) is a widespread problem, e.g., for retrospective analysis to determine whether the predictive relationship was stable over the training data, for prospective analysis to determine when it is time to update the predictive model, for quality control of processes whose behavior can be characterized by a predictive relationship, etc. A general and powerful Fisher score-based concept drift approach has recently been proposed, in which concept drift detection reduces to detecting changes in the mean of the model's score vector using a multivariate exponentially weighted moving average (MEWMA). To implement the approach, the initial data must be split into two subsets. The first subset serves as the training sample to which the model is fit, and the second subset serves as an out-of-sample test set from which the MEWMA control limit (CL) is determined. In this paper, we develop a novel bootstrap procedure for computing the CL. Our bootstrap CL provides much more accurate control of false-alarm rate, especially when the sample size and/or false-alarm rate is small. It also allows the entire initial sample to be used for training, resulting in a more accurate fitted supervised learning model. We show that a standard nested bootstrap (inner loop accounting for future data variability and outer loop accounting for training sample variability) substantially underestimates variability and develop a 632-like correction that appropriately accounts for this. We demonstrate the advantages with numerical examples.

artificial intelligence, machine learning, predictive relationship, (17 more...)

arXiv.org Machine Learning

2507.16749

Country: North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

A Central Limit Theorem for the permutation importance measure

Föge, Nico, Schmid, Lena, Ditzhaus, Marc, Pauly, Markus

arXiv.org Machine LearningDec-17-2024

Random Forests have become a widely used tool in machine learning since their introduction in 2001, known for their strong performance in classification and regression tasks. One key feature of Random Forests is the Random Forest Permutation Importance Measure (RFPIM), an internal, non-parametric measure of variable importance. While widely used, theoretical work on RFPIM is sparse, and most research has focused on empirical findings. However, recent progress has been made, such as establishing consistency of the RFPIM, although a mathematical analysis of its asymptotic distribution is still missing. In this paper, we provide a formal proof of a Central Limit Theorem for RFPIM using U-Statistics theory. Our approach deviates from the conventional Random Forest model by assuming a random number of trees and imposing conditions on the regression functions and error terms, which must be bounded and additive, respectively. Our result aims at improving the theoretical understanding of RFPIM rather than conducting comprehensive hypothesis testing. However, our contributions provide a solid foundation and demonstrate the potential for future work to extend to practical applications which we also highlight with a small simulation study.

artificial intelligence, central limit theorem, machine learning, (16 more...)

arXiv.org Machine Learning

2412.1302

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback

Topological data analysis of human vowels: Persistent homologies across representation spaces

Bonafos, Guillem, Freyermuth, Jean-Marc, Pudlo, Pierre, Tronçon, Samuel, Rey, Arnaud

arXiv.org Machine LearningOct-10-2023

Topological Data Analysis (TDA) has been successfully used for various tasks in signal/image processing, from visualization to supervised/unsupervised classification. Often, topological characteristics are obtained from persistent homology theory. The standard TDA pipeline starts from the raw signal data or a representation of it. Then, it consists in building a multiscale topological structure on the top of the data using a pre-specified filtration, and finally to compute the topological signature to be further exploited. The commonly used topological signature is a persistent diagram (or transformations of it). Current research discusses the consequences of the many ways to exploit topological signatures, much less often the choice of the filtration, but to the best of our knowledge, the choice of the representation of a signal has not been the subject of any study yet. This paper attempts to provide some answers on the latter problem. To this end, we collected real audio data and built a comparative study to assess the quality of the discriminant information of the topological signatures extracted from three different representation spaces. Each audio signal is represented as i) an embedding of observed data in a higher dimensional space using Taken's representation, ii) a spectrogram viewed as a surface in a 3D ambient space, iii) the set of spectrogram's zeroes. From vowel audio recordings, we use topological signature for three prediction problems: speaker gender, vowel type, and individual. We show that topologically-augmented random forest improves the Out-of-Bag Error (OOB) over solely based Mel-Frequency Cepstral Coefficients (MFCC) for the last two problems. Our results also suggest that the topological information extracted from different signal representations is complementary, and that spectrogram's zeros offers the best improvement for gender prediction.

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Machine Learning

2310.06508

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Media (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)

Add feedback

MDA for random forests: inconsistency, and a practical solution via the Sobol-MDA

Bénard, Clément, da Veiga, Sébastien, Scornet, Erwan

arXiv.org Machine LearningFeb-26-2021

Variable importance measures are the main tools to analyze the black-box mechanism of random forests. Although the Mean Decrease Accuracy (MDA) is widely accepted as the most efficient variable importance measure for random forests, little is known about its theoretical properties. In fact, the exact MDA definition varies across the main random forest software. In this article, our objective is to rigorously analyze the behavior of the main MDA implementations. Consequently, we mathematically formalize the various implemented MDA algorithms, and then establish their limits when the sample size increases. In particular, we break down these limits in three components: the first two are related to Sobol indices, which are well-defined measures of a variable contribution to the output variance, widely used in the sensitivity analysis field, as opposed to the third term, whose value increases with dependence within input variables. Thus, we theoretically demonstrate that the MDA does not target the right quantity when inputs are dependent, a fact that has already been noticed experimentally. To address this issue, we define a new importance measure for random forests, the Sobol-MDA, which fixes the flaws of the original MDA. We prove the consistency of the Sobol-MDA and show its good empirical performance through experiments on both simulated and real data. An open source implementation in R and C++ is available online.

assumption, mda, random forest, (16 more...)

arXiv.org Machine Learning

2102.13347

Country:

North America > United States > New York (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Genre:

Overview (0.67)
Research Report (0.49)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Transportation > Air (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

From unbiased MDI Feature Importance to Explainable AI for Trees

Loecher, Markus

arXiv.org Machine LearningMar-26-2020

We attempt to give a unifying view of the various recent attempts to (i) improve the interpretability of tree-based models and (ii) debias the the default variable-importance measure in random Forests, Gini importance. In particular, we demonstrate a common thread among the out-of-bag based bias correction methods and their connection to local explanation for trees. In addition, we point out a bias caused by the inclusion of inbag data in the newly developed explainable AI for trees algorithms.

contribution, feature contribution, oob, (12 more...)

arXiv.org Machine Learning

2003.12043

Country:

Europe > Germany > Berlin (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.61)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.61)

Add feedback

Unbiased variable importance for random forests

Loecher, Markus

arXiv.org Machine LearningMar-4-2020

The default variable-importance measure in random Forests, Gini importance, has been shown to suffer from the bias of the underlying Gini-gain splitting criterion. While the alternative permutation importance is generally accepted as a reliable measure of variable importance, it is also computationally demanding and suffers from other shortcomings. We propose a simple solution to the misleading/untrustworthy Gini importance which can be viewed as an overfitting problem: we compute the loss reduction on the out-of-bag instead of the in-bag training samples.

importance measure, impurity, oob, (15 more...)

arXiv.org Machine Learning

2003.02106

Country: Europe > Germany > Berlin (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight

Blukis, Valts, Terme, Yannick, Niklasson, Eyvind, Knepper, Ross A., Artzi, Yoav

arXiv.org Artificial IntelligenceOct-21-2019

Abstract: We propose a joint simulation and real-world learning framework for mapping navigation instructions and raw first-person observations to continuous control. Our model estimates the need for environment exploration, predicts the likelihood of visiting environment positions during execution, and controls the agent to both explore and visit high-likelihood positions. We introduce Supervised Reinforcement Asynchronous Learning (SuReAL). Learning uses both simulation and real environments without requiring autonomous flight in the physical environment during training, and combines supervised learning for predicting positions to visit and reinforcement learning for continuous control. We evaluate our approach on a natural language instruction-following task with a physical quad-copter, and demonstrate effective execution and exploration behavior.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1910.09664

Country:

North America > United States (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Air (0.34)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Optimistic optimization of a Brownian

Grill, Jean-Bastien, Valko, Michal, Munos, Rémi

arXiv.org Machine LearningJan-15-2019

We address the problem of optimizing a Brownian motion. We consider a (random) realization $W$ of a Brownian motion with input space in $[0,1]$. Given $W$, our goal is to return an $\epsilon$-approximation of its maximum using the smallest possible number of function evaluations, the sample complexity of the algorithm. We provide an algorithm with sample complexity of order $\log^2(1/\epsilon)$. This improves over previous results of Al-Mharmah and Calvin (1996) and Calvin et al. (2017) which provided only polynomial rates. Our algorithm is adaptive---each query depends on previous values---and is an instance of the optimism-in-the-face-of-uncertainty principle.

algorithm, brownian motion, sample complexity, (15 more...)

arXiv.org Machine Learning

1901.04884

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback