AITopics | identification

Collaborating Authors

identification

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Language Generation in the Limit with Bounded Memory

Kleinberg, Jon, Mehrotra, Anay, Saberi, Amin, Velegkas, Grigoris

arXiv.org Machine LearningMay-29-2026

We study language generation in the limit under bounded memory. In this task, a learner observes examples from an unknown target language one at a time and must eventually output only new valid examples. Prior work assumes access to the entire history, a strong assumption since realistic algorithms retain limited past information. Classical work in learning theory shows memory constraints dramatically alter learnability; we extend this to language generation. First, we study memoryless generators. Under a mild enumeration restriction, every countable collection of infinite languages remains generable without memory. Without this restriction, we exactly characterize when memoryless generation is possible. For finite collections, we characterize the optimal minimax density achievable by memoryless generators -- the best density guaranteed against any collection of a given size. This combinatorial bound relies on Sperner's theorem and symmetric chain decompositions. We further show that a sliding window of the last $W$ examples does not improve this worst-case density, whereas allowing it to store $b$ adaptively chosen past examples improves the achievable density for every $b \geq 1$. Finally, we revisit identification in the limit, where the learner must converge to a single correct hypothesis for the target language. We focus on its incremental variant, where the learner remembers only its previous guess. Here, although exact identification fails on a collection of just three languages, a mild relaxation requiring convergence to an ``approximate'' version of the target is achievable for every finite collection. These results show bounded memory affects these tasks differently: generation remains achievable for every countable collection, while density and identification are confined to finite collections, with guarantees weakening as the collection grows.

generator, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2605.30324

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.81)

Add feedback

Causality as the Statistical Conscience of Artificial Intelligence: From Pearl's Ladder to Trustworthy Machines

Fokoué, Ernest

arXiv.org Machine LearningMay-26-2026

Modern Artificial Intelligence achieves remarkable predictive power by optimizing statistical risk functionals over vast corpora. Yet a gap separates this from genuine intelligence: the inability to distinguish correlation from causation. This paper argues that causal inference (identifying mechanisms invariant under intervention) is AI's indispensable statistical conscience. Without causal grounding, AI systems are correlation machines: powerful in familiar domains, brittle under distribution shift, and biased in high-stakes settings. Three contributions develop this argument. First, a Statistical Necessity Theorem for Causal Generalization: any algorithm achieving out-of-distribution generalization must encode causal structure, formalizing the distinction between prediction P(Y|X) and intelligence P(Y|do(X)). Second, a unified framework connects Pearl's do-calculus, the Potential Outcomes framework, Double Machine Learning, and Invariant Risk Minimization as a family of Causal Statistical Estimators, each identifying interventional distributions under different assumptions. Third, three AI failure modes (hallucination in large language models, reward hacking in reinforcement learning from human feedback, and degradation under distribution shift) are manifestations of causal blindness, each admitting a principled statistical remedy. Trustworthy AI is, at its core, a problem of causal statistics. The statistical community is not merely equipped to solve it -- it is the only community with the foundational tools to do so rigorously.

large language model, machine learning, reinforcement learning, (21 more...)

arXiv.org Machine Learning

2605.24076

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Add feedback

Mode-Shape Expansion Using Physics-Constrained Gaussian Process Regression

Ghahari, Farid

arXiv.org Machine LearningMay-25-2026

This paper addresses the challenge of reconstructing full-field structural mode shapes from sparse sensor data. While Gaussian Process Regression (GPR) offers a robust non-parametric framework for spatial interpolation and uncertainty quantification, standard formulations often yield physically inconsistent mode-shape reconstructions under sparse sensing conditions. A Physics-Constrained Single-Output Gaussian Process (CONS-SOGP) framework is derived that utilizes independent modal kernels while coupling the optimization via a mass-orthogonality penalty. The paper presents derivations for the marginal likelihood, hyperparameter gradients, and penalty coupling. Numerical verification on a multi-degree-of-freedom structure demonstrates that the proposed method overcomes existing limitations in GP-based prediction, providing more accurate and reliable expanded mode shapes.

artificial intelligence, mode shape, modeling & simulation, (16 more...)

arXiv.org Machine Learning

2605.23101

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Modeling & Simulation (0.91)

Add feedback

The Sample Complexity of Multiple Change Point Identification under Bandit Feedback

Graf, Maximilian, Thuot, Victor

arXiv.org Machine LearningMay-14-2026

We study multiple change point localization under bandit feedback. An unknown piecewise-constant function on a compact interval can be queried sequentially at adaptively chosen inputs, and each query returns a noisy evaluation of the function. The goal is to identify a prescribed number of discontinuities, known as change points, within a target precision $η$ and confidence level $1-δ$, while using as few samples as possible. We propose an adaptive algorithm that first detects intervals likely to contain change points and then refines their locations to precision $η$. We establish non-asymptotic upper bounds on its sample budget, together with corresponding lower bounds. Prior work shows that jump magnitudes alone determine the asymptotic sample complexity as $δ\to 0$. We reveal that this picture is incomplete beyond this regime. We demonstrate, both empirically and theoretically, that for general $δ$ and $η$, the complexity is jointly governed by the jumps and the relative positions of the change points.

artificial intelligence, change point, machine learning, (17 more...)

arXiv.org Machine Learning

2605.13252

Country: Europe (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

$\varepsilon$-Good Action Identification in Fixed-Budget Monte Carlo Tree Search

Li, Yinan, Nguyen, Tuan, Jun, Kwang-Sung

arXiv.org Machine LearningMay-13-2026

We study the fixed-budget max-min action identification problem in depth-2 max-min trees, an important special case of Monte Carlo Tree Search. A learner sequentially allocates $T$ samples to leaves and then recommends a subtree whose minimum leaf value is largest. Motivated by approximate planning, we focus on $\varepsilon$-good subtree identification, where any subtree whose min value is within $\varepsilon$ of the optimal maximin value is acceptable. Our main contribution is an $\varepsilon$-agnostic algorithm: it does not require $\varepsilon$ as input, but achieves instance-dependent error bounds for every meaningful $\varepsilon$. We show that the misidentification probability decays as $\exp(-\widetildeΘ(T/H_2(\varepsilon)))$, where $H_2(\varepsilon)$ captures both cross-subtree and within-subtree gaps. When each subtree has a single leaf, the problem reduces to standard fixed-budget best-arm identification, and our analysis recovers, up to accelerating factors, known $\varepsilon$-good guarantees for halving-style methods while giving a new $\varepsilon$-good guarantee for Successive Rejects. On the lower-bound side, we provide complementary positive and negative results showing that max-min identification has a different hardness structure from standard $K$-armed bandits. To our knowledge, this is the first provable fixed-budget algorithmic guarantee for max-min action identification.

artificial intelligence, planning & scheduling, subtree, (17 more...)

arXiv.org Machine Learning

2605.11324

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Proximal Path-Specific Inference

Bai, Yang, Wu, Sihan, Sun, Baoluo, Cui, Yifan

arXiv.org Machine LearningMay-12-2026

Mediation analysis (Robins & Greenland 1992, Pearl 2001, Imai, Keele & Tingley 2010, Tchetgen Tchetgen & Shpitser 2012) provides a principled framework for investigating causal mechanisms by decomposing the effect of a treatment A on an outcome Y into pathways operating through a mediator of interest M. Classical mediation analysis focuses on the natural indirect effect, corresponding to the pathway from Ato Y through M, and the natural direct effect, corresponding to pathways not through M. These estimands are well understood when a single mediator is present and strong identification assumptions hold. However, in many applications, there exist multiple intermediate variables between treatment and outcome. In such settings, conventional mediation analysis typically requires the absence of treatment-induced mediator-outcome confounders--often referred to as recanting witnesses--as well as the absence of unmeasured confounding. Under these circumstances, commonly used identification assumptions such as sequential ignorability (Imai, Keele & Yamamoto 2010) or nonparametric structural equation models with independent errors (NPSEM-IE) (Pearl 2009) no longer suffice to identify natural indirect effects (Avin et al. 2005, Tchetgen Tchetgen & VanderWeele 2014). Figure 1 illustrates this issue: the recanting witness D is directly affected by A and simultaneously confounds the relationship between M and Y. Such treatment-induced confounding is common in epidemiologic studies, particularly when the mediator of interest occurs long after the treatment initiation (Robins 1999). A motivating example arises in studies of preterm birth. Mediation analysis has been widely used to explore whether adequate prenatal care (A) reduces the risk of preterm birth (Y) through preeclampsia (M) (Vansteelandt & VanderWeele 2012, VanderWeele et al. 2014, Xia & Chan 2023).

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

2605.09462

Country: North America > United States > California (0.28)

Genre:

Research Report > Strength Medium (0.48)
Research Report > Observational Study (0.48)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Epidemiology (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.68)

Add feedback

Optimal Experiments for Partial Causal Effect Identification

Maringgele, Tobias, Etesami, Jalal

arXiv.org Machine LearningMay-11-2026

Causal queries are often only partially identifiable from observational data, and experiments that could tighten the resulting bounds are typically costly. We study the problem of selecting, prior to observing experimental outcomes, a cost-constrained subset of experiments that maximally tightens bounds on a target query. We formalize this as the max-potency problem, where epistemic potency measures the worst-case reduction in bound width guaranteed by an experiment, and show that this problem is NP-hard via a reduction from 0-1 knapsack. Building on the polynomial-programming framework of Duarte et al. (2023), we give a general procedure for evaluating epistemic potency in discrete settings. To control the super-exponential search space, we introduce two graphical pruning criteria that depend only on the causal graph and the query: a novel path-interception rule that exploits district structure to certify zero potency in linear time, and an identifiability check based on the ID algorithm. On Erdos-Renyi random graphs and 11 bnlearn benchmark networks, the two criteria together prune 50-88% of candidate experiments on average without solving a single polynomial program. For the general subset search, we show that ID-pruned experiments are combinatorially inert, yielding a super-exponential reduction in the number of subsets evaluated. We close with an end-to-end demonstration on observational NHANES data, selecting optimal experiments for estimating the effect of physical activity on diabetes.

artificial intelligence, experiment, machine learning, (18 more...)

arXiv.org Machine Learning

2605.06993

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Add feedback

Symbolic Regression via Neural Networks

Boddupalli, Nibodh, Matchen, Timothy, Moehlis, Jeff

arXiv.org Machine LearningMay-7-2026

Machine learning - specifically deep learning - techniques have shown their capabilities in approximating dynamics from data, but a shortcoming of traditional deep learning is that there is little insight into the underlying mapping beyond its numerical output for a given input. This limits their utility in analysis beyond simple prediction. Simultaneously, a number of strategies exist which identify models based on a fixed dictionary of basis functions, but most either require some intuition or insight about the system, or are susceptible to overfitting or a lack of parsimony. Here we present a novel approach that combines the flexibility and accuracy of deep learning approaches with the utility of symbolic solutions: a deep neural network that generates a symbolic expression for the governing equations. We first describe the architecture for our model, then show the accuracy of our algorithm across a range of classical dynamical systems. The dynamics of quantities of interest are widely modeled A number of authors have approached system identificaas differential equations, often derived from first princi-tion by fitting coefficients of a linear combination of basis 3ples. However, this is not always possible, especially whenfunctions, dating at least back to Crutchfield and McNamara . The The set of basis functions typically includes nonlinear terms, identification of models from data has seen significant ad-for example terms which would arise in a Taylor series exvances with the advent of machine learning. While deeppansion about the origin of the system3-6 or a broader class neural networks have enabled sufficient accuracy in fore-of functions7. The coefficients of the basis functions are decasting dynamic data with unprecedented versatility, thetermined through comparison of the original data points with models they represent lack closed-form expressions thatpoints from computed solutions to the fitted models. Varican be conducive to interpretation and analysis.

artificial intelligence, machine learning, trajectory, (18 more...)

arXiv.org Machine Learning

doi: 10.1063/5.0134464

2605.04337

Country: North America > United States > California (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

Jin, Jikai, Syrgkanis, Vasilis

arXiv.org Machine LearningMay-5-2026

Offline evaluation of language models from usage logs is biased when model choice is confounded: the same user-side factors that influence which model is used can also influence how its output is judged, so raw comparisons of logged scores mix self-selected populations rather than estimating a common quantity of interest. A small randomized experiment can break this bias by overriding model choice, but in practice such experiments are scarce and costly. We study a three-source design that combines a large confounded observational log (OBS) for scale, a small randomized experiment (EXP) for unconfounded scoring, and an offline simulator (SIM) that replays candidate models on cached contexts. Our main result is an identification theorem showing that the randomized experiment and the simulator are together enough to recover causal model values; the observational log enters only afterward, to reduce estimation error rather than to make the causal comparison valid. Six estimator families are evaluated in a controlled semi-synthetic validation and in two real-task cached benchmarks for summarization and coding. No family dominates every regime; relative performance depends on the amount of unbiased EXP supervision and on how closely the target reward aligns with OBS-derived structure.

benchmark, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2605.01311

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Leisure & Entertainment (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.50)

Add feedback

First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint

Liu, Ziqi, Lee, Kiljae, Zhang, Yuan, Tang, Weijing

arXiv.org Machine LearningMay-5-2026

Probabilistic values, including Shapley values and semivalues, provide a model-agnostic framework to attribute the behavior of a black-box model to data points or features, with a wide range of applications including explainable artificial intelligence and data valuation. However, their exact computation requires utility evaluations over exponentially many coalitions, making Monte Carlo approximation essential in modern machine learning applications. Existing estimators are often developed through different identification strategies, including weighted averages, self-normalized weighting, regression adjustment, and weighted least squares. Our key observation is that these seemingly distinct constructions share a common first-order error structure, in which the leading term is an augmented inverse-probability weighted influence term determined by the sampling law and a working surrogate function. This first-order representation yields an explicit expression for the leading mean squared error (MSE), which characterizes how the sampling law and the surrogate jointly determine statistical efficiency. Guided by this criterion, we propose an Efficiency-Aware Surrogate-adjusted Estimator (EASE) that directly chooses the sampling law and surrogate to minimize the first-order MSE. We demonstrate that EASE consistently outperforms state-of-the-art estimators for various probabilistic values.

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

2605.02827

Country: North America > United States (0.45)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback