AITopics

doi: 10.5281/zenodo.19468379

2605.21492

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.45)

Industry:

Banking & Finance (0.67)
Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)

arXiv.org Machine Learning, May-20-2026

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Fanale, Raimondo

The main XAI attribution methods for deep neural networks -- GradCAM, SHAP, LIME, Integrated Gradients -- operate on separate theoretical foundations and are not formally comparable. We present GRALIS (Gradient-Riesz Averaged Locally-Integrated Shapley), a mathematical framework establishing a representation theory for attributions: every additive, linear, and continuous attribution functional on L^2(Q,mu) admits a unique canonical representation (Q, w, Delta), proved necessary by the Riesz Representation Theorem. This class encompasses SHAP, IG, LIME and linearized GradCAM, but excludes nonlinear functionals such as standard GradCAM or attention maps. Seven formal theorems provide simultaneous guarantees absent in any individual method: (T1) necessary canonical form; (T2) exact completeness; (T3) Monte Carlo convergence O(1/sqrt(m))+O(1/k); (T4) exact Shapley Interaction Values; (T5) Hoeffding ANOVA decomposition; (T6) Sobol sensitivity generalization; (T7) multi-scale extension (MS-GRALIS) with minimum-variance weights. An algebraic appendix justifies the GRALIS-SIV correspondence via the Möbius transform without circularity. GRALIS satisfies 13.5/14 axiomatic properties vs. 2.5-6/14 for individual methods, including completeness, sensitivity, locality, order-k interactions and optimal multi-scale aggregation simultaneously. Preliminary validation on BreaKHis (1,187 histology images, DenseNet-121) reports deletion faithfulness AUC +0.015 (malignant), 96% class-conditional consistency, SAL = 0.762+/-0.109 and sparsity index 0.39. Extended comparison with baseline XAI methods is planned for a companion paper.
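The completeness property named in the abstract can be illustrated with plain Integrated Gradients, one of the methods the abstract places inside the linear class. The sketch below is not GRALIS itself. The toy function `f`, its hand-written gradient `grad_f`, the zero baseline, and the midpoint-rule discretization are all assumptions made for illustration. It checks numerically that the attributions sum to f(x) - f(baseline).

```python
import numpy as np

def f(x):
    # Hypothetical smooth "model": quadratic, interaction, and sine terms.
    return x[0] ** 2 + 3.0 * x[0] * x[1] + np.sin(x[2])

def grad_f(x):
    # Analytic gradient of the toy function above.
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0], np.cos(x[2])])

def integrated_gradients(x, baseline, steps=1000):
    # Midpoint Riemann sum of the gradient along the straight path
    # from the baseline to x, scaled by (x - baseline).
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([grad_f(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, -2.0, 0.5])
baseline = np.zeros(3)
attributions = integrated_gradients(x, baseline)

# Completeness: attributions should sum to f(x) - f(baseline),
# up to the discretization error of the Riemann sum.
completeness_gap = abs(attributions.sum() - (f(x) - f(baseline)))
print(attributions, completeness_gap)
```

The gap shrinks as `steps` grows, which is the discretization side of the convergence behavior; the sampling error term in (T3) would only appear if the path integral were estimated by Monte Carlo instead.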

artificial intelligence, gralis, machine learning, (20 more...)

2605.0548

Country: Europe > Italy (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Machine Learning, May-11-2026

Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching

Li, Xiang, Jiang, Nan

We present a novel theoretical framework, Q-MMR, for off-policy evaluation in finite-horizon MDPs. Q-MMR learns a set of scalar weights, one for each data point, such that the reweighted rewards approximate the expected return under the target policy. The weights are learned inductively in a top-down manner via a moment matching objective against a value-function discriminator class. Notably, and perhaps surprisingly, a data-dependent finite-sample guarantee for general function approximation can be established under only the realizability of $Q^π$, with a dimension-free bound -- that is, the error does not depend on the statistical complexity of the function class. We also establish connections to several existing methods, such as importance sampling and linear FQE. Further theoretical analyses shed new light on the nature of coverage, a concept of fundamental importance to offline RL.
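The idea of reweighting logged rewards so that their average approximates the target policy's return can be seen in ordinary trajectory-wise importance sampling, which the abstract lists as a related method. The sketch below is only that baseline, not Q-MMR: the weights here are fixed density ratios, not learned by moment matching. The single-state, two-action problem, the policies, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

horizon, n_traj = 3, 200_000
reward = np.array([0.0, 1.0])      # reward of each action (toy assumption)
behavior = np.array([0.5, 0.5])    # policy that generated the logged data
target = np.array([0.2, 0.8])      # policy we want to evaluate

# Logged data: one action per step, drawn from the behavior policy.
actions = rng.choice(2, size=(n_traj, horizon), p=behavior)

# One scalar weight per trajectory: product of per-step probability ratios.
weights = np.prod(target[actions] / behavior[actions], axis=1)

# The reweighted returns average to the target policy's expected return.
returns = reward[actions].sum(axis=1)
is_estimate = np.mean(weights * returns)

# Ground truth for this toy problem: expected reward per step times horizon.
true_value = horizon * (target @ reward)
print(is_estimate, true_value)
```

In this toy setting the estimator is unbiased but its variance grows with the horizon, because the weights are products of ratios.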

ddh, machine learning, reinforcement learning, (19 more...)

2605.06474

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Neural Information Processing Systems, May-1-2026, 01:32:50 GMT

ComENet: Towards Complete and Efficient Message Passing for 3D Molecular Graphs

Many real-world data can be modeled as 3D graphs, but learning representations that incorporate 3D information completely and efficiently is challenging. Existing methods either use partial 3D information or suffer from excessive computational cost. To incorporate 3D information completely and efficiently, we propose a novel message passing scheme that operates within the 1-hop neighborhood.

artificial intelligence, completeness, machine learning, (18 more...)

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Neural Information Processing Systems, Apr-27-2026, 23:56:31 GMT

d6383e7643415842b48a5077a1b09c98-Supplemental-Conference.pdf

artificial intelligence, machine learning, probability, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems, Apr-24-2026, 20:16:49 GMT

Causal Identification under Markov equivalence: Calculus, Algorithm, and Completeness

One common task in many data science applications is to answer questions about the effect of new interventions, like: 'what would happen to Y if we make X equal to x while observing covariates Z = z?'. Formally, this is known as conditional effect identification, where the goal is to determine whether a post-interventional distribution is computable from the combination of an observational distribution and assumptions about the underlying domain represented by a causal diagram. A plethora of methods has been developed for solving this problem, including the celebrated do-calculus [Pearl, 1995]. In practice, these results are not always applicable since they require a fully specified causal diagram as input, which is usually not available. In this paper, we assume as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, learnable from observational data.

artificial intelligence, causal diagram, identification, (15 more...)

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.88)

Neural Information Processing Systems, Apr-24-2026, 19:34:12 GMT

Supplementary Material: Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias

In this section we provide a detailed proof for the correctness and completeness of the ICD algorithm. For easier referencing we describe ICD in Algorithm 1, and describe the ICD-Sep conditions. A set $\mathbf{Z}$ is a subset of ICD-Sep$(A,B)$ given $r \in \{0,\dots,|\mathbf{O}|-2\}$ if and only if 1. $|\mathbf{Z}| = r$, 2. $\forall Z \in \mathbf{Z}$, there exists a PDS-path $\Pi_B(A,Z)$ such that (a) $|\Pi_B(A,Z)| \le r$ and (b) every node on $\Pi_B(A,Z)$ is in $\mathbf{Z}$, and 3. $\forall Z \in \mathbf{Z}$, node $Z$ is a possible ancestor of $A$ or $B$ (not a necessary condition). Denote $A,B$ a pair of nodes from $\mathbf{O}$ that are connected in $G$ and disconnected in $D$, and such that $A$ is not an ancestor of $B$ in $D$. If $A \perp\!\!\!\perp B \mid [\mathbf{Z}_0] \cup \mathbf{S}$, where $\mathbf{Z}_0 \subset \mathbf{O}$ is a minimal separating set having size $n+1$, then there exists a subset $\mathbf{Z} \subset \mathbf{O}$ having the same size of $n+1$ such that $A \perp\!\!\!\perp B \mid \mathbf{Z} \cup \mathbf{S}$, and for every node $Z \in \mathbf{Z}$ there exists a PDS-path $\Pi_B(A,Z)$ in $G$, such that every node $V$ on the PDS-path is also in $\mathbf{Z}$. Proof. It was previously shown that a minimal separating set for $A$ and $B$, where $A$ is not an ancestor of $B$, is a subset of D-Sep$(A,B)$ (Spirtes et al., 2000, page 134 and Theorem 6.2; Spirtes et al., 1999).

artificial intelligence, machine learning, node, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.48)
Information Technology > Enterprise Applications > Customer Relationship Management (0.41)

Neural Information Processing Systems, Apr-24-2026, 09:15:59 GMT

Contents of Appendix

Bayes-consistency only holds for the full family of measurable functions, which of course is distinct from the more restricted hypothesis set used by a learning algorithm. Therefore, a hypothesis set-dependent notion of H-consistency has been proposed by Long and Servedio (2013) in the realizable setting, used by Zhang and Agarwal (2020) for linear models, and generalized by Kuznetsov et al. (2014) to the structured prediction case. Long and Servedio (2013) showed that there exists a case where a Bayes-consistent loss is not H-consistent while inconsistent losses can be H-consistent. Zhang and Agarwal (2020) further investigated the phenomenon in (Long and Servedio, 2013) and showed that the situation of losses that are not H-consistent with linear models can be remedied by carefully choosing a larger piecewise linear hypothesis set. Kuznetsov et al. (2014) proved positive results for the H-consistency of several multi-class ensemble algorithms, as an extension of H-consistency results in (Long and Servedio, 2013). Recently, the notions of H-calibration and H-consistency have been used by Bao et al. (2020); Awasthi et al. (2021a) in the study of adversarial binary classification losses, as defined in (Goodfellow et al., 2014; Madry et al., 2017; Tsipras et al., 2018; Carlini and Wagner, 2017; Awasthi et al., 2023).

artificial intelligence, machine learning, ymax, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning, Apr-3-2026

Observable Geometry of Singular Statistical Models

Plummer, Sean

Singular statistical models arise whenever different parameter values induce the same distribution, leading to non-identifiability and a breakdown of classical asymptotic theory. While existing approaches analyze these phenomena in parameter space, the resulting descriptions depend heavily on parameterization and obscure the intrinsic statistical structure of the model. In this paper, we introduce an invariant framework based on observable charts: collections of functionals of the data distribution that distinguish probability measures. These charts define local coordinate systems directly on the model space, independent of parameterization. We formalize observable completeness as the ability of such charts to detect identifiable directions, and introduce observable order to quantify higher-order distinguishability along analytic perturbations. Our main result establishes that, under mild regularity conditions, observable order provides a lower bound on the rate at which Kullback-Leibler divergence vanishes along analytic paths. This connects intrinsic geometric structure in model space to statistical distinguishability and recovers classical behavior in regular models while extending naturally to singular settings. We illustrate the framework in reduced-rank regression and Gaussian mixture models, where observable coordinates reveal both identifiable structure and singular degeneracies. These results suggest that observable charts provide a unified and parameterization-invariant language for studying singular models and offer a pathway toward intrinsic formulations of invariants such as learning coefficients.

artificial intelligence, machine learning, observable chart, (17 more...)

2604.01267

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)