AITopics

2605.16645

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Schlaginhaufen, Andreas, Kamgarpour, Maryam

Fast Rates for Inverse Reinforcement Learning

arXiv.org Machine Learning, May-15-2026

We establish novel structural and statistical results for entropy-regularized min-max inverse reinforcement learning (Min-Max-IRL) with linear reward classes in finite-horizon MDPs with Borel state and action spaces. On the structural side, we show that maximum likelihood estimation (MLE) and Min-Max-IRL are equivalent at the population level, and at the empirical level under deterministic dynamics. On the statistical side, exploiting pseudo-self-concordance of the Min-Max-IRL loss, we prove that both the trajectory-level KL divergence and the squared parameter error in the Hessian norm decay at the fast rate $\mathcal{O}(n^{-1})$, where $n$ is the number of expert trajectories. Our guarantees apply under misspecification and require no exploration assumptions. We further extend reward-identifiability results to general Borel spaces and derive novel results on the derivatives of the soft-optimal value function with respect to reward parameters.
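As a rough illustration of the soft-optimal value function mentioned in the abstract, the following sketch runs entropy-regularized backward induction on a small tabular, finite-horizon MDP. The tabular setting, the unit temperature, the stationary reward table, and the function name `soft_value_iteration` are all simplifying assumptions made here; the paper itself works with Borel state and action spaces and linear reward classes.

```python
import math


def soft_value_iteration(reward, transition, horizon):
    """Entropy-regularized backward induction (assumed temperature 1).

    reward[s][a]         : reward for action a in state s (assumed stationary)
    transition[s][a][s2] : probability of moving from s to s2 under a
    Returns values[h][s] for h = 0..horizon, with values[horizon][s] = 0.
    """
    n_states = len(reward)
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    for h in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # Soft Q-values: immediate reward plus expected next-step value.
            q = [
                reward[s][a]
                + sum(p * values[h + 1][s2] for s2, p in enumerate(transition[s][a]))
                for a in range(len(reward[s]))
            ]
            # Numerically stable log-sum-exp replaces the hard max.
            m = max(q)
            values[h][s] = m + math.log(sum(math.exp(x - m) for x in q))
    return values


# Toy example: one state, two zero-reward actions, horizon 2.
toy_values = soft_value_iteration([[0.0, 0.0]], [[[1.0], [1.0]]], horizon=2)
```

With zero rewards, each step contributes only the entropy bonus of a uniform choice over two actions, so the value at the first step is twice the logarithm of 2.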

artificial intelligence, machine learning, reinforcement learning

2605.14599

Genre: Research Report (0.64)

Linkerhägner, Jonas, Bortolasi, Michele, Baldassari, Lorenzo, de Hoop, Maarten V., Dokmanić, Ivan

On Observation Time for Recovering Latent Hawkes Networks

arXiv.org Machine Learning, May-12-2026

Dynamics of interacting systems in engineering, society, and nature often evolve over latent networks that govern which entities can interact. We study the problem of inferring these networks from event-based observations, which arise naturally in finance, seismology, and neuroscience. While there is substantial algorithmic work addressing this important problem, theoretical results are scarce. In this paper we ask the following fundamental question: what is the minimum time that one must observe the dynamics in order to exactly recover the underlying network, as a function of the number $d$ of interacting entities? For a class of stationary Hawkes processes with sparse, weak interactions, we prove that an observation time of order $\log d$ is sufficient and necessary. For the upper bound we construct a two-stage estimator that uses clipped and binned event data for screening, followed by a least-squares refinement, and apply concentration bounds derived from the Poisson cluster representation. For the lower bound we combine Fano's inequality with Jacod's Girsanov formula for point processes on a suitable subclass of networks.
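The screening stage described above works on clipped and binned event data. A minimal sketch of that preprocessing step, for a single entity's event times, might look as follows. The bin width, the clipping level, and the function name `bin_and_clip` are hypothetical choices made for illustration; the abstract does not specify how the authors set them.

```python
import math


def bin_and_clip(event_times, horizon, bin_width, clip=1):
    """Count events per time bin on [0, horizon), then cap each count at `clip`.

    All parameter choices here are illustrative assumptions, not values
    taken from the paper.
    """
    num_bins = int(math.ceil(horizon / bin_width))
    counts = [0] * num_bins
    for t in event_times:
        if 0 <= t < horizon:
            # Guard against floating-point edge cases at the last bin.
            index = min(int(t // bin_width), num_bins - 1)
            counts[index] += 1
    # Clipping bounds the influence of bursts of events inside one bin.
    return [min(c, clip) for c in counts]


# Four events over an observation window of length 4 with unit-width bins.
clipped = bin_and_clip([0.1, 0.2, 1.5, 3.9], horizon=4.0, bin_width=1.0)
```

Here the first bin contains two events but is clipped to 1, and the third bin is empty.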

artificial intelligence, bayesian inference, machine learning

2605.084

Country: North America > United States (0.92)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)

arXiv.org Machine Learning, May-7-2026

Regime-Conditioned Evaluation in Multi-Context Bayesian Optimization

Thomas, Noel

Published transfer-BO comparisons often estimate an average treatment effect of acquisition choice over hidden regime variables, while practitioners need the conditional effect for their specific prior quality, budget ratio, and metric. An audit of 40 transfer-BO papers from NeurIPS, ICML, ICLR, AISTATS, UAI, TMLR, JMLR, and AutoML-Conf (2022-2025) finds that 98% never vary B/|A| as a controlled axis. On the same GDSC2 benchmark, changing only the budget reverses the ranking: at B=50, Greedy outperforms UCB by 0.050 Hit@1, while at B=100, UCB outperforms Greedy by 0.035. We capture this transition with the Portable Regime Score PRS=(B/|A|)(1-rho), where rho is the prior rank correlation and can be estimated from pilot contexts before the main comparison. Across 79 conditions spanning chemistry, drug-response biology, and HPO, a hierarchical model gives beta=0.50 (p=1.1e-9), and 19% of conditions fall in an equivalence zone where |advantage|<0.01 Hit@1. In five published reversal cases, PRS predicts the winner from pre-comparison observables. A No-Free-Leaderboard proposition explains why unconditional rankings are unstable: when CATE changes sign across regimes, the reported ATE becomes a function of benchmark mixture. RegimePlanner, which estimates rho online and switches acquisition accordingly, wins all 16 HPO-B search spaces at B=100 and exceeds the matched {Greedy,UCB} per-context oracle on GDSC2 by 18%. Pre-registered predictions achieve 27/40=67.5% overall accuracy and above 90% within EMA prior families. The practical protocol is simple: report B/|A|, rho, K, and metric alongside any claimed acquisition advantage.
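The Portable Regime Score is a simple closed-form quantity, so a short sketch can make the arithmetic concrete. The function below evaluates PRS=(B/|A|)(1-rho) exactly as written in the abstract. Reading |A| as the size of the candidate set, the function name `portable_regime_score`, and the example values for |A| and rho are assumptions made here; the abstract gives no threshold at which the preferred acquisition function switches, so none is encoded.

```python
def portable_regime_score(budget, num_candidates, rho):
    """PRS = (B / |A|) * (1 - rho), following the formula in the abstract.

    budget         : evaluation budget B
    num_candidates : |A|, assumed here to be the candidate-set size
    rho            : prior rank correlation, assumed to lie in [-1, 1]
    """
    if num_candidates <= 0:
        raise ValueError("num_candidates must be positive")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return (budget / num_candidates) * (1.0 - rho)


# Hypothetical setting: |A| = 200 candidates and a pilot estimate rho = 0.5.
prs_small_budget = portable_regime_score(50, 200, 0.5)
prs_large_budget = portable_regime_score(100, 200, 0.5)
```

Doubling the budget at a fixed |A| and rho doubles the score, which is the axis along which the abstract reports the Greedy versus UCB ranking reversing.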

benchmark, machine learning, natural language

2605.04895

Country: Asia > Middle East > UAE (0.27)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

Neural Information Processing Systems, Apr-30-2026, 19:48:33 GMT

Supplementary for Neural Methods for Point-wise Dependency Estimation

In this section, we shall show detailed derivations for the point-wise dependency estimation methods. Four approaches are discussed: Variational Bounds of Mutual Information, Density Matching, Probabilistic Classifier, and Density-Ratio Fitting. For convenience, we define $\Omega = \mathcal{X} \times \mathcal{Y}$. Let $P_{X,Y}$ and $P_X P_Y$ (which can also be written as $P_X \otimes P_Y$) be the probability measures over $\sigma$-algebras over $\Omega$, with their probability densities being the Radon-Nikodym derivatives (i.e., $p(x,y) = dP_{X,Y}/d\mu$ and $p(x)p(y) = dP_X P_Y/d\mu$, with $\mu$ being the Lebesgue measure). These estimators have the logarithm of point-wise dependency (PMI) as the intermediate product, which we will show in the following. Let $\mathcal{M}$ be any class of functions $m: \Omega \to \mathbb{R}$. Proposition 1 ($I_{\mathrm{NWJ}}$ and its neural estimation, restating Nguyen-Wainwright-Jordan bound [5, 18]).
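For intuition about the quantity these estimators target, the sketch below computes the logarithm of the point-wise dependency, log p(x,y)/(p(x)p(y)), exactly for a small discrete joint distribution. The discrete setting and the function name `pointwise_mutual_information` are assumptions made for illustration; the supplementary material concerns neural estimators over general spaces, which this does not implement.

```python
import math


def pointwise_mutual_information(joint):
    """Return log p(x, y) / (p(x) p(y)) for every cell of a joint table.

    joint[i][j] is p(x_i, y_j); entries are assumed to sum to 1.
    Cells with zero joint probability map to negative infinity.
    """
    p_x = [sum(row) for row in joint]        # marginal over y
    p_y = [sum(col) for col in zip(*joint)]  # marginal over x
    return [
        [
            math.log(p / (p_x[i] * p_y[j])) if p > 0 else float("-inf")
            for j, p in enumerate(row)
        ]
        for i, row in enumerate(joint)
    ]


# Independent variables: the dependency ratio is 1 everywhere, so PMI is 0.
pmi_independent = pointwise_mutual_information([[0.25, 0.25], [0.25, 0.25]])
# Positively associated variables: diagonal cells get positive PMI.
pmi_dependent = pointwise_mutual_information([[0.4, 0.1], [0.1, 0.4]])
```

Averaging the PMI values under the joint distribution recovers the mutual information, which is why the mutual-information bounds listed above yield PMI as an intermediate product.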

artificial intelligence, machine learning, objective

Country: Asia > Middle East > Jordan (0.24)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Neural Information Processing Systems, Apr-29-2026, 08:40:03 GMT

aa5f224975a67914067519faddeacba3-Supplemental-Conference.pdf

artificial intelligence, machine learning, vec

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Neural Information Processing Systems, Apr-25-2026, 09:03:42 GMT

3152e3b1e52e2cb123363787d5f76c95-Supplemental.pdf

artificial intelligence, machine learning, projection

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Neural Information Processing Systems, Apr-25-2026, 06:52:16 GMT

Supplementary materials. A. On the Definition of $\mathrm{LOT}_{r,c}$

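Let $(\mathcal{X}, d_{\mathcal{X}})$ and $(\mathcal{Y}, d_{\mathcal{Y}})$ be two nonempty compact Polish spaces, $\mu \in \mathcal{M}_1^+(\mathcal{X})$, $\nu \in \mathcal{M}_1^+(\mathcal{Y})$ two probability measures on these spaces and $c: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}_+$ a nonnegative and continuous function. As $\mathcal{X}$ and $\mathcal{Y}$ are compact, $\Pi_r(\mu,\nu)$ is tight, then Prokhorov's theorem applies and the closure of $\Pi_r(\mu,\nu)$ is sequentially compact. Let us now show that $\Pi_r(\mu,\nu)$ is closed. Indeed, let $(\pi_n)_{n \geq 0}$ be a sequence of $\Pi_r(\mu,\nu)$ converging towards $\pi$. In addition, as $(\lambda_n)_{n \geq 0}$ live in the simplex $\Delta_r$, we can also extract a sub-sequence such that $\lambda_n \to \lambda \in \Delta_r$.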

artificial intelligence, lotr, probability

Technology: Information Technology > Artificial Intelligence (0.46)

Neural Information Processing Systems, Apr-25-2026, 06:52:12 GMT

Low-rank Optimal Transport: Approximation, Statistics and Debiasing

The matching principles behind optimal transport (OT) play an increasingly important role in machine learning, a trend which can be observed when OT is used to disambiguate datasets in applications (e.g.

algorithm, artificial intelligence, machine learning

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)