AITopics

Gaussian processes (GPs) are powerful and widely used probabilistic regression models, but their effectiveness in practice is often limited by the choice of kernel function. This kernel function is typically handcrafted from a small set of standard functions, a process that requires expert knowledge, results in limited adaptivity to data, and imposes strong assumptions on the hypothesis space. We study Empirical GPs, a principled framework for constructing flexible, data-driven GP priors that overcome these limitations. Rather than relying on standard parametric kernels, we estimate the mean and covariance functions empirically from a corpus of historical observations, enabling the prior to reflect rich, non-trivial covariance structures present in the data. Theoretically, we show that the resulting model converges to the GP that is closest (in the KL-divergence sense) to the real data-generating process. Practically, we formulate the problem of learning the GP prior from independent datasets as likelihood estimation and derive an Expectation-Maximization algorithm with closed-form updates, allowing the model to handle heterogeneous observation locations across datasets. We demonstrate that Empirical GPs achieve competitive performance on learning curve extrapolation and time series forecasting benchmarks.
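The idea of estimating a GP prior from historical data can be illustrated with a minimal sketch. It assumes the simplest case, in which every historical curve is observed on the same grid, so the prior mean and covariance reduce to the sample mean and sample covariance. It is not the paper's EM algorithm, which is what handles heterogeneous observation locations; the function names `empirical_gp_prior` and `condition` and the `noise_var` parameter are hypothetical.

```python
import numpy as np

def empirical_gp_prior(curves):
    """Estimate a GP prior mean and covariance on a shared grid.

    curves: array of shape (n_datasets, n_grid), one historical curve per row.
    Assumption: all curves share the same observation locations.
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Unbiased sample covariance across historical curves.
    cov = centered.T @ centered / (curves.shape[0] - 1)
    return mean, cov

def condition(mean, cov, obs_idx, y_obs, noise_var=1e-6):
    """Standard Gaussian conditioning on noisy observations at grid indices."""
    obs_idx = np.asarray(obs_idx)
    K_oo = cov[np.ix_(obs_idx, obs_idx)] + noise_var * np.eye(len(obs_idx))
    K_ao = cov[:, obs_idx]
    resid = np.asarray(y_obs, dtype=float) - mean[obs_idx]
    post_mean = mean + K_ao @ np.linalg.solve(K_oo, resid)
    post_cov = cov - K_ao @ np.linalg.solve(K_oo, K_ao.T)
    return post_mean, post_cov
```

Given a prefix of a new curve, `condition` returns the posterior over the whole grid, which is the extrapolation setting the abstract mentions.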

artificial intelligence, bayesian inference, machine learning, (18 more...)

2602.12082

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Oceania > Samoa (0.04)
Oceania > American Samoa (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Liautaud, Paul, Gaillard, Pierre, Wintenberger, Olivier

High-Probability Minimax Adaptive Estimation in Besov Spaces via Online-to-Batch

We study nonparametric regression over Besov spaces from noisy observations under sub-exponential noise, aiming to achieve minimax-optimal guarantees on the integrated squared error that hold with high probability and adapt to the unknown noise level. To this end, we propose a wavelet-based online learning algorithm that dynamically adjusts to the observed gradient noise by adaptively clipping it at an appropriate level, eliminating the need to tune parameters such as the noise variance or gradient bounds. As a by-product of our analysis, we derive high-probability adaptive regret bounds that scale with the $\ell_1$-norm of the competitor. Finally, in the batch statistical setting, we obtain adaptive and minimax-optimal estimation rates for Besov spaces via a refined online-to-batch conversion. This approach carefully exploits the structure of the squared loss in combination with self-normalized concentration inequalities.

artificial intelligence, machine learning, (19 more...)

2602.11747

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.34)

Lee, Kyungbok, Sarteau, Angelica Cristello, Kosorok, Michael R.

Provable Offline Reinforcement Learning for Structured Cyclic MDPs

artificial intelligence, machine learning, provable offline reinforcement learning, (13 more...)

We introduce a novel cyclic Markov decision process (MDP) framework for multi-step decision problems with heterogeneous stage-specific dynamics, transitions, and discount factors across the cycle. In this setting, offline learning is challenging: optimizing a policy at any stage shifts the state distributions of subsequent stages, propagating mismatch across the cycle. To address this, we propose a modular structural framework that decomposes the cyclic process into stage-wise sub-problems. While generally applicable, we instantiate this principle as CycleFQI, an extension of fitted Q-iteration enabling theoretical analysis and interpretation. It uses a vector of stage-specific Q-functions, tailored to each stage, to capture within-stage sequences and transitions between stages. This modular design enables partial control, allowing some stages to be optimized while others follow predefined policies. We establish finite-sample suboptimality error bounds and derive global convergence rates under Besov regularity, demonstrating that CycleFQI mitigates the curse of dimensionality compared to monolithic baselines. Additionally, we propose a sieve-based method for asymptotic inference of optimal policy values under a margin condition. Experiments on simulated and real-world Type 1 Diabetes data sets demonstrate CycleFQI's effectiveness.

2602.11679

Country:

North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report > Experimental Study (0.45)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Education > Health & Safety > School Nutrition (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Karim, Sunny R., Nielsen, Morten Ørregaard, MacKinnon, James G., Webb, Matthew D.

Improved Inference for CSDID Using the Cluster Jackknife

Obtaining reliable inferences with traditional difference-in-differences (DiD) methods can be difficult. Problems can arise when both outcomes and errors are serially correlated, when there are few clusters or few treated clusters, when cluster sizes vary greatly, and in various other cases. In recent years, recognition of the "staggered adoption" problem has shifted the focus away from inference towards consistent estimation of treatment effects. One of the most popular new estimators is the CSDID procedure of Callaway and Sant'Anna (2021). We find that the issues of over-rejection with few clusters and/or few treated clusters are at least as severe for CSDID as for traditional DiD methods. We also propose using a cluster jackknife for inference with CSDID, which simulations suggest greatly improves inference. We provide software packages for Stata (csdidjack) and R (didjack) to calculate cluster-jackknife standard errors easily.
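As a rough illustration of the cluster jackknife mentioned in the abstract, the sketch below computes a leave-one-cluster-out standard error for a generic scalar estimator. It is not the csdidjack or didjack implementation and does not implement CSDID; `cluster_jackknife_se` and its `estimator` argument are hypothetical names. This version centers on the mean of the leave-one-out estimates; some variants center on the full-sample estimate instead.

```python
import numpy as np

def cluster_jackknife_se(estimator, data, cluster_ids):
    """Leave-one-cluster-out jackknife standard error for a scalar estimator.

    estimator: hypothetical callable mapping a data array (rows are
               observations) to a float.
    cluster_ids: one cluster label per row of data.
    """
    data = np.asarray(data)
    cluster_ids = np.asarray(cluster_ids)
    clusters = np.unique(cluster_ids)
    G = len(clusters)
    # Re-estimate with each cluster dropped in turn.
    loo = np.array([estimator(data[cluster_ids != g]) for g in clusters])
    # Jackknife variance, centered on the mean of the leave-one-out estimates.
    var = (G - 1) / G * np.sum((loo - loo.mean()) ** 2)
    return float(np.sqrt(var))
```

When every observation is its own cluster and the estimator is the sample mean, this reduces to the usual standard error of the mean, which is a convenient sanity check.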

artificial intelligence, att, machine learning, (17 more...)

2602.12043

Country:

North America > United States > Indiana (0.05)
North America > United States > Wisconsin (0.04)
North America > United States > South Carolina (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing Systems, Feb-12-2026, 23:52:49 GMT

Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization

Xiaoxuan Zhang, Mingrui Liu, Xun Zhou, Tianbao Yang

Neural Information Processing Systems http://nips.cc/

algorithm, optimal threshold, posterior probability, (10 more...)

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > Canada > Quebec > Montreal (0.04)

Industry: Education > Educational Setting > Online (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Neural Information Processing Systems, Feb-12-2026, 23:52:23 GMT

4fd5cfd2e31bebbccfa5ffa354c04bdc-Paper-Conference.pdf

dataset, language model, tabular data, (16 more...)

Country:

North America > United States > California (0.04)
Asia > Pakistan (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Government (1.00)
(6 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(7 more...)

Neural Information Processing Systems, Feb-12-2026, 23:32:31 GMT

65b0df23fd2d449ae1e4b2d27151d73b-Paper.pdf

artificial intelligence, machine learning, posterior, (17 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Feb-12-2026, 23:32:16 GMT

Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients

Several classical results have shown that, for $\rho$-Hessian Lipschitz functions (see Definition 1), second-order information such as the Hessian [33] or Hessian-vector products [1, 9, 2] can be used to find an $\epsilon$-approximate second-order stationary point (SOSP, $\|\nabla f(x)\| \le \epsilon$ and $\nabla^2 f(x) \succeq -\sqrt{\rho\epsilon}\, I$).
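To make the stationarity condition concrete, here is a small sketch that checks whether a point qualifies as an approximate SOSP. It assumes the standard definition from the saddle-point literature (gradient norm at most $\epsilon$ and smallest Hessian eigenvalue at least $-\sqrt{\rho\epsilon}$) and takes the gradient and Hessian as given inputs, so it only illustrates the definition and is not the paper's zeroth-order method. The name `is_approx_sosp` is hypothetical.

```python
import numpy as np

def is_approx_sosp(grad, hess, eps, rho):
    """Check the assumed eps-approximate SOSP conditions at one point.

    grad: gradient vector at the point.
    hess: symmetric Hessian matrix at the point.
    """
    small_gradient = np.linalg.norm(grad) <= eps
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    min_eig = np.linalg.eigvalsh(hess)[0]
    nearly_convex = min_eig >= -np.sqrt(rho * eps)
    return bool(small_gradient and nearly_convex)
```

For example, the origin is a saddle point of $f(x, y) = x^2 - y^2$: the gradient vanishes there, but the Hessian has eigenvalue $-2$, so the check fails for small $\epsilon$.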

algorithm, artificial intelligence, machine learning, (16 more...)