AITopics | assum

0bd7c1a579459520e4d731a14b7bda7d-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 18:25:15 GMT

estimator, kernel, probability, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Average-reward reinforcement learning in semi-Markov decision processes via relative value iteration

Yu, Huizhen, Wan, Yi, Sutton, Richard S.

arXiv.org Artificial IntelligenceDec-9-2025

This paper applies the authors' recent results on asynchronous stochastic approximation (SA) in the Borkar-Meyn framework to reinforcement learning in average-reward semi-Markov decision processes (SMDPs). We establish the convergence of an asynchronous SA analogue of Schweitzer's classical relative value iteration algorithm, RVI Q-learning, for finite-space, weakly communicating SMDPs. In particular, we show that the algorithm converges almost surely to a compact, connected subset of solutions to the average-reward optimality equation, with convergence to a unique, sample path-dependent solution under additional stepsize and asynchrony conditions. Moreover, to make full use of the SA framework, we introduce new monotonicity conditions for estimating the optimal reward rate in RVI Q-learning. These conditions substantially expand the previously considered algorithmic framework and are addressed through novel arguments in the stability and convergence analysis of RVI Q-learning.

assum, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2512.06218

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Add feedback

Counterfactual Forecasting For Panel Data

Deb, Navonil, Dwivedi, Raaz, Basu, Sumanta

arXiv.org Machine LearningNov-11-2025

We address the challenge of forecasting counterfactual outcomes in a panel data with missing entries and temporally dependent latent factors -- a common scenario in causal inference, where estimating unobserved potential outcomes ahead of time is essential. We propose Forecasting Counterfactuals under Stochastic Dynamics (Focus), a method that extends traditional matrix completion methods by leveraging time series dynamics of the factors, thereby enhancing the prediction accuracy of future counterfactuals. Building upon a PCA estimator, our method accommodates both stochastic and deterministic components within the factors, and provides a flexible framework for various applications. In case of stationary autoregressive factors and under standard conditions, we derive error bounds and establish asymptotic normality of our estimator. Empirical evaluations demonstrate that our method outperforms existing benchmarks when the latent factors have an autoregressive component. We illustrate Focus results on HeartSteps, a mobile health study, illustrating its effectiveness in forecasting step counts for users receiving activity prompts, thereby leveraging temporal patterns in user behavior.

artificial intelligence, assum, machine learning, (17 more...)

arXiv.org Machine Learning

2511.06189

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

0bd7c1a579459520e4d731a14b7bda7d-Paper-Conference.pdf

Neural Information Processing SystemsOct-11-2025, 00:51:01 GMT

estimator, kernel, probability, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Learning Counterfactual Distributions via Kernel Nearest Neighbors

Choi, Kyuseong, Feitelberg, Jacob, Chin, Caleb, Agarwal, Anish, Dwivedi, Raaz

arXiv.org Machine LearningDec-2-2024

Consider a setting with multiple units (e.g., individuals, cohorts, geographic locations) and outcomes (e.g., treatments, times, items), where the goal is to learn a multivariate distribution for each unit-outcome entry, such as the distribution of a user's weekly spend and engagement under a specific mobile app version. A common challenge is the prevalence of missing not at random data, where observations are available only for certain unit-outcome combinations and the observation availability can be correlated with the properties of distributions themselves, i.e., there is unobserved confounding. An additional challenge is that for any observed unit-outcome entry, we only have a finite number of samples from the underlying distribution. We tackle these two challenges by casting the problem into a novel distributional matrix completion framework and introduce a kernel based distributional generalization of nearest neighbors to estimate the underlying distributions. By leveraging maximum mean discrepancies and a suitable factor model on the kernel mean embeddings of the underlying distributions, we establish consistent recovery of the underlying distributions even when data is missing not at random and positivity constraints are violated. Furthermore, we demonstrate that our nearest neighbors approach is robust to heteroscedastic noise, provided we have access to two or more measurements for the observed unit-outcome entries, a robustness not present in prior works on nearest neighbors with single measurements.

assum, kernel -nn, missingness, (16 more...)

arXiv.org Machine Learning

2410.13381

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)

Add feedback

Supervised Kernel Thinning

Gong, Albert, Choi, Kyuseong, Dwivedi, Raaz

arXiv.org Machine LearningOct-17-2024

The kernel thinning algorithm of Dwivedi & Mackey (2024) provides a better-than-i.i.d. compression of a generic set of points. By generating high-fidelity coresets of size significantly smaller than the input points, KT is known to speed up unsupervised tasks like Monte Carlo integration, uncertainty quantification, and non-parametric hypothesis testing, with minimal loss in statistical accuracy. In this work, we generalize the KT algorithm to speed up supervised learning problems involving kernel methods. Specifically, we combine two classical algorithms--Nadaraya-Watson (NW) regression or kernel smoothing, and kernel ridge regression (KRR)--with KT to provide a quadratic speed-up in both training and inference times. We show how distribution compression with KT in each setting reduces to constructing an appropriate kernel, and introduce the Kernel-Thinned NW and Kernel-Thinned KRR estimators. We prove that KT-based regression estimators enjoy significantly superior computational efficiency over the full-data estimators and improved statistical efficiency over i.i.d. subsampling of the training data. En route, we also provide a novel multiplicative error guarantee for compressing with KT. We validate our design choices with both simulations and real data experiments.

estimator, kernel, probability, (15 more...)

arXiv.org Machine Learning

2410.13749

Country: