AITopics | jiang

Bellman-consistent Pessimism for Offline Reinforcement Learning

Neural Information Processing Systems, Apr-25-2026, 10:48:37 GMT

The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has recently gained prominence in offline reinforcement learning. Despite the robustness it adds to the algorithm, overly pessimistic reasoning can be equally damaging in precluding the discovery of good policies, which is an issue for the popular bonus-based pessimism. In this paper, we introduce the notion of Bellman-consistent pessimism for general function approximation: instead of calculating a point-wise lower bound for the value function, we implement pessimism at the initial state over the set of functions consistent with the Bellman equations. Our theoretical guarantees only require Bellman closedness, as is standard in the exploratory setting, in which case bonus-based pessimism fails to provide guarantees. Even in the special case of linear function approximation where stronger expressivity assumptions hold, our result improves upon a recent bonus-based approach by O(d) in its sample complexity when the action space is finite and small. Remarkably, our algorithms automatically adapt to the best bias-variance tradeoff in hindsight, whereas most prior approaches require tuning extra hyperparameters a priori.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

CRAG - Comprehensive RAG Benchmark

Neural Information Processing Systems, Mar-18-2026, 07:54:41 GMT

Retrieval-Augmented Generation (RAG) has recently emerged as a promising way to alleviate the knowledge deficiencies of Large Language Models (LLMs). Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamism ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA.

artificial intelligence, large language model, natural language, (12 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

9bb93a3c1a424654aaea6f5b594e94d5-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-16-2026, 02:51:24 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: Europe > Czechia > Prague (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Neural Information Processing Systems, Feb-16-2026, 02:51:20 GMT

We provide both theoretical analysis and experimental results to validate the effectiveness of our proposed algorithm.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: Europe > Czechia > Prague (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels

Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, Pramod Viswanath

Neural Information Processing Systems, Feb-11-2026, 16:58:53 GMT

The autoencoder is a powerful unsupervised learning framework that learns latent representations by minimizing the reconstruction loss of the input data [1]. Autoencoders have been widely used in unsupervised learning tasks such as representation learning [1][2], denoising [3], and generative modeling [4][5]. Most autoencoders are under-complete autoencoders, for which the latent space is smaller than the input data [2]. Over-complete autoencoders have a latent space larger than the input data.

artificial intelligence, deep learning, machine learning, (19 more...)

Country:

North America > United States > Illinois (0.05)
Europe > Austria > Vienna (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

Neural Information Processing Systems, Feb-11-2026, 11:56:56 GMT

Define = log (M2H / ).

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.41)

Lower Bound

Neural Information Processing Systems, Feb-11-2026, 05:41:36 GMT

Then, we consider sufficient assumptions under which learning good policies requires a polynomial number of episodes.

algorithm, artificial intelligence, machine learning, (15 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

MaskPlace: Fast Chip Placement via Reinforced Visual Representation Learning

Neural Information Processing Systems, Feb-10-2026, 22:35:59 GMT

It has several appealing benefits that prior arts do not have.

artificial intelligence, machine learning, maskplace, (15 more...)

Country: Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

Neural Information Processing Systems, Feb-10-2026, 12:16:15 GMT

Most of the existing acquisition policies are only one-step optimal, that is, optimal if the decision horizon were one.

artificial intelligence, machine learning, optimization, (18 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Optimal and Adaptive Monteiro-Svaiter Acceleration

Neural Information Processing Systems, Feb-10-2026, 06:25:10 GMT

Corollary 3. Consider Algorithm 1 with initial point x0, parameters satisfying 1.1 = O( 1) and 00, and -MS oracle OaMSN (with LAZY = True in all but the first iteration) with 2 (0.01,0.99).

algorithm 1, artificial intelligence, machine learning, (14 more...)

Country: