AITopics | Markov Models

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Neural Information Processing SystemsOct-2-2025, 17:21:14 GMT

Solving OPE is often the starting point in many RL applications. To tackle the problem of OPE, the idea of importance sampling (IS) corrects the mismatch in the distributions under the behavior policy and target policy.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Budgeted Reinforcement Learning in Continuous State Space

Neural Information Processing SystemsOct-2-2025, 17:17:53 GMT

So far, BMDPs could only be solved in the case of finite state spaces with known dynamics. This work extends the state-of-the-art to continuous spaces environments and unknown dynamics. We show that the solution to a BMDP is a fixed point of a novel Budgeted Bellman Optimality operator. This observation allows us to introduce natural extensions of Deep Reinforcement Learning algorithms to address large-scale BMDPs.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: Europe > France (0.28)

Industry: Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Counterfactual Vision-and-Language Navigation: Unravelling the Unseen

Neural Information Processing SystemsOct-2-2025, 16:43:35 GMT

We propose a new learning strategy that learns both from observations and generated counterfactual environments.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Vision (0.70)
(2 more...)

Add feedback

Decision-Making with Auto-Encoding Variational Bayes Romain Lopez

Neural Information Processing SystemsOct-2-2025, 16:07:44 GMT

This approach yields biased estimates of the expected risk, and therefore leads to poor decisions for two reasons.

artificial intelligence, machine learning, variational distribution, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)
(3 more...)

Add feedback

350a7f5ee27d22dbe36698b10930ff96-Paper.pdf

Neural Information Processing SystemsOct-2-2025, 15:58:58 GMT

artificial intelligence, knockoff, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Fast Bidirectional Probability Estimation in Markov Models

Siddhartha Banerjee, Peter Lofgren

Neural Information Processing SystemsOct-2-2025, 15:53:21 GMT

We develop a new bidirectional algorithm for estimating Markov chain multi-step transition probabilities: given a Markov chain, we want to estimate the probability of hitting a given target state in null steps after starting from a given source distribution. Given the target state t, we use a (reverse) local power iteration to construct an'expanded target distribution', which has the same mean as the quantity we want to estimate, but a smaller variance - this can then be sampled efficiently by a Monte Carlo algorithm. Our method extends to any Markov chain on a discrete (finite or countable) state-space, and can be extended to compute functions of multi-step transition probabilities such as PageRank, graph diffusions, hitting/return times, etc. Our main result is that in'sparse' Markov Chains - wherein the number of transitions between states is comparable to the number of states - the running time of our algorithm for a uniform-random target node is order-wise smaller than Monte Carlo and power iteration based algorithms; in particular, our method can estimate a probability p using only O (1/ p) running time.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Adaptive Stochastic Optimization: From Sets to Paths

Zhan Wei Lim, David Hsu, Wee Sun Lee

Neural Information Processing SystemsOct-2-2025, 15:04:01 GMT

It plays a crucial role in planning and learning under uncertainty, but is, unfortunately, computationally intractable in general. This paper introduces two conditions on the objective function, the marginal likelihood rate bound and the marginal likelihood bound, which, together with pointwise submodularity, enable efficient approximate solution of ASO. Several interesting classes of functions satisfy these conditions naturally, e.g., the version space reduction function for hypothesis learning. We describe Recursive Adaptive Coverage, a new ASO algorithm that exploits these conditions, and apply the algorithm to two robot planning tasks under uncertainty. In contrast to the earlier submodu-lar optimization approach, our algorithm applies to ASO over both sets and paths .

adaptive stochastic optimization problem, algorithm, submodular, (11 more...)

Neural Information Processing Systems

Country: