AITopics | continuous state space

Collaborating Authors

continuous state space

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Detecting Metastable Basins in High Dimensions via Marginal Trajectory Distribution Discrimination

Jones-McCormick, Taj

arXiv.org Machine LearningMay-26-2026

We study the problem of identifying dynamically distinct basins of attraction in high dimensional time-homogeneous Markov processes using only trajectory sampling. This problem is fundamental in the analysis of metastable dynamical systems, where the process rapidly mixes within basins while transitions between basins occur rarely on the timescale of interest, or even when the state space is reducible. Existing approaches typically rely on spatial discretization or spectral analysis of estimated transition operators, which can become unreliable in high dimensional settings or when the underlying basin geometry is highly nonlinear. We propose a discriminative approach to basin identification based on marginal trajectory distribution comparison. We prove a simple risk separation result: if two initial states belong to the same basin, the Bayes-optimal classifier distinguishing their marginal trajectory distributions achieves risk close to 1/2, whereas if they lie in distinct basins, the optimal risk is close to zero. This observation reduces basin detection to a two-sample discrimination problem between marginal trajectory distributions. Motivated by this principle, we develop a neural algorithm that receives a set of candidate basin representatives and iteratively merges them by estimating classification risk with a neural network that approximates the Bayes classifier. We evaluate the method on various metastable systems. These include synthetic systems constructed by embedding low-dimensional dynamics into high dimensional noisy ambient spaces. In these settings, standard spectral and clustering-based methods often fail, while our approach accurately recovers the underlying basin structure. These results display a shortcoming of existing methods and highlight trajectory discrimination as an effective tool for identifying dynamical basins in high dimensional stochastic systems.

artificial intelligence, basin, machine learning, (19 more...)

arXiv.org Machine Learning

2605.24136

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
(2 more...)

Add feedback

Supplementary material: Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees AProofs of lemmas and theorems

Neural Information Processing SystemsApr-25-2026, 11:51:13 GMT

A.1 Additional lemma Lemma 9 Let s0 be the starting state, let (a)n represent a sequence of actions and let M = Z(ar)Z(ar 1)...Z(a1) i.e., the product of matrices in {Z(a)}left multiplied in order of the sequence Proof Here we use proof by induction. We note that the interchange of the integral and infinite summation is justified by Section 3.7 in [5], since the coefficients Z We can then conclude the statement of the lemma by induction. A.2 Proof of Proposition 1 Proof By Lemma 9, given a fixed sequence of actions (a)n, the r-th state sr under this sequence of actions starting from state s0 has a distribution that can be represented over the basis {φn(s)}. Therefore, the expected reward under any sequence of actions for reward Ris the same as for the projected reward R0 for any state sr where r > 0. The reward at the starting state, R(s0) does not depend on the policy. Therefore, the value of R(s0) does not change whether a policy is optimal or not.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Add feedback

Solving Zero-Sum Markov Games with Continuous State via Spectral Dynamic Embedding Chenhao Zhou

Neural Information Processing SystemsFeb-16-2026, 03:14:48 GMT

In this paper, we propose a provably efficient natural policy gradient algorithm called Spectral Dynamic Embedding Policy Optimization ( SDEPO) for two-player zero-sum stochastic Markov games with continuous state space and finite action space. In the policy evaluation procedure of our algorithm, a novel kernel embedding method is employed to construct a finite-dimensional linear approximations to the state-action value function.

approximation, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

LearningGraphCellularAutomata

Neural Information Processing SystemsFeb-10-2026, 17:01:03 GMT

The transition rule is usually defined to update the state of each cell as a function of its current state and the state of its neighbours.

artificial intelligence, machine learning, transition rule, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Budgeted Reinforcement Learning in Continuous State Space

Neural Information Processing SystemsDec-25-2025, 08:53:34 GMT

A Budgeted Markov Decision Process (BMDP) is an extension of a Markov Decision Process to critical applications requiring safety constraints. It relies on a notion of risk implemented in the shape of an upper bound on a constrains violation signal that -- importantly -- can be modified in real-time. So far, BMDPs could only be solved in the case of finite state spaces with known dynamics. This work extends the state-of-the-art to continuous spaces environments and unknown dynamics. We show that the solution to a BMDP is the fixed point of a novel Budgeted Bellman Optimality operator. This observation allows us to introduce natural extensions of Deep Reinforcement Learning algorithms to address large-scale BMDPs. We validate our approach on two simulated applications: spoken dialogue and autonomous driving.

budgeted reinforcement learning, continuous state space, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Explicit Explore-Exploit Algorithms in Continuous State Spaces

Neural Information Processing SystemsDec-24-2025, 23:21:36 GMT

We present a new model-based algorithm for reinforcement learning (RL) which consists of explicit exploration and exploitation phases, and is applicable in large or infinite state spaces. The algorithm maintains a set of dynamics models consistent with current experience and explores by finding policies which induce high disagreement between their state predictions. It then exploits using the refined set of models or experience gathered during exploration. We show that under realizability and optimal planning assumptions, our algorithm provably finds a near-optimal policy with a number of samples that is polynomial in a structural complexity measure which we show to be low in several natural settings. We then give a practical approximation using neural networks and demonstrate its performance and sample efficiency in practice.

continuous state space, explicit explore-exploit algorithm, name change, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Add feedback

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Neural Information Processing SystemsDec-24-2025, 00:07:54 GMT

Inverse Reinforcement Learning (IRL) is the problem of finding a reward function which describes observed/known expert behavior. The IRL setting is remarkably useful for automated control, in situations where the reward function is difficult to specify manually or as a means to extract agent preference. In this work, we provide a new IRL algorithm for the continuous state space setting with unknown transition dynamics by modeling the system using a basis of orthonormal functions. Moreover, we provide a proof of correctness and formal guarantees on the sample and time complexity of our algorithm. Finally, we present synthetic experiments to corroborate our theoretical guarantees.

continuous state space, inverse reinforcement learning, name change, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Finding Counterfactually Optimal Action Sequences in Continuous State Spaces

Neural Information Processing SystemsDec-23-2025, 20:08:53 GMT

Whenever a clinician reflects on the efficacy of a sequence of treatment decisions for a patient, they may try to identify critical time steps where, had they made different decisions, the patient's health would have improved. While recent methods at the intersection of causal inference and reinforcement learning promise to aid human experts, as the clinician above, to analyze sequential decision making processes, they have focused on environments with finitely many discrete states. However, in many practical applications, the state of the environment is inherently continuous in nature. In this paper, we aim to fill this gap. We start by formally characterizing a sequence of discrete actions and continuous states using finite horizon Markov decision processes and a broad class of bijective structural causal models. Building upon this characterization, we formalize the problem of finding counterfactually optimal action sequences and show that, in general, we cannot expect to solve it in polynomial time. Then, we develop a search method based on the A* algorithm that, under a natural form of Lipschitz continuity of the environment's dynamics, is guaranteed to return the optimal solution to the problem. Experiments on real clinical data show that our method is very efficient in practice, and it has the potential to offer interesting insights for sequential decision making tasks.

continuous state space, counterfactually optimal action sequence, name change, (3 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology: