AITopics

Country: North America > United States (0.06)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Neural Information Processing Systems Feb-16-2026, 19:26:39 GMT

Policy Gradient for Rectangular Robust Markov Decision Processes

However, policy gradient methods do not account for transition uncertainty, whereas learning robust policies can be computationally expensive.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States (0.06)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Neural Information Processing Systems Feb-16-2026, 06:18:22 GMT

We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both

We argue that these properties are satisfied in many continuous state-action Markov decision processes.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

arXiv.org Artificial Intelligence Oct-15-2025

Mean-Field Games with Constraints

Hu, Anran, Lyu, Zijiu

This paper introduces a framework of Constrained Mean-Field Games (CMFGs), where each agent solves a constrained Markov decision process (CMDP). This formulation captures scenarios in which agents' strategies are subject to feasibility, safety, or regulatory restrictions, thereby extending the scope of classical mean-field game (MFG) models. We first establish the existence of CMFG equilibria under a strict feasibility assumption, and we further show uniqueness under a classical monotonicity condition. To compute equilibria, we develop Constrained Mean-Field Occupation Measure Optimization (CMFOMO), an optimization-based scheme that parameterizes occupation measures, and we show that finding CMFG equilibria is equivalent to solving a single optimization problem with convex constraints and bounded variables. CMFOMO does not rely on uniqueness of the equilibria and can approximate all equilibria with arbitrary accuracy. We further prove that CMFG equilibria induce $O(1 / \sqrt{N})$-Nash equilibria in the associated constrained $N$-player games, thereby extending the classical justification of MFGs as approximations for large but finite systems. Numerical experiments on a modified Susceptible-Infected-Susceptible (SIS) epidemic model with various constraints illustrate the effectiveness and flexibility of the framework.

artificial intelligence, constraint, machine learning, (17 more...)

2510.11843

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.68)
Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Bicer, Osman, Kara, Ali D., Yuksel, Serdar

Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces

arXiv.org Artificial Intelligence Oct-15-2025

In this paper, for Markov decision processes (MDPs) with unbounded state spaces, we refine the upper bounds presented in [Kara et al., JMLR'23] on finite model approximation errors by optimizing the quantizers used for finite model approximations. We also consider the implications of quantizer design for quantized Q-learning and empirical model learning, and for the performance of policies obtained via Q-learning where the quantized state is treated as the state itself. We highlight the distinctions between planning, where approximating MDPs can be independently designed, and learning (either via Q-learning or empirical model learning), where approximating MDPs are restricted to be defined by invariant measures of Markov chains under exploration policies. This restriction leads to significant subtleties in quantizer design performance, even though asymptotic near optimality can be established under both setups. In particular, under Lyapunov growth conditions, we obtain explicit upper bounds which decay to zero as the number of bins approaches infinity.
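As a rough illustration of Q-learning where the quantized state is treated as the state itself, the sketch below runs tabular Q-learning on bin indices of a real-valued state. Everything in it is an assumption made for illustration and is not taken from the paper: the names `make_uniform_quantizer`, `quantized_q_learning`, `toy_step` and `toy_cost`, the toy linear dynamics, the quadratic cost, and the bin range. It uses a plain uniform quantizer with two overflow bins for the unbounded tails, not the optimized quantizers the abstract refers to.

```python
import random


def make_uniform_quantizer(low, high, n_bins):
    """Return a map from a real state to a bin index in {0, ..., n_bins + 1}.

    Bins 1..n_bins partition [low, high) uniformly; bin 0 and bin n_bins + 1
    are overflow bins that absorb the two unbounded tails.
    """
    width = (high - low) / n_bins

    def quantize(x):
        if x < low:
            return 0
        if x >= high:
            return n_bins + 1
        # min() guards against floating-point rounding at the upper edge.
        return 1 + min(n_bins - 1, int((x - low) / width))

    return quantize


def toy_step(x, u, rng):
    """Hypothetical stable linear dynamics with Gaussian noise."""
    return 0.5 * x + u + rng.gauss(0.0, 0.5)


def toy_cost(x, u):
    """Hypothetical quadratic stage cost (to be minimized)."""
    return x * x


def quantized_q_learning(quantize, n_states, actions, step, cost, x0,
                         n_steps=20000, gamma=0.9, seed=0):
    """Tabular Q-learning in which the bin index plays the role of the state."""
    rng = random.Random(seed)
    q = [[0.0] * len(actions) for _ in range(n_states)]
    visits = [[0] * len(actions) for _ in range(n_states)]
    x = x0
    for _ in range(n_steps):
        s = quantize(x)
        a = rng.randrange(len(actions))  # uniform exploration policy
        x_next = step(x, actions[a], rng)
        s_next = quantize(x_next)
        visits[s][a] += 1
        alpha = 1.0 / visits[s][a]  # per-pair decaying step size
        target = cost(x, actions[a]) + gamma * min(q[s_next])
        q[s][a] += alpha * (target - q[s][a])
        x = x_next
    return q


n_bins = 4
quantize = make_uniform_quantizer(-2.0, 2.0, n_bins)
q_table = quantized_q_learning(quantize, n_bins + 2, [-1.0, 0.0, 1.0],
                               toy_step, toy_cost, x0=0.0)
```

Note that the data are generated by a single trajectory under the exploration policy, so how often each bin is updated depends on where that trajectory spends its time. This is one informal way to read the abstract's remark that, in learning, the approximating model is tied to the behaviour of the chain under exploration.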

machine learning, reinforcement learning, theorem 2, (18 more...)

2510.04355

Country: North America (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Neural Information Processing Systems Oct-10-2025, 08:03:29 GMT

We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both

We argue that these properties are satisfied in many continuous state-action Markov decision processes.

experiment, function approximation, inequality, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Neural Information Processing Systems Oct-9-2025, 05:57:31 GMT

Policy Gradient for Rectangular Robust Markov Decision Processes

We provide a closed-form expression for the worst occupation measure.

artificial intelligence, machine learning, optimization problem, (18 more...)

Country: North America > United States (0.06)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Buehrle, Etienne, Stiller, Christoph

Stochastic Optimal Control via Measure Relaxations

arXiv.org Artificial Intelligence Sep-17-2025

The optimal control problem of stochastic systems is commonly solved via robust [2, 21] or scenario-based [7, 19, 17] optimization methods, which are both challenging to scale to long optimization horizons due to their open-loop nature. Dynamic programming formulations [4], while applicable to stochastic systems, typically involve nonconvex optimization problems and do not support specifying the terminal distribution. Polynomial optimization has been proposed for deterministic nonlinear [11] and hybrid systems [16]. We extend the method to stochastic systems using a weak formulation of the Fokker-Planck equation. As a cost function, we propose to use the Christoffel polynomial, which can be estimated from data.

artificial intelligence, cost function, optimization problem, (14 more...)

2508.00886

Country: Europe (0.15)

Genre: Research Report (0.40)

Industry: Energy (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.91)

arXiv.org Artificial Intelligence Sep-15-2024

Risk-Aware Autonomous Driving for Linear Temporal Logic Specifications

Qi, Shuhao, Zhang, Zengjie, Sun, Zhiyong, Haesaert, Sofie

Decision-making for autonomous driving that incorporates different types of risk is a challenging topic. This paper proposes a novel risk metric to facilitate driving tasks specified by linear temporal logic (LTL) by balancing the risks posed by different uncertain events. Such a balance is achieved by discounting the costs of these uncertain events according to their timing and severity, thereby reflecting a human-like awareness of risk. We establish a connection between this risk metric and the occupation measure, a fundamental concept in stochastic reachability problems, such that a risk-aware control synthesis problem under LTL specifications is formulated for autonomous vehicles using occupation measures. As a result, the synthesized policy achieves balanced decisions across different types of risks with associated costs, showcasing advantageous versatility and generalizability. The effectiveness and scalability of the proposed approach are validated on three typical traffic scenarios in the CARLA simulator.
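To make the link between discounted costs and occupation measures concrete, the sketch below computes the discounted occupation measure of a small Markov chain and uses it to evaluate an expected discounted cost as a single weighted sum. This is the standard textbook identity, not the paper's risk metric. The three-state chain (safe, risky, collision), its transition probabilities, the cost vector, and the function name `discounted_occupation` are all hypothetical.

```python
def discounted_occupation(P, mu0, gamma, horizon=500):
    """Truncated discounted occupation measure of a finite Markov chain.

    occ[s] approximates sum over t of gamma**t * Prob(x_t = s), where x_0 is
    drawn from mu0 and P[s][t] is the probability of moving from s to t.
    """
    n = len(mu0)
    dist = list(mu0)   # state distribution at the current time step
    occ = [0.0] * n
    weight = 1.0       # gamma**t
    for _ in range(horizon):
        for s in range(n):
            occ[s] += weight * dist[s]
        # Propagate the distribution one step through the chain.
        dist = [sum(dist[s] * P[s][t] for s in range(n)) for t in range(n)]
        weight *= gamma
    return occ


# Hypothetical chain: state 0 = safe, 1 = risky, 2 = collision (absorbing).
P = [
    [0.9, 0.1, 0.0],
    [0.5, 0.4, 0.1],
    [0.0, 0.0, 1.0],
]
mu0 = [1.0, 0.0, 0.0]      # start in the safe state
gamma = 0.9
cost = [0.0, 1.0, 10.0]    # hypothetical severity of each state

occ = discounted_occupation(P, mu0, gamma)
# Expected discounted cost is linear in the occupation measure.
expected_cost = sum(o * c for o, c in zip(occ, cost))
```

Because the cost is linear in the occupation measure, events that occur later contribute less through the factor `gamma**t`, and more severe events contribute more through `cost`. This is the general mechanism by which timing and severity can both enter an occupation-measure formulation.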

occupation measure, specification, vehicle, (14 more...)

2409.09769

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > Denmark > Central Jutland > Aarhus (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)