AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.50)

Neural Information Processing SystemsFeb-9-2026, 05:45:48 GMT

Supplementary Material for A polynomial time algorithm for learning

This same algorithm can then be used to reconstruct the true DAGGfrom the true ordering . Once the ordering is known, existing nonlinear variable selection methods [4, 11, 16, 25, 28, 46] suffice to learn the parent setspa(j)and hence the graphG. In our experiments, we use exactly this procedure to learnGfrom the order, based on the data. There are two cases: (i)Bj =, and (ii)Bj 6= . If instead we haveσ23 = var(z3) = 1/3, the condition would be violated.

artificial intelligence, evar, machine learning, (16 more...)

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Neural Information Processing SystemsFeb-9-2026, 05:45:41 GMT

85c9f9efab89cee90a95cb98f15feacd-Paper.pdf

Instead, we impose a condition on the residual variances that is closely related to previous work on linear models with equal variances.

artificial intelligence, estimator, machine learning, (15 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Canary Islands (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsDec-26-2025, 11:32:41 GMT

On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes

Optimizing static risk-averse objectives in Markov decision processes is difficult because they do not admit standard dynamic programming equations common in Reinforcement Learning (RL) algorithms. Dynamic programming decompositions that augment the state space with discrete risk levels have recently gained popularity in the RL community. Prior work has shown that these decompositions are optimal when the risk level is discretized sufficiently. However, we show that these popular decompositions for Conditional-Value-at-Risk (CVaR) and Entropic-Value-at-Risk (EVaR) are inherently suboptimal regardless of the discretization level. In particular, we show that a saddle point property assumed to hold in prior literature may be violated. However, a decomposition does hold for Value-at-Risk and our proof demonstrates how this risk measure differs from CVaR and EVaR. Our findings are significant because risk-averse algorithms are used in high-stake environments, making their correctness much more critical.

dynamic programming decomposition, name change, static risk measure, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Dold, Simon, Helmert, Malte, Nordström, Jakob, Röger, Gabriele, Schindler, Tanja

Pseudo-Boolean Proof Logging for Optimal Classical Planning

arXiv.org Artificial IntelligenceMay-6-2025

We introduce lower-bound certificates for classical planning tasks, which can be used to prove the unsolvability of a task or the optimality of a plan in a way that can be verified by an independent third party. We describe a general framework for generating lower-bound certificates based on pseudo-Boolean constraints, which is agnostic to the planning algorithm used. As a case study, we show how to modify the $A^{*}$ algorithm to produce proofs of optimality with modest overhead, using pattern database heuristics and $h^\textit{max}$ as concrete examples. The same proof logging approach works for any heuristic whose inferences can be efficiently expressed as reasoning over pseudo-Boolean constraints.

artificial intelligence, constraint, planning & scheduling, (18 more...)

2504.18443

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Neural Information Processing SystemsJan-19-2025, 17:43:13 GMT

On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes

dynamic programming decomposition, markov decision process, static risk measure, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Su, Xihong, Petrik, Marek, Grand-Clément, Julien

Stationary Policies are Optimal in Risk-averse Total-reward MDPs with EVaR

arXiv.org Artificial IntelligenceAug-30-2024

Optimizing risk-averse objectives in discounted MDPs is challenging because most models do not admit direct dynamic programming equations and require complex history-dependent policies. In this paper, we show that the risk-averse {\em total reward criterion}, under the Entropic Risk Measure (ERM) and Entropic Value at Risk (EVaR) risk measures, can be optimized by a stationary policy, making it simple to analyze, interpret, and deploy. We propose exponential value iteration, policy iteration, and linear programming to compute optimal policies. In comparison with prior work, our results only require the relatively mild condition of transient MDPs and allow for {\em both} positive and negative rewards. Our results indicate that the total reward criterion may be preferable to the discounted criterion in a broad range of risk-averse reinforcement learning domains.

artificial intelligence, machine learning, objective, (19 more...)

2408.17286

Country:

North America > United States > New Hampshire (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Hau, Jia Lin, Delage, Erick, Ghavamzadeh, Mohammad, Petrik, Marek

On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes

arXiv.org Artificial IntelligenceDec-13-2023

Optimizing static risk-averse objectives in Markov decision processes is difficult because they do not admit standard dynamic programming equations common in Reinforcement Learning (RL) algorithms. Dynamic programming decompositions that augment the state space with discrete risk levels have recently gained popularity in the RL community. Prior work has shown that these decompositions are optimal when the risk level is discretized sufficiently. However, we show that these popular decompositions for Conditional-Value-at-Risk (CVaR) and Entropic-Value-at-Risk (EVaR) are inherently suboptimal regardless of the discretization level. In particular, we show that a saddle point property assumed to hold in prior literature may be violated. However, a decomposition does hold for Value-at-Risk and our proof demonstrates how this risk measure differs from CVaR and EVaR. Our findings are significant because risk-averse algorithms are used in high-stakes environments, making their correctness much more critical.

cvar, decomposition, risk measure, (15 more...)

2304.12477

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > New Hampshire (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.67)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Ahmadi, Mohamadreza, Rosolia, Ugo, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.

Risk-Averse Decision Making Under Uncertainty

arXiv.org Artificial IntelligenceSep-9-2021

A large class of decision making under uncertainty problems can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with application to artificial intelligence and operations research, among others. Traditionally, policy synthesis techniques are proposed such that a total expected cost or reward is minimized or maximized. However, optimality in the total expected cost sense is only reasonable if system behavior in the large number of runs is of interest, which has limited the use of such policies in practical mission-critical scenarios, wherein large deviations from the expected behavior may lead to mission failure. In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures, which we refer to as the constrained risk-averse problem. For MDPs, we reformulate the problem into a infsup problem via the Lagrangian framework and propose an optimization-based method to synthesize Markovian policies. For MDPs, we demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. For POMDPs, we show that, if the coherent risk measures can be defined as a Markov risk transition mapping, an infinite-dimensional optimization can be used to design Markovian belief-based policies. For stochastic finite-state controllers (FSCs), we show that the latter optimization simplifies to a (finite-dimensional) DCP and can be solved by the DCCP framework. We incorporate these DCPs in a policy iteration algorithm to design risk-averse FSCs for POMDPs.

coherent risk measure, constraint, risk measure, (15 more...)

2109.04082

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Ahmadi, Mohamadreza, Rosolia, Ugo, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.

Constrained Risk-Averse Markov Decision Processes

arXiv.org Artificial IntelligenceDec-4-2020

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.

artificial intelligence, optimization problem, risk measure, (16 more...)

2012.02423

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (0.48)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)