proposition 4
Fast Bellman Updates for Wasserstein Distributionally Robust MDPs
Markov decision processes (MDPs) are often sensitive to model ambiguity. In recent years, robust MDPs have emerged as an effective framework to overcome this challenge. Distributionally robust MDPs extend the robust MDP framework by incorporating distributional information about the uncertain model parameters, which alleviates the conservative nature of robust MDPs.
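As a rough illustration of the robust MDP idea mentioned in the abstract, the sketch below performs one robust Bellman backup in which each state-action pair has a small finite set of candidate transition distributions and the worst case is taken over that set. The finite uncertainty set, the function name `robust_bellman_update`, and its arguments are assumptions made for this example; it is not the Wasserstein distributionally robust update of the paper, whose algorithm is not reproduced here.

```python
def robust_bellman_update(values, rewards, kernels, gamma=0.9):
    """One robust Bellman backup over a finite uncertainty set (illustrative).

    values:  list of current state values, values[s]
    rewards: rewards[s][a] is the immediate reward for action a in state s
    kernels: kernels[s][a] is a list of candidate next-state distributions
    """
    new_values = []
    for s in range(len(values)):
        best = float("-inf")
        for a in range(len(rewards[s])):
            # Adversary picks the candidate distribution with the lowest
            # expected next-state value.
            worst = min(
                sum(p * v for p, v in zip(dist, values))
                for dist in kernels[s][a]
            )
            # Agent picks the action with the best worst-case return.
            best = max(best, rewards[s][a] + gamma * worst)
        new_values.append(best)
    return new_values


values = [0.0, 10.0]
rewards = [[1.0, 0.0], [0.0, 0.0]]
kernels = [
    [[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0]]],
    [[[0.0, 1.0], [1.0, 0.0]], [[0.5, 0.5]]],
]
updated = robust_bellman_update(values, rewards, kernels)
```

Taking the minimum over candidate distributions is what makes plain robust MDPs conservative; distributionally robust variants replace that inner minimum with a worst case over distributions on the model parameters.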
A Derivations of Variance Controlled Diffusion
A.1 Proof of Proposition 4.1
Proposition 4.1. For any bounded measurable function τ(t): [0, T] → ℝ, the following reverse SDEs [...] Eq. (20) is a reverse-time SDE running from T to 0, thus there are two additional minus signs in Eq. (21) before the term [...]
A.2 Two Reparameterizations and Exact Solution under Exponential Integrator
In this subsection, we show the exact solution of the SDE in both the data prediction reparameterization and the noise prediction reparameterization. The noise term in the data prediction reparameterization has smaller variance than the one in the noise prediction reparameterization, implying the necessity of adopting the data prediction reparameterization for the SDE sampler. The computation of the variance uses the Itô isometry, a fundamental property of the Itô integral. Similar to Proposition 4.2, Eq. (37) can be solved analytically, as shown in the following proposition: [...] Following the derivation in Proposition 4.2, the mean of the Itô integral term is: [...]
A.2.4 Comparison between Data and Noise Reparameterizations
In Table 1 we perform an ablation study on the data and noise reparameterizations. The experimental results show that, under the same magnitude of stochasticity, the proposed SA-Solver in the data reparameterization converges better, which leads to better FID results under the same NFEs. In this subsection, we provide a theoretical view of this phenomenon.
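For reference, the Itô isometry mentioned in this excerpt can be stated as follows for a deterministic, square-integrable integrand. The symbols h, s, t and w_u are notation assumed for this note and are not taken from the source equations, which are not reproduced here.

```latex
\mathbb{E}\!\left[\left(\int_{s}^{t} h(u)\,\mathrm{d}w_u\right)^{2}\right]
  = \int_{s}^{t} h(u)^{2}\,\mathrm{d}u,
\qquad
\mathbb{E}\!\left[\int_{s}^{t} h(u)\,\mathrm{d}w_u\right] = 0 .
```

For a deterministic integrand the Itô integral is therefore a zero-mean Gaussian whose variance is the ordinary integral on the right-hand side, which is why the variance of a noise term of this form can be computed in closed form.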
Data-driven Optimal Filtering for Linear Systems with Unknown Noise Covariances
This paper examines learning the optimal filtering policy, known as the Kalman gain, for a linear system with unknown noise covariance matrices using noisy output data. The learning problem is formulated as a stochastic policy optimization problem that aims to minimize the output prediction error. This formulation provides a direct bridge between data-driven optimal control and its dual, optimal filtering.
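The idea of treating the filter gain as a policy and tuning it on output data can be sketched for a scalar system. Everything below is an illustrative assumption rather than the paper's algorithm: the scalar model x_{t+1} = a x_t + w_t, y_t = c x_t + v_t, the helper names, the deadbeat initial gain a / c, and the finite-difference gradient step. The learner uses only a, c and the recorded outputs; the noise variances q and r appear only in the data simulator.

```python
import random


def simulate_outputs(a, c, q, r, steps, seed=0):
    """Generate noisy outputs y_t from x_{t+1} = a x_t + w_t, y_t = c x_t + v_t."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(steps):
        ys.append(c * x + rng.gauss(0.0, r ** 0.5))
        x = a * x + rng.gauss(0.0, q ** 0.5)
    return ys


def prediction_error(gain, a, c, ys):
    """Mean squared one-step output prediction error for a fixed filter gain."""
    x_hat, total = 0.0, 0.0
    for y in ys:
        innovation = y - c * x_hat          # output prediction error at time t
        total += innovation ** 2
        x_hat = a * x_hat + gain * innovation  # one-step-ahead predictor update
    return total / len(ys)


def learn_gain(a, c, ys, lr=0.05, iters=100, eps=1e-3):
    """Gradient descent on the empirical prediction error (finite differences)."""
    gain = a / c  # deadbeat gain: a - gain * c = 0, so the predictor is stable
    for _ in range(iters):
        grad = (prediction_error(gain + eps, a, c, ys)
                - prediction_error(gain - eps, a, c, ys)) / (2 * eps)
        gain -= lr * grad
    return gain


ys = simulate_outputs(a=0.9, c=1.0, q=1.0, r=1.0, steps=3000, seed=1)
learned = learn_gain(0.9, 1.0, ys)
```

For this example (a = 0.9, c = 1, q = r = 1) the steady-state Kalman predictor gain obtained from the Riccati equation is roughly 0.54, so the learned gain can be compared against that value as a sanity check.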