tangent space
Functional Natural Policy Gradients
Bibaut, Aurelien, Zenati, Houssam, Rahier, Thibaud, Kallus, Nathan
Personalized decision policies are increasingly central in areas such as healthcare [Bertsimas et al., 2017], education [Mandel et al., 2014], and public policy [Kube et al., 2019], where tailoring actions to individual characteristics can improve outcomes. In many of these settings, however, actively experimenting with new policies to generate "online data" is expensive, risky, or infeasible, which motivates methods that can evaluate and optimize policies using pre-existing "offline data." A variety of work studies semiparametric efficient estimation of the value of a fixed policy from offline data [Chernozhukov et al., 2018, Dudík et al., 2011, Jiang and Li, 2016, Kallus and Uehara, 2020, 2022, Kallus et al., 2022, Scharfstein et al., 1999]. Another line of work considers selecting the policy that optimizes such estimates over policies in a given class [Athey and Wager, 2021, Chernozhukov et al., 2019, Foster and Syrgkanis, 2023, Kallus, 2021, Zhang et al., 2013, Zhou et al., 2023], which generally yields regret rates that scale with policy class complexity, e.g., $O_P(N^{-1/2})$ for VC classes. Luedtke and Chambaz [2020] obtain regret acceleration to $o_P(N^{-1/2})$ by leveraging an equicontinuity argument.
Inversion-Free Natural Gradient Descent on Riemannian Manifolds
Draca, Dario, Matsubara, Takuo, Tran, Minh-Ngoc
The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions whose parameters lie on a Riemannian manifold. The manifold setting offers several advantages: one can implicitly enforce parameter constraints such as positive definiteness and orthogonality, ensure parameters are identifiable, or guarantee regularity properties of the objective like geodesic convexity. Building on an intrinsic formulation of the Fisher information matrix (FIM) on a manifold, our method maintains an online approximation of the inverse FIM, which is efficiently updated at quadratic cost using score vectors sampled at successive iterates. In the Riemannian setting, these score vectors belong to different tangent spaces and must be combined using transport operations. We prove almost-sure convergence rates of $O(\log{s}/s^α)$ for the squared distance to the minimizer when the step size exponent $α>2/3$. We also establish almost-sure rates for the approximate FIM, which now accumulates transport-based errors. A limited-memory variant of the algorithm with sub-quadratic storage complexity is proposed. Finally, we demonstrate the effectiveness of our method relative to its Euclidean counterparts on variational Bayes with Gaussian approximations and normalizing flows.
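To make the "inversion-free" idea concrete, here is a minimal sketch in the plain Euclidean case only. It is not the authors' algorithm: it ignores the manifold structure and the transport operations the abstract describes, and it uses the standard Sherman-Morrison rank-one identity as one plausible way to update an inverse at quadratic cost. The function name `sherman_morrison_update`, the regularizer `eps`, and the random stand-in score vectors are all assumptions made for illustration.

```python
import numpy as np


def sherman_morrison_update(H, g):
    """Given symmetric H = A^{-1}, return (A + g g^T)^{-1} in O(d^2) time.

    Hypothetical helper: uses the Sherman-Morrison identity, so no matrix
    inversion or linear solve is performed.
    """
    Hg = H @ g
    return H - np.outer(Hg, Hg) / (1.0 + g @ Hg)


rng = np.random.default_rng(0)
d, n_steps, eps = 5, 200, 1.0

# Stand-in "score vectors"; in practice these would be sampled gradients
# of the log-density at successive iterates (an assumption of this sketch).
scores = rng.normal(size=(n_steps, d))

# Start from the inverse of eps * I and fold in one score vector at a time.
H = np.eye(d) / eps
for g in scores:
    H = sherman_morrison_update(H, g)

# Reference value computed with an explicit inversion, for comparison only.
direct = np.linalg.inv(eps * np.eye(d) + scores.T @ scores)
max_err = np.max(np.abs(H - direct))
```

Here `H` tracks the inverse of the regularized sum of score outer products, so multiplying it by the number of steps gives an approximate inverse of the averaged outer-product matrix. On a manifold, the score vectors live in different tangent spaces, which is why the paper needs transport operations that this Euclidean sketch does not model.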