vi problem
- Asia > Middle East > Jordan (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
- Transportation > Infrastructure & Services (1.00)
- Government (1.00)
- Social Sector (0.68)
In most cases, the game designer is expected to first learn about the agents
We would like to thank all reviewers for reading our paper and providing constructive comments. Sometimes, the primary interest is to understand agent behaviors, and hence only the learning mode is needed. Alternatively, when all game inputs are known, the focus is on the intervention mode. In the final version, we will (i) explain in 2.1 how these We agree that it is neither rigorous nor necessary to assert that "most" Our work is inspired by the current interests on complex optimization-based layers. It is the first to treat VIs as individual layers in the end-to-end framework.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
- Transportation > Infrastructure & Services (1.00)
- Government (1.00)
- Social Sector (0.68)
Solving Hidden Monotone Variational Inequalities with Surrogate Losses
D'Orazio, Ryan, Vucetic, Danilo, Liu, Zichu, Kim, Junhyung Lyle, Mitliagkas, Ioannis, Gidel, Gauthier
Deep learning has proven to be effective in a wide variety of loss minimization problems. However, many applications of interest, like minimizing projected Bellman error and min-max optimization, cannot be modelled as minimizing a scalar loss function but instead correspond to solving a variational inequality (VI) problem. This difference in setting has caused many practical challenges as naive gradient-based approaches from supervised learning tend to diverge and cycle in the VI case. In this work, we propose a principled surrogate-based approach compatible with deep learning to solve VIs. We show that our surrogate-based approach has three main benefits: (1) under assumptions that are realistic in practice (when hidden monotone structure is present, interpolation, and sufficient optimization of the surrogates), it guarantees convergence, (2) it provides a unifying perspective of existing methods, and (3) is amenable to existing deep learning optimizers like ADAM. Experimentally, we demonstrate our surrogate-based approach is effective in min-max optimization and minimizing projected Bellman error. Furthermore, in the deep reinforcement learning case, we propose a novel variant of TD(0) which is more compute and sample efficient.
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Texas > Harris County > Houston (0.04)
- Europe > Russia (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)
Tracking solutions of time-varying variational inequalities
Hadiji, Hédi, Sachs, Sarah, Guzmán, Cristóbal
Tracking the solution of time-varying variational inequalities is an important problem with applications in game theory, optimization, and machine learning. Existing work considers time-varying games or time-varying optimization problems. For strongly convex optimization problems or strongly monotone games, these results provide tracking guarantees under the assumption that the variation of the time-varying problem is restrained, that is, problems with a sublinear solution path. In this work we extend existing results in two ways: In our first result, we provide tracking bounds for (1) variational inequalities with a sublinear solution path but not necessarily monotone functions, and (2) for periodic time-varying variational inequalities that do not necessarily have a sublinear solution path-length. Our second main contribution is an extensive study of the convergence behavior and trajectory of discrete dynamical systems of periodic time-varying VI. We show that these systems can exhibit provably chaotic behavior or can converge to the solution. Finally, we illustrate our theoretical results with experiments.
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Chile (0.04)
- (2 more...)
Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation
Kotsalis, Georgios, Lan, Guanghui, Li, Tianjiao
In this paper we first present a novel operator extrapolation (OE) method for solving deterministic variational inequality (VI) problems. Similar to the gradient (operator) projection method, OE updates one single search sequence by solving a single projection subproblem in each iteration. We show that OE can achieve the optimal rate of convergence for solving a variety of VI problems in a much simpler way than existing approaches. We then introduce the stochastic operator extrapolation (SOE) method and establish its optimal convergence behavior for solving different stochastic VI problems. In particular, SOE achieves the optimal complexity for solving a fundamental problem, i.e., stochastic smooth and strongly monotone VI, for the first time in the literature. We also present a stochastic block operator extrapolations (SBOE) method to further reduce the iteration cost for the OE method applied to large-scale deterministic VIs with a certain block structure. Numerical experiments have been conducted to demonstrate the potential advantages of the proposed algorithms. In fact, all these algorithms are applied to solve generalized monotone variational inequality (GMVI) problems whose operator is not necessarily monotone. We will also discuss optimal OE-based policy evaluation methods for reinforcement learning in a companion paper.
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > New York (0.04)
Model-Based Imitation Learning with Accelerated Convergence
Cheng, Ching-An, Yan, Xinyan, Theodorou, Evangelos, Boots, Byron
Sample efficiency is critical in solving real-world reinforcement learning problems, where agent-environment interactions can be costly. Imitation learning from expert advice has proved to be an effective strategy for reducing the number of interactions required to train a policy. Online imitation learning, a specific type of imitation learning that interleaves policy evaluation and policy optimization, is a particularly effective framework for training policies with provable performance guarantees. In this work, we seek to further accelerate the convergence rate of online imitation learning, making it more sample efficient. We propose two model-based algorithms inspired by Follow-the-Leader (FTL) with prediction: MoBIL-VI based on solving variational inequalities and MoBIL-Prox based on stochastic first-order updates. When a dynamics model is learned online, these algorithms can provably accelerate the best known convergence rate up to an order. Our algorithms can be viewed as a generalization of stochastic Mirror-Prox by Juditsky et al. (2011), and admit a simple constructive FTL-style analysis of performance. The algorithms are also empirically validated in simulation.