dgf
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- North America > United States > Texas (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > Texas (0.05)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > Canada (0.04)
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization
Zhang, Dinghuai, Chen, Ricky T. Q., Liu, Cheng-Hao, Courville, Aaron, Bengio, Yoshua
We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajectories to compute, resulting in sluggish credit assignment issues due to use of entire trajectories and a learning signal present only at the terminal time. In this work, we present Diffusion Generative Flow Samplers (DGFS), a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments, via parameterizing an additional "flow function". Our method takes inspiration from the theory developed for generative flow networks (GFlowNets), allowing us to make use of intermediate learning signals. Through various challenging experiments, we demonstrate that DGFS achieves more accurate estimates of the normalization constant than closely-related prior methods.
- North America > United States (0.14)
- North America > Canada > Quebec > Montreal (0.14)
- Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
- (2 more...)
- Research Report (0.64)
- Instructional Material (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Better Regularization for Sequential Decision Spaces: Fast Convergence Rates for Nash, Correlated, and Team Equilibria
Farina, Gabriele, Kroer, Christian, Sandholm, Tuomas
We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games. First-order methods must typically be instantiated with a regularizer that serves as a distance-generating function for the decision sets of the players. For the case of two-player zero-sum games, the state-of-the-art theoretical convergence rate for Nash equilibrium is achieved by using the dilated entropy function. In this paper, we introduce a new entropy-based distance-generating function for two-player zero-sum games, and show that this function achieves significantly better strong convexity properties than the dilated entropy, while maintaining the same easily-implemented closed-form proximal mapping. Extensive numerical simulations show that these superior theoretical properties translate into better numerical performance as well. We then generalize our new entropy distance function, as well as general dilated distance functions, to the scaled extension operator. The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria. By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibra and ex-ante coordinated team equilibria. Our methods have a guaranteed $1/T$ rate of convergence, along with linear-time proximal updates.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Texas (0.04)
- North America > United States > New York > Richmond County > New York City (0.04)
- (6 more...)
Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Kroer, Christian, Waugh, Kevin, Kilinc-Karzan, Fatma, Sandholm, Tuomas
Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate the acceleration of first-order methods for solving extensive-form games through better design of the dilated entropy function---a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has no dependence on the branching factor of the player. This result improves the convergence rate of several first-order methods by a factor of $\Omega(b^dd)$, where $b$ is the branching factor of the player, and $d$ is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than first-order methods despite their theoretically inferior convergence rates. Using our new weighting scheme and practical tuning we show that, for the first time, the excessive gap technique can be made faster than the fastest counterfactual regret minimization algorithm, CFR+, in practice.
- North America > Canada > Alberta (0.14)
- North America > United States > Texas (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (3 more...)
Constructing Conditional Plans by a Theorem-Prover
The research on conditional planning rejects the assumptions that there is no uncertainty or incompleteness of knowledge with respect to the state and changes of the system the plans operate on. Without these assumptions the sequences of operations that achieve the goals depend on the initial state and the outcomes of nondeterministic changes in the system. This setting raises the questions of how to represent the plans and how to perform plan search. The answers are quite different from those in the simpler classical framework. In this paper, we approach conditional planning from a new viewpoint that is motivated by the use of satisfiability algorithms in classical planning. Translating conditional planning to formulae in the propositional logic is not feasible because of inherent computational limitations. Instead, we translate conditional planning to quantified Boolean formulae. We discuss three formalizations of conditional planning as quantified Boolean formulae, and present experimental results obtained with a theorem-prover.
- North America > United States (0.14)
- Oceania > Nauru > Aiwo Constituency > Aiwo District (0.04)
- Asia > Middle East > UAE (0.04)