stepsize parameter
Adaptive Primal-Dual Splitting Methods for Statistical Learning and Image Processing
Goldstein, Tom, Li, Min, Yuan, Xiaoming
The alternating direction method of multipliers (ADMM) is an important tool for solving complex optimization problems, but it involves minimization sub-steps that are often difficult to solve efficiently. The Primal-Dual Hybrid Gradient (PDHG) method is a powerful alternative that often has simpler sub-steps than ADMM, thus producing lower-complexity solvers. Despite the flexibility of this method, PDHG is often impractical because it requires the careful choice of multiple stepsize parameters, and there is often no intuitive way to choose these parameters to maximize efficiency, or even to achieve convergence. We propose self-adaptive stepsize rules that automatically tune PDHG parameters for optimal convergence. We rigorously analyze our methods and identify convergence rates. Numerical experiments show that adaptive PDHG has strong advantages over non-adaptive methods in terms of both efficiency and simplicity for the user.
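As a concrete illustration of the idea, the sketch below runs PDHG on an $\ell_1$-regularized least-squares problem with a residual-balancing stepsize heuristic. This is a sketch under stated assumptions, not the paper's exact rule: the test problem, the proximal formulas, and the balancing constants (`Delta`, `alpha`, `eta`) are illustrative choices.

```python
import numpy as np

def adaptive_pdhg(A, b, mu, iters=500, Delta=1.5, alpha=0.5, eta=0.95):
    """PDHG for min_x mu*||x||_1 + 0.5*||Ax - b||^2 with a
    residual-balancing stepsize heuristic (illustrative only)."""
    m, n = A.shape
    L = np.linalg.norm(A, 2)          # ||A||_2; PDHG needs tau*sigma*L^2 <= 1
    tau = sigma = 1.0 / L
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x_old, y_old = x, y
        # primal step: prox of tau*mu*||.||_1 is soft-thresholding
        v = x - tau * (A.T @ y)
        x = np.sign(v) * np.maximum(np.abs(v) - tau * mu, 0.0)
        # dual step on g*(y) = 0.5*||y||^2 + <b, y>, the conjugate
        # of g(z) = 0.5*||z - b||^2
        w = y + sigma * (A @ (2 * x - x_old))
        y = (w - sigma * b) / (1.0 + sigma)
        # primal and dual residuals of the optimality system
        p = np.linalg.norm((x_old - x) / tau - A.T @ (y_old - y))
        d = np.linalg.norm((y_old - y) / sigma - A @ (x_old - x))
        # rebalance: shift weight toward the lagging residual, keeping
        # tau*sigma fixed; alpha decays so the adaptation eventually stops
        # (one plausible rule, not necessarily the paper's exact scheme)
        if p > Delta * d:
            tau *= 1.0 - alpha; sigma /= 1.0 - alpha; alpha *= eta
        elif d > Delta * p:
            tau /= 1.0 - alpha; sigma *= 1.0 - alpha; alpha *= eta
    return x
```

Because both stepsizes are rescaled by reciprocal factors, the product $\tau\sigma\|A\|^2 \le 1$ required for convergence is preserved at every adaptation.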
Efficient first-order algorithms for large-scale, non-smooth maximum entropy models with application to wildfire science
Langlois, Gabriel P., Buch, Jatan, Darbon, Jérôme
Maximum entropy (Maxent) models are a class of statistical models that use the maximum entropy principle to estimate probability distributions from data. Due to the size of modern data sets, Maxent models need efficient optimization algorithms to scale well for big data applications. State-of-the-art algorithms for Maxent models, however, were not originally designed to handle big data sets; these algorithms either rely on technical devices that may yield unreliable numerical results, scale poorly, or require smoothness assumptions that many practical Maxent models lack. In this paper, we present novel optimization algorithms that overcome the shortcomings of state-of-the-art algorithms for training large-scale, non-smooth Maxent models. Our proposed first-order algorithms leverage the Kullback-Leibler divergence to train large-scale, non-smooth Maxent models efficiently. For Maxent models with a discrete probability distribution over $n$ elements built from samples, each containing $m$ features, estimating the stepsize parameters and performing each iteration of our algorithms requires $O(mn)$ operations, and both steps can be trivially parallelized. Moreover, the strong convexity of the Kullback-Leibler divergence with respect to the $\ell_{1}$ norm allows for larger stepsize parameters, thereby speeding up the convergence rate of our algorithms. To illustrate the efficiency of our algorithms, we consider the problem of estimating the probability of fire occurrence as a function of ecological features in the Western US MTBS-Interagency wildfire data set. Our numerical results show that our algorithms outperform the state of the art by an order of magnitude and yield results that agree with physical models of wildfire occurrence and previous statistical analyses of wildfire drivers.
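To make the $O(mn)$ per-iteration cost concrete, here is a minimal proximal-gradient baseline for an $\ell_1$-regularized Maxent dual. This is a standard first-order method, not the paper's algorithm, and the names (`F`, `f_bar`, `beta`) are placeholders assumed for illustration.

```python
import numpy as np

def l1_maxent_prox_grad(F, f_bar, beta, iters=1000, step=None):
    """Proximal gradient for the l1-regularized Maxent dual:
        min_lam  log(sum_i exp(F[i] @ lam)) - f_bar @ lam + beta*||lam||_1.
    F: (n, m) feature matrix over n states; f_bar: empirical feature means.
    Each iteration is O(mn): one F @ lam and one F.T @ p."""
    n, m = F.shape
    # the log-sum-exp gradient is Lipschitz; a crude safe step uses ||F||_2^2
    L = np.linalg.norm(F, 2) ** 2
    step = step or 1.0 / L
    lam = np.zeros(m)
    for _ in range(iters):
        s = F @ lam
        s -= s.max()                     # stabilize the exponentials
        p = np.exp(s); p /= p.sum()      # model distribution p_lam (softmax)
        grad = F.T @ p - f_bar           # E_{p_lam}[f] - empirical means
        z = lam - step * grad
        # prox of step*beta*||.||_1: soft-thresholding
        lam = np.sign(z) * np.maximum(np.abs(z) - step * beta, 0.0)
    return lam
```

Both matrix-vector products parallelize trivially across the $n$ states, which is the scaling property the abstract emphasizes; the abstract's point is that exploiting the Kullback-Leibler geometry permits larger steps than the conservative Euclidean bound used in this sketch.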
Adaptive Stochastic Optimization
Curtis, Frank E., Scheinberg, Katya
Optimization lies at the heart of machine learning and signal processing. Contemporary approaches based on the stochastic gradient method are non-adaptive in the sense that their implementation employs prescribed parameter values that need to be tuned for each application. This article summarizes recent research and motivates future work on adaptive stochastic optimization methods, which have the potential to offer significant computational savings when training large-scale systems.
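As one familiar instance of the adaptive methods the article discusses (the article itself surveys a broader class, including stochastic line-search and trust-region schemes), the sketch below shows AdaGrad-style per-coordinate stepsizes, which replace a hand-tuned decay schedule with accumulated gradient statistics; `grad_fn` and the constants are illustrative assumptions.

```python
import numpy as np

def adagrad(grad_fn, x0, base_step=0.1, iters=1000, eps=1e-8):
    """AdaGrad: per-coordinate steps shrink with the accumulated squared
    gradients, so no per-application decay schedule needs to be prescribed."""
    x = np.array(x0, dtype=float)
    g2 = np.zeros_like(x)              # running sum of squared gradients
    for _ in range(iters):
        g = grad_fn(x)                 # (possibly stochastic) gradient estimate
        g2 += g * g
        x -= base_step * g / (np.sqrt(g2) + eps)
    return x
```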