interpolation condition
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (5 more...)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- Europe > Russia (0.04)
- (3 more...)
Interpolation Conditions for Data Consistency and Prediction in Noisy Linear Systems
Vanelli, Martina, Monshizadeh, Nima, Hendrickx, Julien M.
Abstract-- We develop an interpolation-based framework for noisy linear systems with unknown system matrix with bounde d norm (implying bounded growth or non-increasing energy), and bounded process noise energy. The proposed approach characterizes all trajectories consistent with the measur ed data and these prior bounds in a purely data-driven manner . This characterization enables data-consistency verification, inference, and one-step-ahead prediction, which can be leverage d for safety verification and cost minimization. Ultimately, thi s work represents a preliminary step toward exploiting interpola tion conditions in data-driven control, offering a systematic w ay to characterize trajectories consistent with a dynamical sys tem within a given class and enabling their use in control design . Data-driven control has become a crucial aspect of modern control theory, offering powerful tools for system analysis and design [1].
- Europe > Netherlands (0.04)
- Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (6 more...)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > New York (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- Europe > Russia (0.04)
- (3 more...)
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
Jiang, Xiaowen, Stich, Sebastian U.
The recently proposed stochastic Polyak stepsize (SPS) and stochastic line-search (SLS) for SGD have shown remarkable effectiveness when training over-parameterized models. However, in non-interpolation settings, both algorithms only guarantee convergence to a neighborhood of a solution which may result in a worse output than the initial guess. While artificially decreasing the adaptive stepsize has been proposed to address this issue (Orvieto et al. [2022]), this approach results in slower convergence rates for convex and over-parameterized models. In this work, we make two contributions: Firstly, we propose two new variants of SPS and SLS, called AdaSPS and AdaSLS, which guarantee convergence in non-interpolation settings and maintain sub-linear and linear convergence rates for convex and strongly convex functions when training over-parameterized models. AdaSLS requires no knowledge of problem-dependent parameters, and AdaSPS requires only a lower bound of the optimal function value as input. Secondly, we equip AdaSPS and AdaSLS with a novel variance reduction technique and obtain algorithms that require $\smash{\widetilde{\mathcal{O}}}(n+1/\epsilon)$ gradient evaluations to achieve an $\mathcal{O}(\epsilon)$-suboptimality for convex functions, which improves upon the slower $\mathcal{O}(1/\epsilon^2)$ rates of AdaSPS and AdaSLS without variance reduction in the non-interpolation regimes. Moreover, our result matches the fast rates of AdaSVRG but removes the inner-outer-loop structure, which is easier to implement and analyze. Finally, numerical experiments on synthetic and real datasets validate our theory and demonstrate the effectiveness and robustness of our algorithms.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- North America > United States > Virginia (0.04)
- (5 more...)
Adaptive Step-Size Methods for Compressed SGD
Subramaniam, Adarsh M., Magesh, Akshayaa, Veeravalli, Venugopal V.
Compressed Stochastic Gradient Descent (SGD) algorithms have been recently proposed to address the communication bottleneck in distributed and decentralized optimization problems, such as those that arise in federated machine learning. Existing compressed SGD algorithms assume the use of non-adaptive step-sizes(constant or diminishing) to provide theoretical convergence guarantees. Typically, the step-sizes are fine-tuned in practice to the dataset and the learning algorithm to provide good empirical performance. Such fine-tuning might be impractical in many learning scenarios, and it is therefore of interest to study compressed SGD using adaptive step-sizes. Motivated by prior work on adaptive step-size methods for SGD to train neural networks efficiently in the uncompressed setting, we develop an adaptive step-size method for compressed SGD. In particular, we introduce a scaling technique for the descent step in compressed SGD, which we use to establish order-optimal convergence rates for convex-smooth and strong convex-smooth objectives under an interpolation condition and for non-convex objectives under a strong growth condition. We also show through simulation examples that without this scaling, the algorithm can fail to converge. We present experimental results on deep neural networks for real-world datasets, and compare the performance of our proposed algorithm with previously proposed compressed SGD methods in literature, and demonstrate improved performance on ResNet-18, ResNet-34 and DenseNet architectures for CIFAR-100 and CIFAR-10 datasets at various levels of compression.
- North America > United States > Illinois (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums
Zhou, Kaiwen, Tian, Lai, So, Anthony Man-Cho, Cheng, James
The problem of finding near-stationary points in convex optimization has not been adequately studied yet, unlike other optimality measures such as minimizing function value. Even in the deterministic case, the optimal method (OGM-G, due to Kim and Fessler (2021)) has just been discovered recently. In this work, we conduct a systematic study of the algorithmic techniques in finding near-stationary points of convex finite-sums. Our main contributions are several algorithmic discoveries: (1) we discover a memory-saving variant of OGM-G based on the performance estimation problem approach (Drori and Teboulle, 2014); (2) we design a new accelerated SVRG variant that can simultaneously achieve fast rates for both minimizing gradient norm and function value; (3) we propose an adaptively regularized accelerated SVRG variant, which does not require the knowledge of some unknown initial constants and achieves near-optimal complexities. We put an emphasis on the simplicity and practicality of the new schemes, which could facilitate future developments.