AITopics | accelerated

Error Compensated Distributed SGD Can Be Accelerated

Neural Information Processing Systems | Dec-25-2025, 08:42:10 GMT

Gradient compression is a recent and increasingly popular technique for reducing the communication cost in distributed training of large-scale machine learning models. In this work we focus on developing efficient distributed methods that can work for any compressor satisfying a certain contraction property, which includes both unbiased (after appropriate scaling) and biased compressors such as RandK and TopK. Applied naively, gradient compression introduces errors that either slow down convergence or lead to divergence. A popular technique designed to tackle this issue is error compensation/error feedback. Due to the difficulties associated with analyzing biased compressors, it is not known whether gradient compression with error compensation can be combined with acceleration. In this work, we show for the first time that error compensated gradient compression methods can be accelerated. In particular, we propose and study the error compensated loopless Katyusha method, and establish an accelerated linear convergence rate under standard assumptions. We show through numerical experiments that the proposed method converges with substantially fewer communication rounds than previous error compensated algorithms.
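As a rough illustration of the error compensation idea described in the abstract above, the following sketch applies a Top-K compressor with an error-feedback residual to plain gradient descent on a toy quadratic. It is not the error compensated loopless Katyusha method from the paper. The function names (`top_k`, `error_feedback_step`, `run_demo`), the objective, and the step size are all assumptions chosen for illustration.

```python
def top_k(vec, k):
    """Biased Top-K compressor: keep the k largest-magnitude entries, zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def error_feedback_step(grad, residual, k, lr):
    """One worker-side step: compress the scaled gradient plus the carried-over error."""
    corrected = [lr * g + e for g, e in zip(grad, residual)]
    message = top_k(corrected, k)  # the only part that would be communicated
    # Whatever the compressor dropped is remembered and re-injected next step.
    new_residual = [c - m for c, m in zip(corrected, message)]
    return message, new_residual


def run_demo(steps=200, k=1, lr=0.1):
    """Toy problem (an assumption): minimize f(x) = 0.5 * ||x||^2, whose gradient is x."""
    x = [1.0, -2.0, 3.0, -4.0]
    residual = [0.0] * len(x)
    for _ in range(steps):
        message, residual = error_feedback_step(x, residual, k, lr)
        x = [xi - mi for xi, mi in zip(x, message)]
    return x, residual


if __name__ == "__main__":
    x, residual = run_demo()
    print("final iterate:", x)
    print("final residual:", residual)
```

In this sketch only one coordinate is sent per step, yet no gradient information is discarded permanently: entries that were not sent accumulate in `residual` until they become large enough to be selected.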

accelerated, error compensated, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Accelerated Projected Gradient Algorithms for Sparsity Constrained Optimization Problems

Neural Information Processing Systems | Dec-24-2025, 23:14:10 GMT

We consider the projected gradient algorithm for the nonconvex best subset selection problem that minimizes a given empirical loss function under an $\ell_0$-norm constraint. Through decomposing the feasible set of the given sparsity constraint as a finite union of linear subspaces, we present two acceleration schemes with global convergence guarantees, one by same-space extrapolation and the other by subspace identification. The former fully utilizes the problem structure to greatly accelerate the optimization speed with only negligible additional cost. The latter leads to a two-stage meta-algorithm that first uses classical projected gradient iterations to identify the correct subspace containing an optimal solution, and then switches to a highly-efficient smooth optimization method in the identified subspace to attain superlinear convergence. Experiments demonstrate that the proposed accelerated algorithms are magnitudes faster than their non-accelerated counterparts as well as the state of the art.
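For readers unfamiliar with the baseline that this abstract starts from, here is a minimal sketch of plain (non-accelerated) projected gradient under an $\ell_0$ constraint, where the projection keeps the `s` largest-magnitude entries. It does not implement either acceleration scheme from the paper. The loss $\frac{1}{2}\|x - b\|^2$, the step size, and the names `project_sparse` and `projected_gradient` are assumptions made for illustration only.

```python
def project_sparse(vec, s):
    """Project onto {x : ||x||_0 <= s} by keeping the s largest-magnitude entries."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:s]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def projected_gradient(b, s, step=0.5, iters=50):
    """Projected gradient on the toy loss 0.5 * ||x - b||^2 (an assumed example)."""
    x = [0.0] * len(b)
    for _ in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]           # gradient of the toy loss
        trial = [xi - step * gi for xi, gi in zip(x, grad)]  # unconstrained step
        x = project_sparse(trial, s)                         # back onto the sparse set
    return x


if __name__ == "__main__":
    print(projected_gradient([0.1, -3.0, 0.5, 2.0], s=2))
```

On this toy input the support settles on the two largest entries of `b` after the first iteration, which loosely mirrors the "identify the subspace, then optimize within it" view in the abstract.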

accelerated, gradient algorithm, sparsity constrained optimization problem, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

Neural Information Processing Systems | Dec-24-2025, 01:43:16 GMT

Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of multi-agent interactive behaviors to be trustworthy, behaviors which can be highly nuanced and complex. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simulation and testing. Waymax uses publicly-released, real-world driving data (e.g., the Waymo Open Motion Dataset) to initialize or play back a diverse set of multi-agent simulated scenarios. It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training, making it suitable for modern large-scale, distributed machine learning workflows. To support online training and evaluation, Waymax includes several learned and hard-coded behavior models that allow for realistic interaction within simulation. To supplement Waymax, we benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions, where we highlight the effectiveness of routes as guidance for planning agents and the ability of RL to overfit against simulated agents.

data-driven simulator, large-scale autonomous driving research, waymax, (5 more...)

Neural Information Processing Systems

Industry:

Transportation > Ground > Road (0.66)
Information Technology > Robotics & Automation (0.66)
Automobiles & Trucks (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Error Compensated Distributed SGD Can Be Accelerated

Neural Information Processing Systems | Jan-19-2025, 15:28:56 GMT

Gradient compression is a recent and increasingly popular technique for reducing the communication cost in distributed training of large-scale machine learning models. In this work we focus on developing efficient distributed methods that can work for any compressor satisfying a certain contraction property, which includes both unbiased (after appropriate scaling) and biased compressors such as RandK and TopK. Applied naively, gradient compression introduces errors that either slow down convergence or lead to divergence. A popular technique designed to tackle this issue is error compensation/error feedback. Due to the difficulties associated with analyzing biased compressors, it is not known whether gradient compression with error compensation can be combined with acceleration.

accelerated, error compensated, popular technique, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.83)

Accelerated Projected Gradient Algorithms for Sparsity Constrained Optimization Problems

Neural Information Processing Systems | Jan-18-2025, 12:27:16 GMT

We consider the projected gradient algorithm for the nonconvex best subset selection problem that minimizes a given empirical loss function under an $\ell_0$-norm constraint. Through decomposing the feasible set of the given sparsity constraint as a finite union of linear subspaces, we present two acceleration schemes with global convergence guarantees, one by same-space extrapolation and the other by subspace identification. The former fully utilizes the problem structure to greatly accelerate the optimization speed with only negligible additional cost. The latter leads to a two-stage meta-algorithm that first uses classical projected gradient iterations to identify the correct subspace containing an optimal solution, and then switches to a highly-efficient smooth optimization method in the identified subspace to attain superlinear convergence. Experiments demonstrate that the proposed accelerated algorithms are magnitudes faster than their non-accelerated counterparts as well as the state of the art.

accelerated, gradient algorithm, sparsity constrained optimization problem, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

Neural Information Processing Systems | Oct-10-2024, 02:27:59 GMT

Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of multi-agent interactive behaviors to be trustworthy, behaviors which can be highly nuanced and complex. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simulation and testing. Waymax uses publicly-released, real-world driving data (e.g., the Waymo Open Motion Dataset) to initialize or play back a diverse set of multi-agent simulated scenarios. It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training, making it suitable for modern large-scale, distributed machine learning workflows.

data-driven simulator, large-scale autonomous driving research, waymax, (2 more...)

Neural Information Processing Systems

Industry:

Transportation > Ground > Road (0.64)
Information Technology > Robotics & Automation (0.64)
Automobiles & Trucks (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.97)

An Accelerated Distributed Stochastic Gradient Method with Momentum

Huang, Kun, Pu, Shi, Nedić, Angelia

arXiv.org Artificial Intelligence | Feb-18-2024

In this paper, we introduce an accelerated distributed stochastic gradient method with momentum for solving the distributed optimization problem, where a group of $n$ agents collaboratively minimize the average of the local objective functions over a connected network. The method, termed "Distributed Stochastic Momentum Tracking (DSMT)", is a single-loop algorithm that utilizes the momentum tracking technique as well as the Loopless Chebyshev Acceleration (LCA) method. We show that DSMT can asymptotically achieve convergence rates comparable to those of the centralized stochastic gradient descent (SGD) method under a general variance condition regarding the stochastic gradients. Moreover, the number of iterations (transient times) required for DSMT to achieve such rates behaves as $\mathcal{O}(n^{5/3}/(1-\lambda))$ for minimizing general smooth objective functions, and $\mathcal{O}(\sqrt{n/(1-\lambda)})$ under the Polyak-Łojasiewicz (PL) condition. Here, the term $1-\lambda$ denotes the spectral gap of the mixing matrix related to the underlying network topology. Notably, the obtained results do not rely on multiple inter-node communications or stochastic gradient accumulation per iteration, and, to the best of our knowledge, the transient times are the shortest under this setting.

accelerated, momentum, stochastic gradient method

arXiv.org Artificial Intelligence

2402.09714

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

Gulino, Cole, Fu, Justin, Luo, Wenjie, Tucker, George, Bronstein, Eli, Lu, Yiren, Harb, Jean, Pan, Xinlei, Wang, Yan, Chen, Xiangyu, Co-Reyes, John D., Agarwal, Rishabh, Roelofs, Rebecca, Lu, Yao, Montali, Nico, Mougin, Paul, Yang, Zoey, White, Brandyn, Faust, Aleksandra, McAllister, Rowan, Anguelov, Dragomir, Sapp, Benjamin

arXiv.org Artificial Intelligence | Oct-12-2023

Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simulation and testing. Waymax uses publicly-released, real-world driving data (e.g., the Waymo Open Motion Dataset) to initialize or play back a diverse set of multi-agent simulated scenarios. It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training, making it suitable for modern large-scale, distributed machine learning workflows. To support online training and evaluation, Waymax includes several learned and hard-coded behavior models that allow for realistic interaction within simulation. To supplement Waymax, we benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions, where we highlight the effectiveness of routes as guidance for planning agents and the ability of RL to overfit against simulated agents.

accelerated, data-driven simulator, large-scale autonomous driving research, (1 more...)

arXiv.org Artificial Intelligence

2310.0871

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (0.60)
Information Technology > Robotics & Automation (0.60)
Automobiles & Trucks (0.60)
Education > Educational Setting > Online (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.89)

Accelerated, Optimal, and Parallel: Some Results on Model-Based Stochastic Optimization

Chadha, Karan, Cheng, Gary, Duchi, John C.

arXiv.org Machine Learning | Jan-7-2021

We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving stochastic convex optimization problems, including stochastic subgradient, proximal point, and bundle methods, to the minibatch and accelerated setting. To do so, we propose specific model-based algorithms and an acceleration scheme for which we provide non-asymptotic convergence guarantees, which are order-optimal in all problem-dependent constants and provide linear speedup in minibatch size, while maintaining the desirable robustness traits (e.g. to stepsize) of the aProx family. Additionally, we show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems, whose importance in statistical machine learning is growing; this, for example, gives a parallelization strategy for alternating projections. We corroborate our theoretical results with empirical testing to demonstrate the gains accurate modeling, acceleration, and minibatching provide.

accelerated, dist, iteration, (15 more...)

arXiv.org Machine Learning

2101.02696

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Council Post: How AI And Covid-19 Have Accelerated The Decline Of Human Labor

#artificialintelligence | Oct-28-2020, 02:15:04 GMT

Also a cross-disciplinary scientist, entrepreneur & author, recently relocated from Hong Kong to the rural Seattle area. Eventually, Covid-19 will be beaten -- vaccines and therapies will be found and widely deployed. However, that doesn't mean the jobs that the pandemic has taken are coming back. Of course, some will return. For instance, restaurants will return to in-house dining and hire more waitstaff. But the rethinking and reorganization that Covid-19 has induced will have longer-term impacts.

artificial intelligence, covid-19, machine learning, (8 more...)

#artificialintelligence

Country:

Asia > China > Hong Kong (0.25)
North America > United States > Illinois > Cook County > Chicago (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
Africa > Republic of the Congo (0.05)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.85)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.50)
Information Technology > Artificial Intelligence > Robots (0.31)
