AITopics | pdmm

We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications: robust principal component analysis and overlapping group lasso.
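
As a rough illustration of the problem class this abstract describes, a block-separable convex objective coupled by a linear constraint can be written as below. The symbols $J$, $f_j$, $x_j$, $A_j$ and $a$ are notation assumed here for illustration only; the abstract does not fix any notation.

```latex
\min_{x_1, \dots, x_J} \; \sum_{j=1}^{J} f_j(x_j)
\quad \text{subject to} \quad
\sum_{j=1}^{J} A_j x_j = a
```

Under this assumed notation, the two-block case $J = 2$ is the setting in which ADMM is usually studied, and $J > 2$ is the multi-block case the abstract refers to.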

admm, convergence, pdmm, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Parallel Direction Method of Multipliers

Neural Information Processing Systems | Sep-30-2025, 09:27:45 GMT

We consider the problem of minimizing block-separable convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications: robust principal component analysis and overlapping group lasso.

multiplier, name change, parallel direction method, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Byzantine-Resilient Federated Learning via Distributed Optimization

Xia, Yufei, Yu, Wenrui, Li, Qiongxiu

arXiv.org Artificial Intelligence | Mar-13-2025

Byzantine attacks present a critical challenge to Federated Learning (FL), where malicious participants can disrupt the training process, degrade model accuracy, and compromise system reliability. Traditional FL frameworks typically rely on aggregation-based protocols for model updates, leaving them vulnerable to sophisticated adversarial strategies. In this paper, we demonstrate that distributed optimization offers a principled and robust alternative to aggregation-centric methods. Specifically, we show that the Primal-Dual Method of Multipliers (PDMM) inherently mitigates Byzantine impacts by leveraging its fault-tolerant consensus mechanism. Through extensive experiments on three datasets (MNIST, FashionMNIST, and Olivetti), under various attack scenarios including bit-flipping and Gaussian noise injection, we validate the superior resilience of distributed optimization protocols. Compared to traditional aggregation-centric approaches, PDMM achieves higher model utility, faster convergence, and improved stability. Our results highlight the effectiveness of distributed optimization in defending against Byzantine threats, paving the way for more secure and resilient federated learning systems.

byzantine attack, pdmm, protocol, (17 more...)

arXiv.org Artificial Intelligence

2503.10792

Country:

Europe > Denmark > North Jutland > Aalborg (0.05)
Europe > France (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Parallel Direction Method of Multipliers

Huahua Wang, Arindam Banerjee, Zhi-Quan Luo

Neural Information Processing Systems | Feb-9-2025, 01:21:38 GMT

We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications: robust principal component analysis and overlapping group lasso.

artificial intelligence, machine learning, pdmm, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

AttentionX: Exploiting Consensus Discrepancy In Attention from A Distributed Optimization Perspective

Zhang, Guoqiang, Heusdens, Richard

arXiv.org Artificial Intelligence | Sep-9-2024

In this paper, we extend the standard Attention in the transformer by exploiting the consensus discrepancy from a distributed optimization perspective; the result is referred to as AttentionX. It is noted that the primal-dual method of multipliers (PDMM) \cite{Zhang16PDMM} is designed to iteratively solve a broad class of distributed optimization problems over a peer-to-peer (P2P) network, where neighbouring nodes gradually reach consensus as specified by predefined linear edge-constraints in the optimization process. In particular, at each iteration of PDMM, each node in a network first performs information-gathering from neighbours and then performs local information-fusion. From a high-level point of view, the $KQ$-softmax-based weighted summation of $V$-representations in Attention corresponds to information-gathering from neighbours, while the feature-processing via the feed-forward network (FFN) in the transformer corresponds to local information-fusion. PDMM exploits the Lagrangian multipliers to capture the historical consensus discrepancy in the form of residual errors of the linear edge-constraints, which plays a crucial role in the convergence of the algorithm. Inspired by PDMM, we propose AttentionX to incorporate the consensus discrepancy in the output update-expression of the standard Attention. The consensus discrepancy in AttentionX refers to the difference between the weighted summation of $V$-representations and the scaled $V$-representations themselves. Experiments on ViT and nanoGPT show promising performance.
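
The consensus discrepancy described in this abstract (weighted summation of $V$-representations minus scaled $V$-representations) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scalar `gamma`, the function names, and the use of single-head self-attention with $1/\sqrt{d}$ scaling are hypothetical choices made here, and the abstract does not say how AttentionX combines the discrepancy with the attention output.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(Q, K):
    """Row-wise softmax of scaled dot products between queries and keys."""
    d = len(Q[0])
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K])
        for q in Q
    ]


def weighted_sum(W, V):
    """Output row i is sum_j W[i][j] * V[j] (the 'information-gathering' step)."""
    cols = len(V[0])
    return [
        [sum(w * v[c] for w, v in zip(w_row, V)) for c in range(cols)]
        for w_row in W
    ]


def consensus_discrepancy(Q, K, V, gamma=1.0):
    """Weighted summation of V minus gamma * V, per token.

    gamma is a hypothetical scaling factor; Q, K and V must have the
    same number of rows (self-attention over one sequence).
    """
    gathered = weighted_sum(attention_weights(Q, K), V)
    return [
        [g - gamma * v for g, v in zip(g_row, v_row)]
        for g_row, v_row in zip(gathered, V)
    ]
```

With `gamma=1.0`, the discrepancy vanishes whenever all rows of `V` are identical, since the attention weights in each row sum to one. That matches the intuition that tokens already in consensus produce no residual.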

attentionx, consensus discrepancy, discrepancy, (14 more...)

arXiv.org Artificial Intelligence

2409.04275

Country:

Europe > Netherlands (0.04)
Europe > United Kingdom > England > Devon > Exeter (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Parallel Direction Method of Multipliers

Neural Information Processing Systems | Mar-13-2024, 08:01:25 GMT

We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications: robust principal component analysis and overlapping group lasso.

admm, convergence, pdmm, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Parallel Direction Method of Multipliers

Wang, Huahua, Banerjee, Arindam, Luo, Zhi-Quan

Neural Information Processing Systems | Feb-14-2020, 05:11:09 GMT

We consider the problem of minimizing block-separable convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size.

linear constraint, multiplier, parallel direction method, (3 more...)

Neural Information Processing Systems

Genre: Research Report (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Parallel Direction Method of Multipliers

Wang, Huahua, Banerjee, Arindam, Luo, Zhi-Quan

Neural Information Processing Systems | Dec-31-2014

We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications: robust principal component analysis and overlapping group lasso.

artificial intelligence, machine learning, pdmm, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.68)

Technology: