xki
Core-Halo Decomposition: Decentralizing Large-Scale Fixed-Point Problems
Haixiang, null, Xu, Yang, Zhang, Jiefu, Wu, Xudong, Zhou, Zihan, He, Jun, Chen, Jiayu
We study solving large-scale fixed-point equation x = F(x) with decomposition. Standard strict decomposition assigns each agent a disjoint block and evaluates updates using only owned coordinates. For most operators, however, a block update may depend on variables outside the block. Truncating these dependencies by strict decomposition changes the mean operator and creates structural bias that cannot be removed by more samples, smaller stepsizes, or additional consensus. We therefore propose Core-Halo decomposition, which separates write ownership from read-only evaluation context: each agent updates its own core and reads from an overlapping halo. By aligning the Core-Halo decomposition with the blockdependence structure of F, the original fixed-point problem can be implemented faithfully in a decentralized multi-agent system. We further characterize the fundamental obstruction faced by strict decomposition through a Bellman closure condition and a blockwise bias lower bound, showing that local-only updates can alter the original fixed-point operator. Finally, we conduct extensive experiments across a range of application settings, and demonstrate that Core-Halo achieves near-centralized performance while retaining the parallelism benefits of decentralization.
327af0f71f7acdfd882774225f04775f-Supplemental.pdf
We will now derive continuous dynamics (2) in the main paper. Let 1m = 1 if class 1 is selected at iteration mand 1m = 0 otherwise. Likewise, we can obtain the dynamics of X2j similarly. We will next prove the separation theorem in binary classification, Theorem 2.1. Given the feature vectors X1i(t), X2j(t) for i,j [n], as t and large n, 1. if ฮฑ > ฮฒ, they are asymptotically separable with probability tending to one, 2. if ฮฑ ฮฒ, they are asymptotically separable with probability tending to zero. This also aligns with our intuition that the intra-class effect should be stronger than its inter-class counterpart. On the other hand, when ฮฑ>ฮฒ, ignoring a null set we may assume c1 >c2 without loss of generality.
ReSync: Riemannian Subgradient-based Robust Rotation Synchronization
This work presents ReSync, a Riemannian subgradient-based algorithm for solving the robust rotation synchronization problem, which arises in various engineering applications. ReSync solves a least-unsquared minimization formulation over the rotation group, which is nonsmooth and nonconvex, and aims at recovering the underlying rotations directly. We provide strong theoretical guarantees for ReSync under the random corruption setting. Specifically, we first show that the initialization procedure of ReSyncyields a proper initial point that lies in a local region around the ground-truth rotations. We next establish the weak sharpness property of the aforementioned formulation and then utilize this property to derive the local linear convergence of ReSyncto the ground-truth rotations. By combining these guarantees, we conclude that ReSync converges linearly to the ground-truth rotations under appropriate conditions. Experiment results demonstrate the effectiveness of ReSync.
Unified Methods for Exploiting Piecewise Linear Structure in Convex Optimization
Tyler B. Johnson, Carlos Guestrin
We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times. By considering a novel problem formulation--the minimization of a sum of piecewise functions--we describe a principled and general mechanism for exploiting piecewise linear structure in convex optimization. This result leads to a theoretically justified working set algorithm and a novel screening test, which generalize and improve upon many prior results on exploiting structure in convex optimization. In empirical comparisons, we study the scalability of our methods. We find that screening scales surprisingly poorly with the size of the problem, while our working set algorithm convincingly outperforms alternative approaches.
FedDR-RandomizedDouglas-RachfordSplittingAlgorithms forNonconvexFederatedCompositeOptimization ATheAnalysisofAlgorithm1: RandomizedCoordinateVariant--FedDR
FedAvg: FedAvg [29] has become a de facto standard federated learning algorithm in practice. However,it has several limitations as discussed in many papers, including [23]. It is also difficult to analyze convergence of FedAvg, especially in the nonconvex case andheterogeneity settings (both statistical andsystem heterogeneity). Moreover,FedAvg originally specifies SGD with a fixed number of epochs and a fixed learning rate as its local solver,making itlessflexible inpractice.