Decentralized Proximal Stochastic Gradient Langevin Dynamics

Islam, Mohammad Rafiqul, Zhu, Lingjiong

arXiv.org Machine Learning 

Decentralized learning is a learning process in which data is distributed across computational agents or collected by individual agents, and model parameters are computed as the consensus of the agents. It has gained a lot of interest for applications where agents can collaboratively learn a predictive model without sharing their own data, but sharing only their local models with their immediate neighbors to generate a global model [He et al., 2018, Hendrikx et al., 2019, Arjevani et al., 2020]. We assume there are N agents who are connected over an undirected communication network G = (V,E) where V = {1,...,N} represents the agents and E V V denotes the set of edges; i.e., if agent i and j are connected then (i,j) E implies (j,i) E. Suppose we have a collection of n independent and identically distributed (i.i.d.) data pairs zi = (ai,yi), where ai Rp is the feature vector and yi the label or response of the i-th observation. Let Z = [z1,z2,,zn] Rnp be sampled from the distribution p(Z|x) where the parameter x Rd has a common prior. The goal is to sample from the posterior distribution p(x|Z) p(Z|x)p(x) by distributing Z among N agents such that Zi = {zi1,zi2,,zini} is the subset of data exclusive to agent i.