Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine

Chen, Canyi; Qiao, Nan; Zhu, Liping

arXiv.org Machine Learning 

Massive datasets, characterized by both large sample sizes and high-dimensional features, are increasingly prevalent across diverse fields. For example, the 1000 Genomes Project (1000 Genomes Project Consortium et al., 2015) amassed genomic data from 2,504 individuals spanning 26 populations, yielding approximately 12 terabytes of data. Such datasets are often distributed across multiple locations. Pooling the data for centralized statistical analysis is frequently infeasible because of data privacy concerns, memory and storage limitations, and bandwidth constraints. The absence of fusion centers has thus fueled interest in decentralized distributed learning, a paradigm that fully exploits distributed datasets by performing computations locally. This methodology has found successful applications in fields such as personalized medicine, edge computing, smart utilities, and dimension reduction (Li et al., 2011). A fundamental task in these applications is classification. Penalized support vector machines (SVMs) have long been powerful tools for high-dimensional classification, building on the seminal contributions of Boser et al. (1992) and Vapnik (2000). The standard objective function for penalized SVMs combines the hinge loss with a penalty term.
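To make the structure of such an objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes an L1 (lasso-type) penalty, which is one common choice for high-dimensional problems; the text above does not specify the penalty. The names `hinge_loss`, `penalized_svm_objective`, `beta`, and `lam` are hypothetical and chosen only for illustration.

```python
def hinge_loss(margin):
    """Hinge loss max(0, 1 - margin), where margin = y * (x . beta)."""
    return max(0.0, 1.0 - margin)


def penalized_svm_objective(X, y, beta, lam):
    """Average hinge loss plus an L1 penalty (assumed form, for illustration).

    X    : list of feature vectors (lists of floats)
    y    : list of labels in {-1, +1}
    beta : coefficient vector (list of floats)
    lam  : non-negative tuning parameter weighting the penalty
    """
    n = len(y)
    total_loss = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(b * x for b, x in zip(beta, x_i))  # linear score x_i . beta
        total_loss += hinge_loss(y_i * score)
    penalty = lam * sum(abs(b) for b in beta)  # L1 penalty; an assumption here
    return total_loss / n + penalty


# Toy data, invented purely to exercise the function.
X = [[1.0, 2.0], [-1.0, -1.0], [0.5, -0.5]]
y = [1, -1, 1]
beta = [0.5, 0.25]
value = penalized_svm_objective(X, y, beta, lam=0.1)
print(value)
```

The hinge loss is zero for observations classified with a margin of at least one and grows linearly otherwise, while the penalty discourages large coefficients. Because the hinge loss is not differentiable at a margin of one, the resulting objective is non-smooth.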
