vqSGD: Vector Quantized Stochastic Gradient Descent

Gandikota, Venkata; Maity, Raj Kumar; Mazumdar, Arya

Nov-18-2019–arXiv.org Machine Learning

For any $c \in \mathbb{R}^d$ and $r > 0$, let $B_d(c, r)$ denote a $d$-dimensional $\ell_2$ ball of radius $r$ centered at $c$. Let $e_i \in \mathbb{R}^d$ denote the $i$-th standard basis vector, which has 1 in the $i$-th position and 0 everywhere else. Also, let $1_d$ and $0_d$ denote the all-ones vector and the all-zeros vector in $\mathbb{R}^d$, respectively. By $[n]$ we denote the set $\{1, 2, \ldots, n\}$. For a discrete set of points $C \subset \mathbb{R}^d$, let $\mathrm{conv}(C)$ denote the convex hull of the points in $C$, i.e., $\mathrm{conv}(C) := \{ \sum_{c \in C} a_c c \mid a_c \ge 0, \ \sum_{c \in C} a_c = 1 \}$.

Suppose $w \in \mathbb{R}^d$ are the parameters of a function to be learned (such as the weights of a neural network). In each step of the SGD algorithm, the parameters are updated as $w \leftarrow w - \eta \hat{g}$, where $\eta$ is a possibly time-varying learning rate and $\hat{g}$ is a stochastic unbiased estimate of $g$, the true gradient of some loss function with respect to $w$.
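The two definitions above fit together: if the gradient g lies in conv(C), its convex coefficients a_c form a probability distribution over C, and a point sampled from that distribution is an unbiased estimate of g. The following is a minimal sketch of that idea, not a statement of the paper's scheme. The point set C = {+sqrt(d) e_i, -sqrt(d) e_i} and the names `convex_coeffs`, `expected_point`, `sample_estimate` and `sgd_step` are assumptions made for illustration.

```python
import math
import random


def convex_coeffs(g):
    """Write g as a convex combination of C = {+sqrt(d) e_i, -sqrt(d) e_i}.

    ASSUMPTION: this choice of C is illustrative; the excerpt does not fix C.
    Returns a list of (index, sign, a_c) with a_c >= 0 summing to 1.
    """
    d = len(g)
    scale = math.sqrt(d)
    l1 = sum(abs(x) for x in g)
    if l1 > scale + 1e-12:
        raise ValueError("g is outside conv(C): need ||g||_1 <= sqrt(d)")
    # Mass not needed to represent g is spread evenly over all 2d points;
    # the +/- pairs cancel, so it does not change the mean.
    slack = max(0.0, 1.0 - l1 / scale) / (2 * d)
    coeffs = []
    for i, x in enumerate(g):
        for sign in (1, -1):
            a = slack + (abs(x) / scale if x * sign > 0 else 0.0)
            coeffs.append((i, sign, a))
    return coeffs


def expected_point(coeffs, d):
    """Exact mean sum_c a_c * c of the sampling distribution."""
    mean = [0.0] * d
    for i, sign, a in coeffs:
        mean[i] += a * sign * math.sqrt(d)
    return mean


def sample_estimate(g, rng):
    """Draw one point c in C with probability a_c; E[c] = g."""
    d = len(g)
    coeffs = convex_coeffs(g)
    i, sign, _ = rng.choices(coeffs, weights=[a for _, _, a in coeffs], k=1)[0]
    g_hat = [0.0] * d
    g_hat[i] = sign * math.sqrt(d)
    return g_hat


def sgd_step(w, g_hat, eta):
    """One SGD update: w <- w - eta * g_hat."""
    return [wi - eta * gi for wi, gi in zip(w, g_hat)]


if __name__ == "__main__":
    rng = random.Random(0)
    g = [0.3, -0.4, 0.0]
    w = [1.0, 1.0, 1.0]
    w = sgd_step(w, sample_estimate(g, rng), eta=0.1)
    print(w)
```

Each sampled estimate has a single nonzero coordinate, so it can be described by an index and a sign. The price is higher variance than the exact gradient, which this sketch does not analyze.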

Keywords: communication, gradient, quantization scheme

Country:
- North America > United States
  - Massachusetts > Hampshire County > Amherst (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)
