vqSGD: Vector Quantized Stochastic Gradient Descent
Gandikota, Venkata; Maity, Raj Kumar; Mazumdar, Arya
For any $c \in \mathbb{R}^d$ and $r > 0$, let $B_d(c, r)$ denote the $d$-dimensional $\ell_2$ ball of radius $r$ centered at $c$. Let $e_i \in \mathbb{R}^d$ denote the $i$-th standard basis vector, which has 1 in the $i$-th position and 0 everywhere else. Also, let $\mathbf{1}_d$ and $\mathbf{0}_d$ denote the all-ones and all-zeros vectors in $\mathbb{R}^d$, respectively. By $[n]$ we denote the set $\{1, 2, \dots, n\}$. For a discrete set of points $C \subset \mathbb{R}^d$, let $\mathrm{conv}(C)$ denote the convex hull of the points in $C$, i.e., $\mathrm{conv}(C) := \{ \sum_{c \in C} a_c c \mid a_c \ge 0, \ \sum_{c \in C} a_c = 1 \}$. Let $w \in \mathbb{R}^d$ be the parameters of a function to be learned (such as the weights of a neural network). In each step of the SGD algorithm, the parameters are updated as $w \leftarrow w - \eta \hat{g}$, where $\eta$ is a possibly time-varying learning rate and $\hat{g}$ is a stochastic unbiased estimate of $g$, the true gradient of some loss function with respect to $w$.
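The notation and update rule above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the helper names (`basis_vector`, `convex_combination`, `sgd_step`, `noisy_gradient`, `run_sgd`), the toy loss $f(w) = \frac{1}{2}\|w\|_2^2$ (whose true gradient is $w$), and the zero-mean Gaussian noise used to produce an unbiased gradient estimate. It does not implement the quantization scheme itself.

```python
import random


def basis_vector(i, d):
    """Return the i-th standard basis vector e_i in R^d (i is 1-indexed)."""
    return [1.0 if j == i - 1 else 0.0 for j in range(d)]


def convex_combination(points, coeffs, tol=1e-9):
    """Return sum_c a_c * c, after checking a_c >= 0 and sum_c a_c = 1.

    Any point produced this way lies in conv(C) for C = points.
    """
    if any(a < 0 for a in coeffs) or abs(sum(coeffs) - 1.0) > tol:
        raise ValueError("coefficients must be nonnegative and sum to 1")
    d = len(points[0])
    return [sum(a * p[j] for a, p in zip(coeffs, points)) for j in range(d)]


def sgd_step(w, g_hat, eta):
    """One SGD update: w <- w - eta * g_hat."""
    return [wj - eta * gj for wj, gj in zip(w, g_hat)]


def noisy_gradient(w, rng, sigma=0.1):
    """Unbiased estimate of the gradient of the toy loss 0.5 * ||w||^2.

    The true gradient is w; adding zero-mean Gaussian noise keeps the
    estimate unbiased (an illustrative choice, not the paper's estimator).
    """
    return [wj + rng.gauss(0.0, sigma) for wj in w]


def run_sgd(w0, steps=200, eta=0.1, seed=0):
    """Run SGD on the toy loss with a fixed learning rate eta."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(steps):
        w = sgd_step(w, noisy_gradient(w, rng), eta)
    return w


if __name__ == "__main__":
    e1, e2 = basis_vector(1, 2), basis_vector(2, 2)
    print(convex_combination([e1, e2], [0.25, 0.75]))
    print(run_sgd([5.0, -3.0]))
```

With this toy loss the iterates contract toward the origin and then fluctuate around it at a scale set by `eta` and `sigma`, which is the behavior one would expect from an unbiased estimate with bounded variance.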
Nov-18-2019
- Country:
- North America > United States
- Massachusetts > Hampshire County > Amherst (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: