Learning Distributions Generated by One-Layer ReLU Networks
Wu, Shanshan, Dimakis, Alexandros G., Sanghavi, Sujay
–Neural Information Processing Systems
We consider the problem of estimating the parameters of a $d$-dimensional rectified Gaussian distribution from i.i.d. samples. A rectified Gaussian distribution is defined by passing a standard Gaussian distribution through a one-layer ReLU neural network. We give a simple algorithm to estimate the parameters (i.e., the weight matrix and bias vector of the ReLU neural network) up to an error $\epsilon\|\mathbf{W}\|_F$ using $\widetilde{O}(1/\epsilon^2)$ samples and $\widetilde{O}(d^2/\epsilon^2)$ time (log factors are ignored for simplicity). This implies that we can estimate the distribution up to $\epsilon$ in total variation distance using $\widetilde{O}(\kappa^2 d^2/\epsilon^2)$ samples, where $\kappa$ is the condition number of the covariance matrix. Our only assumption is that the bias vector is non-negative.
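The generative model described above can be sketched as follows (a minimal illustration, not the paper's estimation algorithm; function and variable names are our own): samples are drawn as $x = \mathrm{ReLU}(Wz + b)$ with $z$ a standard Gaussian, and the paper's only assumption is that the bias vector $b$ is entrywise non-negative.

```python
import numpy as np

def sample_rectified_gaussian(W, b, n_samples, seed=None):
    """Draw i.i.d. samples x = ReLU(W z + b), where z ~ N(0, I_k).

    W: (d, k) weight matrix of the one-layer ReLU network.
    b: (d,) bias vector, assumed non-negative as in the paper.
    Returns an (n_samples, d) array of rectified Gaussian samples.
    """
    rng = np.random.default_rng(seed)
    d, k = W.shape
    Z = rng.standard_normal((n_samples, k))   # latent standard Gaussian draws
    X = np.maximum(Z @ W.T + b, 0.0)          # push through the one-layer ReLU
    return X

# Illustrative 2-dimensional example with a non-negative bias vector.
W = np.array([[1.0, 0.5],
              [0.0, 2.0]])
b = np.array([0.1, 0.3])
X = sample_rectified_gaussian(W, b, n_samples=1000, seed=0)
```

Every coordinate of each sample is non-negative by construction, which is what makes the distribution "rectified."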