Near Optimal Sketching of Low-Rank Tensor Regression

Xingguo Li, Jarvis Haupt, David Woodruff

Neural Information Processing Systems 

This problem is motivated by the fact that the number of parameters in $\Theta$ is only $R \cdot \sum_{d=1}^{D} p_d$, which is significantly smaller than the $\prod_{d=1}^{D} p_d$ parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors $\Theta$, as well as the Tucker decomposition. We obtain a significantly smaller dimension and sparsity in the randomized linear mapping $\Phi$ than is possible for ordinary least squares regression. Finally, we give a number of numerical simulations supporting our theory.
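To make the parameter-count comparison concrete, the following minimal sketch (with illustrative dimensions $D = 3$, $p = (50, 60, 70)$, and rank $R = 5$, not values from the paper) computes the rank-$R$ CP parameter count $R \cdot \sum_{d=1}^{D} p_d$ against the full tensor's $\prod_{d=1}^{D} p_d$:

```python
from math import prod

# Illustrative dimensions (assumptions, not from the paper)
p = [50, 60, 70]   # mode dimensions p_1, ..., p_D
R = 5              # CP rank

cp_params = R * sum(p)   # R * sum_{d=1}^{D} p_d
ols_params = prod(p)     # prod_{d=1}^{D} p_d

print(cp_params, ols_params)  # → 900 210000
```

Even at these modest sizes, the low-rank model has 900 parameters versus 210,000 for unconstrained least squares, and the gap widens rapidly with the tensor order $D$.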
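As a rough illustration of what a sparse randomized linear mapping $\Phi$ buys in ordinary least squares (not the paper's tensor construction), the snippet below applies a CountSketch-style map, where each row is hashed to one of $m$ buckets with a random sign, so the sketch costs $O(\mathrm{nnz}(A))$ to apply, and solves the smaller sketched problem; all dimensions are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 2000, 20, 200  # samples, features, sketch size (illustrative)

A = rng.standard_normal((n, p))
x_true = rng.standard_normal(p)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# CountSketch: hash each of the n rows to one of m buckets with a random sign,
# then accumulate; applying the sketch touches each nonzero of A once.
bucket = rng.integers(0, m, size=n)
sign = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((m, p))
Sb = np.zeros(m)
np.add.at(SA, bucket, sign[:, None] * A)
np.add.at(Sb, bucket, sign * b)

# Solve the m x p sketched problem instead of the n x p original.
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
print(np.linalg.norm(x_sketch - x_true))
```

The sketched solution approximates the full least-squares solution while the solve runs on an $m \times p$ system; the paper's contribution is showing that for low-rank tensor regression the sketch dimension and sparsity can be taken much smaller than such generic OLS bounds require.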