Learning from higher-order correlations, efficiently: hypothesis tests, random features, and neural networks

Neural Information Processing Systems 

Neural networks excel at discovering statistical patterns in high-dimensional data sets. In practice, higher-order cumulants, which quantify the non-Gaussian correlations between three or more variables, are particularly important for the performance of neural networks. But how efficient are neural networks at extracting features from higher-order cumulants? We study this question in the spiked cumulant model, where the statistician needs to recover a privileged direction or "spike" from the order-$p \ge 4$ cumulants of $d$-dimensional inputs. We first discuss the fundamental statistical and computational limits of recovering the spike by analysing the number of samples $n$ required to strongly distinguish between inputs from the spiked cumulant model and isotropic Gaussian inputs.