A computational transition for detecting multivariate shuffled linear regression by low-degree polynomials
In this paper, we study the problem of multivariate shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we investigate the model $Y=\tfrac{1}{\sqrt{1+\sigma^2}}(\Pi_* X Q_* + \sigma Z)$, where $X$ is an $n*d$ standard Gaussian design matrix, $Z$ is an $n*m$ Gaussian noise matrix, $\Pi_*$ is an unknown $n*n$ permutation matrix, and $Q_*$ is an unknown $d*m$ on the Grassmanian manifold satisfying $Q_*^{\top} Q_* = \mathbb I_m$. Consider the hypothesis testing problem of distinguishing this model from the case where $X$ and $Y$ are independent Gaussian random matrices of sizes $n*d$ and $n*m$, respectively. Our results reveal a phase transition phenomenon in the performance of low-degree polynomial algorithms for this task. (1) When $m=o(d)$, we show that all degree-$D$ polynomials fail to distinguish these two models even when $\sigma=0$, provided with $D^4=o\big( \tfrac{d}{m} \big)$. (2) When $m=d$ and $\sigma=\omega(1)$, we show that all degree-$D$ polynomials fail to distinguish these two models provided with $D=o(\sigma)$. (3) When $m=d$ and $\sigma=o(1)$, we show that there exists a constant-degree polynomial that strongly distinguish these two models. These results establish a smooth transition in the effectiveness of low-degree polynomial algorithms for this problem, highlighting the interplay between the dimensions $m$ and $d$, the noise level $\sigma$, and the computational complexity of the testing task.
Apr-3-2025