Sharp analysis of EM for learning mixtures of pairwise differences
Abhishek Dhawan, Cheng Mao, Ashwin Pananjady
A common approach is to apply a spectral algorithm to initialize the parameters of interest and then run a locally convergent nonconvex optimization algorithm; alternatively, one could run the nonconvex algorithm directly from a random initialization. These nonconvex optimization algorithms have been extensively analyzed under Gaussian assumptions on the covariates (see, e.g., Balakrishnan et al. (2017); Klusowski et al. (2019); Chen et al. (2019); Kwon and Caramanis (2020)), and it is known that they can exhibit favorable global behavior in some settings. In many tasks such as ranking (Bradley and Terry, 1952), crowd-labeling (Dawid and Skene, 1979), and web search (Chen et al., 2013), however, measurements are taken as pairwise comparisons between entities, which renders the covariates inherently discrete. Such designs are far from Gaussian, and it is natural to ask how standard nonconvex optimization algorithms perform in this setting. Motivated by this question, we study how the celebrated expectation-maximization (EM) algorithm behaves when estimating a mixture of symmetric linear regressions under the pairwise comparison design.
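To make the setup concrete, below is a minimal sketch of EM for a symmetric two-component mixed linear regression, assuming the model y_i = z_i⟨x_i, θ*⟩ + ε_i with z_i uniform on {±1}, Gaussian noise, and pairwise-comparison covariates x_i = e_{j(i)} − e_{k(i)}. The tanh-weighted least-squares iteration is the textbook sample EM update for this symmetric mixture (cf. Balakrishnan et al., 2017; Klusowski et al., 2019), not necessarily the exact variant analyzed in the paper; all parameter choices here are illustrative.

```python
import numpy as np

def em_symmetric_mixture(X, y, sigma, theta0, n_iters=50):
    """EM for the symmetric 2-component mixed linear regression
    y_i = z_i * <x_i, theta*> + eps_i,  z_i ~ Unif{+1, -1}, eps_i ~ N(0, sigma^2).

    E-step: the posterior mean of z_i given theta is tanh(y_i <x_i, theta> / sigma^2).
    M-step: closed-form weighted least squares; a pseudoinverse is used because the
    pairwise-comparison Gram matrix is a graph Laplacian and hence rank-deficient.
    """
    gram_pinv = np.linalg.pinv(X.T @ X)
    theta = theta0.copy()
    for _ in range(n_iters):
        w = np.tanh(y * (X @ theta) / sigma**2)    # E-step: soft estimate of each sign z_i
        theta = gram_pinv @ (X.T @ (w * y))        # M-step: weighted least-squares update
    return theta

# Pairwise-comparison design: each row of X is e_j - e_k for a random pair (j, k).
rng = np.random.default_rng(0)
d, n, sigma = 20, 2000, 0.5
theta_star = rng.standard_normal(d)
theta_star -= theta_star.mean()                    # identifiable only up to the all-ones direction
pairs = rng.choice(d, size=(n, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]
X = np.zeros((len(pairs), d))
X[np.arange(len(pairs)), pairs[:, 0]] = 1.0
X[np.arange(len(pairs)), pairs[:, 1]] = -1.0
z = rng.choice([-1.0, 1.0], size=len(pairs))
y = z * (X @ theta_star) + sigma * rng.standard_normal(len(pairs))

theta_hat = em_symmetric_mixture(X, y, sigma, theta0=0.1 * rng.standard_normal(d))
# The global sign of theta is not identifiable, so report the better of +/- theta_hat.
err = min(np.linalg.norm(theta_hat - theta_star), np.linalg.norm(theta_hat + theta_star))
print(f"estimation error: {err:.3f}")
```

In this sketch the random initialization is deliberately small; whether EM converges from such an initialization, and at what rate, is exactly the kind of question the paper's analysis addresses.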
arXiv.org Artificial Intelligence
Jun-22-2023