Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models
Wu, Shanshan, Sanghavi, Sujay, Dimakis, Alexandros G.
We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an $\ell_{2,1}$ group-norm constraint needs to be used. We show that this algorithm can recover any arbitrary discrete pairwise graphical model, and we characterize its sample complexity as a function of model width, alphabet size, edge parameter accuracy, and the number of variables. Along every one of these axes, it matches or improves on all existing results and algorithms for this problem. Our analysis applies a sharp generalization error bound for logistic regression when the weight vector has an $\ell_1$ (or $\ell_{2,1}$) constraint and the sample vector has an $\ell_{\infty}$ (or $\ell_{2,\infty}$) constraint.
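To make the per-node procedure concrete, here is a minimal Python sketch for the Ising case, using scikit-learn's $\ell_1$-penalized logistic regression as a stand-in for the paper's $\ell_1$-constrained program (by Lagrangian duality the penalized and constrained forms trace out the same solution path, though the correspondence between `reg` and the paper's constraint radius is not made explicit here). The function name, regularization strength, and thresholding rule are illustrative assumptions, not from the paper:

```python
# Hypothetical sketch: neighborhood recovery for an Ising model via
# l1-penalized logistic regression, one convex fit per node.
import numpy as np
from sklearn.linear_model import LogisticRegression

def recover_ising_graph(X, reg=0.1, threshold=0.05):
    """X: (n_samples, p) array with entries in {-1, +1}.
    Returns a symmetric boolean adjacency-matrix estimate."""
    n, p = X.shape
    W = np.zeros((p, p))
    for i in range(p):
        y = (X[:, i] == 1).astype(int)   # predict node i's spin ...
        Z = np.delete(X, i, axis=1)      # ... from all other nodes
        # Penalized proxy for the l1-constrained program; assumes y
        # takes both values in the sample (sklearn needs two classes).
        clf = LogisticRegression(penalty="l1", solver="liblinear",
                                 C=1.0 / reg)
        clf.fit(Z, y)
        W[i, np.arange(p) != i] = clf.coef_.ravel()

    # Declare an edge if either endpoint's regression finds a
    # coefficient above the (illustrative) threshold.
    A = (np.abs(W) > threshold) | (np.abs(W.T) > threshold)
    np.fill_diagonal(A, False)
    return A

# Usage with placeholder data (real use: samples from an Ising model,
# e.g. via Gibbs sampling); independent spins should yield no edges.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(2000, 8))
print(recover_ising_graph(X).astype(int))
```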
Reviews: Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models
This paper gives a simple and elegant algorithm for the long-studied problem of graphical model estimation (at least for pairwise MRFs, which include the classic Ising model). The method uses a form of constrained logistic regression, which, in retrospect, feels like the "right" way to solve this problem. The algorithm simply runs this constrained logistic regression at each node to learn the edges incident to it. The proof is elegant and modular: first, standard generalization bounds show that a sufficient number of samples allows approximate minimization of the population logistic loss. Second, this loss is related to another loss function (the sigmoid of the inner product of the parameter vector with a sample from the distribution), whose minimization pins down the true edge parameters.
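For intuition on the first step, the generalization bound in question has the following standard shape for $\ell_1$-constrained linear predictors with $\ell_\infty$-bounded samples (a classical Rademacher-complexity sketch under these assumptions; the paper's exact statement and constants may differ):

```latex
% Standard l1/l-infinity Rademacher bound; a sketch of the bound's
% shape, not the paper's exact statement.
\[
\mathcal{F} = \{\, x \mapsto \langle w, x \rangle : \|w\|_1 \le W \,\},
\quad \|x\|_\infty \le X_\infty
\;\Longrightarrow\;
\mathfrak{R}_n(\mathcal{F}) \le W X_\infty \sqrt{\tfrac{2 \log (2d)}{n}}.
\]
% Since the logistic loss is 1-Lipschitz, the excess risk of the
% empirical minimizer over this class is
% O( W X_inf * sqrt(log d / n) ), so
% n = O( W^2 X_inf^2 log d / eps^2 ) samples suffice for excess risk eps,
% i.e. a logarithmic dependence on the number of variables d.
```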