 nonidentifiability


Challenges in interpretability of additive models

arXiv.org Machine Learning

We review generalized additive models as a type of "transparent" model that has recently seen renewed interest in the deep learning community as neural additive models. We highlight multiple types of nonidentifiability in this model class and discuss challenges in interpretability, arguing for restraint when claiming "interpretability" or "suitability for safety-critical applications" of such models.
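To make the notion concrete, here is a minimal sketch of one simple form of nonidentifiability in an additive model: moving a constant from one component function to another changes the components but not the predictions. The functions `f1`, `f2`, `g1`, `g2` and the constant `C` are hypothetical and chosen only for illustration; this is not necessarily one of the cases the paper itself analyzes.

```python
import math
import random

# Hypothetical component functions of a two-feature additive model.
def f1(x):
    return math.sin(x)

def f2(x):
    return x * x

# A second decomposition: shift a constant C from one component to the other.
C = 3.0

def g1(x):
    return f1(x) + C

def g2(x):
    return f2(x) - C

random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]

# Both decompositions give the same prediction at every input point...
same_predictions = all(
    math.isclose(f1(a) + f2(b), g1(a) + g2(b), abs_tol=1e-12)
    for a, b in points
)

# ...although the individual component functions differ everywhere.
components_differ = all(not math.isclose(f1(a), g1(a)) for a, _ in points)

print(same_predictions, components_differ)
```

Because the data only constrain the sum, a plot of an individual component is meaningful only up to such shifts unless an extra constraint (for example, centering each component) is imposed.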


On Two Distinct Sources of Nonidentifiability in Latent Position Random Graph Models

arXiv.org Machine Learning

The statistical analysis of network data is important for fields such as neuroscience (Vogelstein et al., 2012), sociology (Hoff et al., 2002), and physics (Newman and Girvan, 2004; Bickel and Chen, 2009). Recently, network data have become ubiquitous in the modern data-science landscape, and a large literature on statistical methods for analyzing these data has developed. Popular statistical models for conditionally independent random graphs include, but are not limited to, the stochastic block model (Holland et al., 1983), the random dot product graph (Young and Scheinerman, 2007; Athreya et al., 2017), and graphons (Lovász, 2012; Diaconis and Janson, 2007). Both the stochastic block model and the random dot product graph are examples of latent position random graphs (Hoff et al., 2002), a graph model that is motivated by the idea that individual nodes have latent positions whose values determine their propensity to form connections. The purpose of this manuscript is to explain a curious phenomenon that arises in latent position random graph settings.
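As an illustration of how latent positions can fail to be identifiable, the sketch below uses the random dot product graph, where the edge probability between two nodes is the dot product of their latent positions. Rotating every position by the same angle leaves all edge probabilities unchanged. This is a well-known property of the model and is shown here only as an example; the abstract does not say which two sources the manuscript studies. The dimension, sample size, and angle `theta` are arbitrary assumptions.

```python
import math
import random

random.seed(0)
n = 20

# Hypothetical 2-d latent positions; coordinates in [0.1, 0.6] keep every
# dot product inside [0.02, 0.72], so each one is a valid edge probability.
X = [(random.uniform(0.1, 0.6), random.uniform(0.1, 0.6)) for _ in range(n)]

# Rotate every latent position by the same (arbitrary) angle.
theta = 0.7
cos_t, sin_t = math.cos(theta), math.sin(theta)

def rotate(p):
    x, y = p
    return (x * cos_t - y * sin_t, x * sin_t + y * cos_t)

Y = [rotate(p) for p in X]

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

# Edge probabilities are identical under X and under the rotated positions Y...
same_probabilities = all(
    math.isclose(dot(X[i], X[j]), dot(Y[i], Y[j]), abs_tol=1e-12)
    for i in range(n)
    for j in range(n)
)

# ...even though the positions themselves have changed.
positions_differ = any(not math.isclose(X[i][0], Y[i][0]) for i in range(n))

print(same_probabilities, positions_differ)
```

Since the graph distribution depends on the positions only through their pairwise dot products, the positions can at best be recovered up to such an orthogonal transformation.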


Observational nonidentifiability, generalized likelihood and free energy

arXiv.org Machine Learning

We study the parameter estimation problem in mixture models with observational nonidentifiability: the full model (also containing hidden variables) is identifiable, but the marginal (observed) model is not. Hence global maxima of the marginal likelihood are (infinitely) degenerate and predictions of the marginal likelihood are not unique. We show how to generalize the marginal likelihood by introducing an effective temperature, making it similar to the free energy. This generalization resolves the observational nonidentifiability, since its maximization leads to unique results that are better than a random selection of one degenerate maximum of the marginal likelihood or the averaging over many such maxima. The generalized likelihood inherits many features from the usual likelihood, e.g., it satisfies the conditionality principle, and a local maximum can be found via a suitably modified expectation-maximization method. The maximization of the generalized likelihood relates to entropy optimization.


Discussion: Latent variable graphical model selection via convex optimization

arXiv.org Machine Learning

It is my pleasure to congratulate the authors for an innovative and inspiring piece of work. Chandrasekaran, Parrilo and Willsky (hereafter CPW) have come up with a novel approach, combining ideas from convex optimization and algebraic geometry, to the longstanding problem of Gaussian graphical model selection with latent variables. Their method is intuitive and simple to implement, based on solving a convex log-determinant program with suitable choices of regularization. In addition, they establish a number of attractive theoretical guarantees that hold under high-dimensional scaling, meaning that the graph size p and sample size n are allowed to grow simultaneously.