Pivotal CLTs for Pseudolikelihood via Conditional Centering in Dependent Random Fields

Deb, Nabarun

arXiv.org Artificial Intelligence 

Data from such models often exhibits significant deviations from classical Gaussian approximations. A natural class of statistics to analyze in such models consists of conditionally centered averages (see [30, 63, 52]), in which each observation is recentered by its conditional mean given all other observations. Crucially, CLTs for such conditionally centered statistics are closely tied to maximum pseudolikelihood estimators (MPLEs) through the MPLE score (see [64, 60, 41]). This connection is practically important because in many graphical/Markov random field models, such as Ising models and exponential random graph models (ERGMs), computing the MLE is impeded by an intractable normalizing constant, whereas pseudolikelihood replaces the joint likelihood with a product of tractable conditional models, scales to large networks, and is widely usable in practice. However, most existing theory for conditionally centered statistics and for MPLEs focuses on local dependence (e.g., bounded degree or sparse neighborhoods) and does not cover realistic dense regimes in which the number of connections of each node may grow with the size of the network. This paper bridges that gap by developing a general limit theory for conditionally centered statistics under weak and verifiable assumptions. Our results accommodate both sparse and dense interactions, as well as regular and irregular network connections. In particular, we deliver valid studentized inference for pseudolikelihood in network/Markov random field settings. As examples, we obtain new CLTs for conditionally centered averages and pseudolikelihood estimators in Ising models (with pairwise and tensor interactions) and in ERGMs, without imposing sparsity, regularity, or high-temperature restrictions.
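
To make the link between conditional centering and the MPLE score concrete, the following is a minimal sketch for a pairwise Ising model. It is an illustration under stated assumptions, not the paper's construction: the parametrization P(x) ∝ exp((β/2) xᵀAx) with spins x_i ∈ {−1, +1} and a symmetric, zero-diagonal coupling matrix A is assumed, so that E[x_i | x_{-i}] = tanh(β m_i) with local field m_i = Σ_j A_ij x_j. All function names, the restriction to β ≥ 0, the bisection bounds, and the toy 6-cycle data are hypothetical choices made for the example.

```python
import math


def local_fields(A, x):
    """Local fields m_i = sum_j A[i][j] * x[j] (A symmetric, zero diagonal)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def conditionally_centered_average(A, x, beta):
    """Average of x_i - E[x_i | x_{-i}], with E[x_i | x_{-i}] = tanh(beta * m_i)."""
    m = local_fields(A, x)
    return sum(x_i - math.tanh(beta * m_i) for x_i, m_i in zip(x, m)) / len(x)


def pseudolikelihood_score(A, x, beta):
    """Derivative in beta of the log-pseudolikelihood
    sum_i [beta * x_i * m_i - log(2 * cosh(beta * m_i))].
    It is a conditionally centered sum with weights m_i."""
    m = local_fields(A, x)
    return sum(m_i * (x_i - math.tanh(beta * m_i)) for x_i, m_i in zip(x, m))


def mple(A, x, lo=0.0, hi=10.0, tol=1e-10):
    """Root of the score on [lo, hi] by bisection (assumed search interval).
    The score is non-increasing in beta, so bisection applies; if there is
    no sign change on the interval, the nearer endpoint is returned."""
    if pseudolikelihood_score(A, x, lo) <= 0.0:
        return lo
    if pseudolikelihood_score(A, x, hi) >= 0.0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pseudolikelihood_score(A, x, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy data (hypothetical): a 6-cycle with unit couplings and one flipped spin.
n = 6
A = [[1.0 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
x = [1, 1, 1, 1, -1, 1]

beta_hat = mple(A, x)
print("MPLE of beta:", beta_hat)
print("score at MPLE:", pseudolikelihood_score(A, x, beta_hat))
print("conditionally centered average at MPLE:",
      conditionally_centered_average(A, x, beta_hat))
```

The cycle is used only because it keeps the arithmetic easy to check by hand; the same functions accept any symmetric coupling matrix, including dense ones with entries scaled by the network size. No normalizing constant is ever computed, which is the practical appeal of pseudolikelihood noted above.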