Fast leave-one-cluster-out cross-validation by clustered Network Information Criteria (NICc)
Qiu, Jiaxing, Lake, Douglas E., Henry, Teague R.
This paper introduced a clustered estimator of the Network Information Criterion (NICc) to approximate leave-one-cluster-out cross-validated deviance, which can be used as an alternative to cluster-based cross-validation when modeling clustered data. Stone proved that the Akaike Information Criterion (AIC) is asymptotically equivalent to leave-one-observation-out cross-validation if the parametric model is true. Ripley pointed out that the Network Information Criterion (NIC) derived in Stone's proof is a better approximation to leave-one-observation-out cross-validation when the model is not true. For clustered data, we derived a clustered estimator of NIC, referred to as NICc, by substituting the Fisher information matrix in NIC with an estimator of it that adjusts for clustering. This adjustment imposes a larger penalty in NICc than in the unclustered estimator of NIC when modeling clustered data, thereby preventing overfitting more effectively. In a simulation study and an empirical example, we used linear and logistic regression to model clustered data with Gaussian and binomial responses, respectively. We showed that NICc is a better approximation to leave-one-cluster-out deviance and prevents overfitting more effectively than AIC and the Bayesian Information Criterion (BIC). Compared to AIC and BIC, NICc leads to more accurate model selection, as determined by cluster-based cross-validation.
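To make the substitution described in the abstract concrete, the following is a minimal sketch for logistic regression. It assumes the common form of the criterion, deviance + 2 tr(J^{-1} K), where J is the observed information (negative Hessian of the log-likelihood) and K is the sum of outer products of score vectors; the clustered variant sums the scores within each cluster before taking outer products. The abstract does not spell out these formulas, so the exact estimator in the paper may differ (for example, in finite-sample corrections). The function names `fit_logistic` and `information_criteria` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50):
    """Fit an unpenalized logistic regression by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score of the log-likelihood
        hess = X.T @ (X * (p * (1 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def information_criteria(X, y, cluster, beta):
    """Return AIC, NIC and a clustered NIC (assumed form, see lead-in)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    deviance = -2.0 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Per-observation score vectors and observed information J.
    scores = X * (y - p)[:, None]
    J = X.T @ (X * (p * (1 - p))[:, None])

    # Unclustered K: outer products of individual scores.
    K = scores.T @ scores

    # Clustered K: sum the scores within each cluster first.
    _, idx = np.unique(cluster, return_inverse=True)
    cluster_scores = np.zeros((idx.max() + 1, X.shape[1]))
    np.add.at(cluster_scores, idx, scores)
    K_c = cluster_scores.T @ cluster_scores

    k = X.shape[1]
    return {
        "AIC": deviance + 2.0 * k,
        "NIC": deviance + 2.0 * np.trace(np.linalg.solve(J, K)),
        "NICc": deviance + 2.0 * np.trace(np.linalg.solve(J, K_c)),
    }


if __name__ == "__main__":
    # Simulated clustered binary data: 20 clusters of 10 observations,
    # with a cluster-level random intercept (illustrative values only).
    rng = np.random.default_rng(0)
    n_clusters, size = 20, 10
    cluster = np.repeat(np.arange(n_clusters), size)
    x = rng.normal(size=n_clusters * size)
    u = rng.normal(size=n_clusters)[cluster]
    prob = 1.0 / (1.0 + np.exp(-(0.5 * x + u)))
    y = (rng.random(n_clusters * size) < prob).astype(float)
    X = np.column_stack([np.ones_like(x), x])

    beta_hat = fit_logistic(X, y)
    print(information_criteria(X, y, cluster, beta_hat))
```

When every observation forms its own cluster, the clustered and unclustered K matrices coincide, so this sketch's NICc reduces to NIC. With positively correlated observations within clusters, the clustered penalty tends to be larger, which is the behavior the abstract describes.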
May-30-2024
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe
- Netherlands > South Holland
- Dordrecht (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Oxfordshire > Oxford (0.04)
- North America
- Canada (0.04)
- Saint Lucia > Dennery
- Dennery (0.04)
- United States
- California > Alameda County
- Berkeley (0.04)
- Virginia > Albemarle County
- Charlottesville (0.14)
- South America > Uruguay
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Technology: