
Testing Goodness of Fit of Conditional Density Models with Kernels

Jitkrittum, Wittawat, Kanagawa, Heishiro, Schölkopf, Bernhard

arXiv.org Machine Learning

Conditional distributions provide a versatile tool for capturing the relationship between a target variable and a conditioning variable (or covariate). The last few decades have seen a broad range of modeling applications across multiple disciplines, including econometrics [30, 42] and machine learning [14, 40], among others. In many cases, estimating a conditional density function from the observed data is one of the first crucial steps in the data analysis pipeline. While the task of conditional density estimation has received considerable attention in the literature, fewer works have investigated the equally important task of evaluating the goodness of fit of a given conditional density model. Several approaches to conditional model evaluation take the form of a hypothesis test: given a conditional model and a joint sample containing realizations of both target variables and covariates, test the null hypothesis that the model is correctly specified against the alternative that it is not. The model does not specify the marginal distribution of the covariates. We refer to this task as conditional goodness-of-fit testing. One of the early nonparametric tests is [1], which extended the classic Kolmogorov test to the conditional case.
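The setup described in the abstract can be sketched with a toy parametric bootstrap test (this is an illustrative assumption, not the kernel test the paper proposes). The model specifies y given x but says nothing about the covariate marginal, so null replicates are simulated at the observed covariate values. The Gaussian model, statistic, and all names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conditional model under test: y | x ~ Normal(theta * x, 1).
theta = 2.0

def model_sample(x, rng):
    """Draw y from the model's conditional density at each covariate x."""
    return theta * x + rng.standard_normal(len(x))

# Observed joint sample; here generated from the model itself, so H0 holds.
n = 300
x_obs = rng.uniform(-1.0, 1.0, n)
y_obs = model_sample(x_obs, rng)

def stat(x, y):
    """Simple discrepancy: mean squared residual under the model."""
    return np.mean((y - theta * x) ** 2)

# Parametric bootstrap: the model does not specify the marginal of x,
# so each null replicate reuses the observed covariates.
t_obs = stat(x_obs, y_obs)
t_null = np.array([stat(x_obs, model_sample(x_obs, rng)) for _ in range(500)])
p_value = (t_null >= t_obs).mean()
```

A small p-value would indicate misspecification; since the data here are drawn from the model itself, the p-value is just a draw from its null distribution.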


Informative Features for Model Comparison

Jitkrittum, Wittawat, Kanagawa, Heishiro, Sangkloy, Patsorn, Hays, James, Schölkopf, Bernhard, Gretton, Arthur

Neural Information Processing Systems

Given two candidate models, and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models. We propose two new statistical tests which are nonparametric, computationally efficient (runtime complexity is linear in the sample size), and interpretable. As a unique advantage, our tests can produce a set of examples (informative features) indicating the regions in the data domain where one model fits significantly better than the other. In a real-world problem of comparing GAN models, the test power of our new test matches that of the state-of-the-art test of relative goodness of fit, while being one order of magnitude faster.
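The relative goodness-of-fit setting above can be illustrated with a minimal sketch (again an assumption, not the paper's kernel-based test): compare two candidate models of E[y | x] by a paired asymptotic z-test on per-sample squared errors, which runs in time linear in the sample size. The data, models, and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy observations: y depends linearly on x with small noise.
n = 500
x = rng.uniform(-1.0, 1.0, n)
y = 2.0 * x + 0.1 * rng.standard_normal(n)

# Two hypothetical candidate models for E[y | x].
model_p = lambda x: 2.0 * x   # well specified
model_q = lambda x: 1.5 * x   # misspecified slope

# Paired per-sample difference of squared errors: positive values
# mean model P fits that observation better than model Q.
d = (y - model_q(x)) ** 2 - (y - model_p(x)) ** 2

# Asymptotic z-test of H0: E[d] = 0 (equal fit) vs E[d] > 0 (P fits better).
z = np.sqrt(n) * d.mean() / d.std(ddof=1)
```

A large positive z rejects equal fit in favor of model P; inspecting which observations have the largest d values gives a crude analogue of the "informative features" idea, pointing at regions where one model fits better.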

