Lopez-Paz, David
The Randomized Dependence Coefficient
Lopez-Paz, David, Hennig, Philipp, Schölkopf, Bernhard
We introduce the Randomized Dependence Coefficient (RDC), a measure of non-linear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. RDC is defined in terms of correlation of random non-linear copula projections; it is invariant with respect to marginal distribution transformations, has low computational cost and is easy to implement: just five lines of R code, included at the end of the paper.
The Randomized Dependence Coefficient
Lopez-Paz, David, Hennig, Philipp, Schölkopf, Bernhard
We introduce the Randomized Dependence Coefficient (RDC), a measure of non-linear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-R\'enyi Maximum Correlation Coefficient. RDC is defined in terms of correlation of random non-linear copula projections; it is invariant with respect to marginal distribution transformations, has low computational cost and is easy to implement: just five lines of R code, included at the end of the paper.
Gaussian Process Vine Copulas for Multivariate Dependence
Lopez-Paz, David, Hernández-Lobato, José Miguel, Ghahramani, Zoubin
Copulas allow to learn marginal distributions separately from the multivariate dependence structure (copula) that links them together into a density function. Vine factorizations ease the learning of high-dimensional copulas by constructing a hierarchy of conditional bivariate copulas. However, to simplify inference, it is common to assume that each of these conditional bivariate copulas is independent from its conditioning variables. In this paper, we relax this assumption by discovering the latent functions that specify the shape of a conditional copula given its conditioning variables We learn these functions by following a Bayesian approach based on sparse Gaussian processes with expectation propagation for scalable, approximate inference. Experiments on real-world datasets show that, when modeling all conditional dependencies, we obtain better estimates of the underlying copula of the data.
Semi-Supervised Domain Adaptation with Non-Parametric Copulas
Lopez-Paz, David, Hernández-Lobato, José Miguel, Schölkopf, Bernhard
A new framework based on the theory of copulas is proposed to address semi- supervised domain adaptation problems. The presented method factorizes any multivariate density into a product of marginal distributions and bivariate cop- ula functions. Therefore, changes in each of these factors can be detected and corrected to adapt a density model accross different learning domains. Impor- tantly, we introduce a novel vine copula model, which allows for this factorization in a non-parametric manner. Experimental results on regression problems with real-world data illustrate the efficacy of the proposed approach when compared to state-of-the-art techniques.