Goto

Collaborating Authors

 causal model








A Algorithm

Neural Information Processing Systems

The proposed implementation of Gunsilius' algorithm computes For example, in the expenditure dataset (see Section I.3), In Figure 4, we show the results of Gunsilius's algorithm for three different Note that this algorithm works on the empirical CDFs of all variables, i.e., they are all scaled to lie Figure 4: We show results of Gunsilius's algorithm for 3 different settings of The practical issue of course is the optimization. That alone is already very computationally demanding and has convergence problems. A practical resource, sample size, limits the representational size of the estimator. How to achieve "enough variability" without aiming at a completely flexible distribution of In any case, the finite mixture of Gaussians approach can still be implemented with the reparameter-ization trick. The relation to Gunsilius algorithm is that our "base measure" is smoothly adaptive, leading to possibly more stable behavior in practice.



Learning Discrete Concepts in Latent Hierarchical Models

Neural Information Processing Systems

Learning concepts from natural high-dimensional data (e.g., images) holds potential in building human-aligned and interpretable machine learning models. Despite its encouraging prospect, formalization and theoretical insights into this crucial task are still lacking. In this work, we formalize concepts as discrete latent causal variables that are related via a hierarchical causal model that encodes different abstraction levels of concepts embedded in high-dimensional data (e.g., a dog breed and its eye shapes in natural images). We formulate conditions to facilitate the identification of the proposed causal model, which reveals when learning such concepts from unsupervised data is possible. Our conditions permit complex causal hierarchical structures beyond latent trees and multi-level directed acyclic graphs in prior work and can handle high-dimensional, continuous observed variables, which is well-suited for unstructured data modalities such as images. We substantiate our theoretical claims with synthetic data experiments. Further, we discuss our theory's implications for understanding the underlying mechanisms of latent diffusion models and provide corresponding empirical evidence for our theoretical insights.