Supplementary Information: A causal view of compositional zero-shot recognition
–Neural Information Processing Systems
Next, we introduce two additional approximations we use to apply Eq. (S.9). An SCM matches a set of assignments to a causal graph. This implies that the error of the approximation in Eq. (S.13) is dominated by the gradients of g at $h_{ao}$ and the variance of $n_{ao}$. Specifically, we use a positive differentiable measure of statistical dependence, denoted by I. PIDA measures disentanglement of representations for models that are trained from unsupervised data. As a result, we have the following: minimizing Eq. (S.21) leads to $p^{do(a,o)}(\hat{\phi}_{a0})$ approaching $p(\hat{\phi}_{a0} \mid a)$, which, as we have just shown, leads to $p(\hat{\phi}_{a0} \mid a)$ approaching $p^{do(a)}(\hat{\phi}_{a0})$.
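To make the idea of a positive differentiable dependence measure I concrete, the sketch below computes a biased empirical Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels. This is an illustrative assumption: the excerpt above only states that I is positive and differentiable and does not say which measure is used. The function names `rbf_kernel` and `hsic`, the bandwidth `sigma`, and the toy data are all hypothetical. NumPy is used for readability; the same expression is differentiable when written in an autodiff framework.

```python
import numpy as np


def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of x, shape (n, d)."""
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    The value is non-negative (up to floating-point error) and grows
    with the statistical dependence between the samples x and y.
    """
    n = x.shape[0]
    K = rbf_kernel(x, sigma)
    L = rbf_kernel(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 1))
    y_dependent = x + 0.1 * rng.normal(size=(200, 1))   # strongly tied to x
    y_independent = rng.normal(size=(200, 1))           # drawn separately
    print("dependent:  ", hsic(x, y_dependent))
    print("independent:", hsic(x, y_independent))
```

Used as a penalty term, a measure of this kind approaches zero as the two sets of samples become statistically independent, which is the property a minimization objective would rely on.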
Feb-7-2026, 12:16:07 GMT