On the Causal Sufficiency and Necessity of Multi-Modal Representation Learning
