Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

arXiv.org Machine Learning

A key goal of modern machine learning is to learn representations of complex data that are human-interpretable and can be controlled. This goal is of paramount importance given the breadth and impact of ML in today's world. There appear to be two broad approaches toward such intelligent systems. The first is to build models that are inherently interpretable and then focus on extracting maximum performance from them; the second is to build high-performance neural models and then invest effort in understanding their inner workings. A prominent example of the first camp is the field of Causal Representation Learning (CRL) [82, 81].