Towards Causal Foundation Model: on Duality between Causal Inference and Attention
Jiaqi Zhang, Joel Jennings, Cheng Zhang, Chao Ma
Recent advances in artificial intelligence have created a paradigm shift in which models are trained on large amounts of data and can be adapted to many different tasks; such models are dubbed foundation models (Bommasani et al., 2021). These models, which often employ self-supervision, can extract valuable knowledge from various types of data, including natural language (Devlin et al., 2018; Brown et al., 2020), images (Radford et al., 2021), and biological sequencing counts (Theodoris et al., 2023). This acquired knowledge allows the models to generalize when asked to perform tasks in novel scenarios. With vast amounts of data becoming increasingly available from diverse sources, such models offer an appealing way to leverage the information that can be learned from this data in order to build more intelligent systems (Bubeck et al., 2023).

A critical aspect of intelligent systems is the ability to reason about cause-and-effect relationships (Zhang et al., 2023), which is vital to making informed decisions across various domains, including healthcare, economics, and statistics (Kube et al., 2019; Geffner et al., 2022; Zhang et al., 2022). Relying solely on correlation-based models (Harrison and March, 1984) can lead to misleading conclusions, as they do not account for the underlying causal mechanisms. This limitation is also observed in the realm of foundation models (Bubeck et al., 2023; Mahowald et al., 2023; Wolfram, 2023).
October 1, 2023