Towards Causal Foundation Model: on Duality between Causal Inference and Attention
Jiaqi Zhang, Joel Jennings, Cheng Zhang, Chao Ma
Recent advances in artificial intelligence have created a paradigm shift in which models are trained on large amounts of data and can be adapted to many different tasks; such models are dubbed foundation models (Bommasani et al., 2021). These models, which often employ self-supervision, can extract valuable knowledge from various types of data, including natural language (Devlin et al., 2018; Brown et al., 2020), images (Radford et al., 2021), and biological sequencing counts (Theodoris et al., 2023). This acquired knowledge allows the models to generalize when asked to perform tasks in novel scenarios. With vast amounts of data becoming increasingly available from diverse sources, such models offer an appealing way to leverage the information that can be learned from this data in order to build more intelligent systems (Bubeck et al., 2023).

A critical aspect of intelligent systems is the ability to reason about cause-and-effect relationships (Zhang et al., 2023), which is vital to making informed decisions across various domains, including healthcare, economics, and statistics (Kube et al., 2019; Geffner et al., 2022; Zhang et al., 2022). Relying solely on correlation-based models (Harrison and March, 1984) can lead to misleading conclusions, as they do not account for the underlying causal mechanisms. This limitation is also observed in the realm of foundation models (Bubeck et al., 2023; Mahowald et al., 2023; Wolfram, 2023).
October 1, 2023