AITopics

2410.03767

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

arXiv.org Artificial IntelligenceNov-6-2023

Beyond Words: A Mathematical Framework for Interpreting Large Language Models

González, Javier, Nori, Aditya V.

Large language models (LLMs) are powerful AI tools that can generate and comprehend natural language text and other complex information. However, the field lacks a mathematical framework to systematically describe, compare and improve LLMs. We propose Hex a framework that clarifies key terms and concepts in LLM research, such as hallucinations, alignment, self-verification and chain-of-thought reasoning. The Hex framework offers a precise and consistent way to characterize LLMs, identify their strengths and weaknesses, and integrate new findings. Using Hex, we differentiate chain-of-thought reasoning from chain-of-thought prompting and establish the conditions under which they are equivalent. This distinction clarifies the basic assumptions behind chain-of-thought prompting and its implications for methods that use it, such as self-verification and prompt programming. Our goal is to provide a formal framework for LLMs that can help both researchers and practitioners explore new possibilities for generative AI. We do not claim to have a definitive solution, but rather a tool for opening up new research avenues. We argue that our formal definitions and results are crucial for advancing the discussion on how to build generative AI systems that are safe, reliable, fair and robust, especially in domains like healthcare and software engineering.

large language model, machine learning, natural language, (20 more...)

2311.03033

Country: North America > United States > Illinois (0.32)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine (0.48)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.55)

arXiv.org Artificial IntelligenceNov-6-2023

TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models

González, Javier, Wong, Cliff, Gero, Zelalem, Bagga, Jass, Ueno, Risa, Chien, Isabel, Oravkin, Eduard, Kiciman, Emre, Nori, Aditya, Weerasinghe, Roshanthi, Leidner, Rom S., Piening, Brian, Naumann, Tristan, Bifulco, Carlo, Poon, Hoifung

The rapid digitization of real-world data offers an unprecedented opportunity for optimizing healthcare delivery and accelerating biomedical discovery. In practice, however, such data is most abundantly available in unstructured forms, such as clinical notes in electronic medical records (EMRs), and it is generally plagued by confounders. In this paper, we present TRIALSCOPE, a unifying framework for distilling real-world evidence from population-level observational data. TRIALSCOPE leverages biomedical language models to structure clinical text at scale, employs advanced probabilistic modeling for denoising and imputation, and incorporates state-of-the-art causal inference techniques to combat common confounders. Using clinical trial specification as generic representation, TRIALSCOPE provides a turn-key solution to generate and reason with clinical hypotheses using observational data. In extensive experiments and analyses on a large-scale real-world dataset with over one million cancer patients from a large US healthcare network, we show that TRIALSCOPE can produce high-quality structuring of real-world data and generates comparable results to marquee cancer trials. In addition to facilitating in-silicon clinical trial design and optimization, TRIALSCOPE may be used to empower synthetic controls, pragmatic trials, post-market surveillance, as well as support fine-grained patient-like-me reasoning in precision diagnosis and treatment.

large language model, machine learning, simulation, (19 more...)

2311.01301

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.94)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningOct-26-2021

Dynamic Causal Bayesian Optimization

Aglietti, Virginia, Dhir, Neil, González, Javier, Damoulas, Theodoros

This paper studies the problem of performing a sequence of optimal interventions in a causal dynamical system where both the target variable of interest and the inputs evolve over time. This problem arises in a variety of domains e.g. system biology and operational research. Dynamic Causal Bayesian Optimization (DCBO) brings together ideas from sequential decision making, causal inference and Gaussian process (GP) emulation. DCBO is useful in scenarios where all causal effects in a graph are changing over time. At every time step DCBO identifies a local optimal intervention by integrating both observational and past interventional data collected from the system. We give theoretical results detailing how one can transfer interventional information across time steps and define a dynamic causal GP model which can be used to quantify uncertainty and find optimal interventions in practice. We demonstrate how DCBO identifies optimal interventions faster than competing approaches in multiple settings and applications.

artificial intelligence, banking & finance, machine learning, (18 more...)

2110.13891

Country:

North America > United States > Virginia (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Machine LearningJun-7-2021

BayesIMP: Uncertainty Quantification for Causal Data Fusion

Chau, Siu Lun, Ton, Jean-François, González, Javier, Teh, Yee Whye, Sejdinovic, Dino

While causal models are becoming one of the mainstays of machine learning, the problem of uncertainty quantification in causal inference remains challenging. In this paper, we study the causal data fusion problem, where datasets pertaining to multiple causal graphs are combined to estimate the average treatment effect of a target variable. As data arises from multiple sources and can vary in quality and quantity, principled uncertainty quantification becomes essential. To that end, we introduce Bayesian Interventional Mean Processes, a framework which combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in the reproducing kernel Hilbert space, while taking into account the uncertainty within each causal graph. To demonstrate the utility of our uncertainty estimation, we apply our method to the Causal Bayesian Optimisation task and show improvements over state-of-the-art methods.

aye imp, bayesian inference, information fusion, (20 more...)

2106.03477

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.70)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningSep-27-2020

Multi-task Causal Learning with Gaussian Processes

Aglietti, Virginia, Damoulas, Theodoros, Álvarez, Mauricio, González, Javier

This paper studies the problem of learning the correlation structure of a set of intervention functions defined on the directed acyclic graph (DAG) of a causal model. This is useful when we are interested in jointly learning the causal effects of interventions on different subsets of variables in a DAG, which is common in field such as healthcare or operations research. We propose the first multi-task causal Gaussian process (GP) model, which we call DAG-GP, that allows for information sharing across continuous interventions and across experiments on different variables. DAG-GP accommodates different assumptions in terms of data availability and captures the correlation between functions lying in input spaces of different dimensionality via a well-defined integral operator. We give theoretical results detailing when and how the DAG-GP model can be formulated depending on the DAG. We test both the quality of its predictions and its calibrated uncertainties. Compared to single-task models, DAG-GP achieves the best fitting performance in a variety of real and synthetic settings. In addition, it helps to select optimal interventions faster than competing approaches when used within sequential decision making frameworks, like active learning or Bayesian optimization.

artificial intelligence, dag, health & medicine, (18 more...)

2009.12821

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.32)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

arXiv.org Artificial IntelligenceJul-18-2020

Structure Mapping for Transferability of Causal Models

Pruthi, Purva, González, Javier, Lu, Xiaoyu, Fiterau, Madalina

Human beings learn causal models and constantly use them to transfer knowledge between similar environments. We use this intuition to design a transfer-learning framework using object-oriented representations to learn the causal relationships between objects. A learned causal dynamics model can be used to transfer between variants of an environment with exchangeable perceptual features among objects but with the same underlying causal dynamics. We adapt continuous optimization for structure learning techniques to explicitly learn the cause and effects of the actions in an interactive environment and transfer to the target domain by categorization of the objects based on causal knowledge. We demonstrate the advantages of our approach in a gridworld setting by combining causal model-based approach with model-free approach in reinforcement learning.

agent, artificial intelligence, object-oriented architecture, (17 more...)

2007.09445

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.70)

arXiv.org Machine LearningJun-6-2020

Learning Inconsistent Preferences with Kernel Methods

Chau, Siu Lun, González, Javier, Sejdinovic, Dino

We propose a probabilistic kernel approach for preferential learning from pairwise duelling data using Gaussian Processes. Different from previous methods, we do not impose a total order on the item space, hence can capture more expressive latent preferential structures such as inconsistent preferences and clusters of comparable items. Furthermore, we prove the universality of the proposed kernels, i.e. that the corresponding reproducing kernel Hilbert Space (RKHS) is dense in the space of skew-symmetric preference functions. To conclude the paper, we provide an extensive set of numerical experiments on simulated and real-world datasets showcasing the competitiveness of our proposed method with state-of-the-art.

artificial intelligence, kernel, machine learning, (15 more...)

2006.03847

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Leisure & Entertainment > Sports > Football (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Machine LearningMay-26-2020

Causal Bayesian Optimization

Aglietti, Virginia, Lu, Xiaoyu, Paleyes, Andrei, González, Javier

This paper studies the problem of globally optimizing a variable of interest that is part of a causal model in which a sequence of interventions can be performed. This problem arises in biology, operational research, communications and, more generally, in all fields where the goal is to optimize an output metric of a system of interconnected nodes. Our approach combines ideas from causal inference, uncertainty quantification and sequential decision making. In particular, it generalizes Bayesian optimization, which treats the input variables of the objective function as independent, to scenarios where causal information is available. We show how knowing the causal graph significantly improves the ability to reason about optimal decision making strategies decreasing the optimization cost while avoiding suboptimal solutions. We propose a new algorithm called Causal Bayesian Optimization (CBO). CBO automatically balances two trade-offs: the classical exploration-exploitation and the new observation-intervention, which emerges when combining real interventional data with the estimated intervention effects computed via do-calculus. We demonstrate the practical benefits of this method in a synthetic setting and in two real-world applications.

intervention, optimization problem, upstream oil & gas, (17 more...)

2005.11741

Country:

North America > United States (0.15)
Europe > Italy (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

arXiv.org Machine LearningMar-18-2019

Deep Gaussian Processes for Multi-fidelity Modeling

Cutajar, Kurt, Pullin, Mark, Damianou, Andreas, Lawrence, Neil, González, Javier

Multi-fidelity methods are prominently used when cheaply-obtained, but possibly biased and noisy, observations must be effectively combined with limited or expensive true data in order to construct reliable models. This arises in both fundamental machine learning procedures such as Bayesian optimization, as well as more practical science and engineering applications. In this paper we develop a novel multi-fidelity model which treats layers of a deep Gaussian process as fidelity levels, and uses a variational inference scheme to propagate uncertainty across them. This allows for capturing nonlinear correlations between fidelities with lower risk of overfitting than existing methods exploiting compositional structure, which are conversely burdened by structural assumptions and constraints. We show that the proposed approach makes substantial improvements in quantifying and propagating uncertainty in multi-fidelity set-ups, which in turn improves their effectiveness in decision making pipelines.

artificial intelligence, fidelity, neural network, (16 more...)

1903.0732

Genre: Research Report (0.82)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)