AITopics | Sutton, Charles

Collaborating Authors

Sutton, Charles


UQE: A Query Engine for Unstructured Databases

Dai, Hanjun, Wang, Bethany Yixin, Wan, Xingchen, Dai, Bo, Yang, Sherry, Nova, Azade, Yin, Pengcheng, Phothilimthana, Phitchaya Mangpo, Sutton, Charles, Schuurmans, Dale

arXiv.org Machine Learning, Jun-23-2024

Analytics on structured data is a mature field with many successful methods. However, most real-world data exists in unstructured form, such as images and conversations. We investigate the potential of Large Language Models (LLMs) to enable unstructured data analytics. In particular, we propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections. This engine accepts queries in a Universal Query Language (UQL), a dialect of SQL that provides full natural language flexibility in specifying conditions and operators. The new engine leverages the ability of LLMs to conduct analysis of unstructured data, while also allowing us to exploit advances in sampling and optimization techniques to achieve efficient and accurate query execution. In addition, we borrow techniques from classical compiler theory to better orchestrate the workflow between sampling methods and foundation model calls. We demonstrate the efficiency of UQE on data analytics across different modalities, including images, dialogs and reviews, and across a range of useful query types, including conditional aggregation, semantic retrieval and abstraction aggregation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2407.09522

Country:

Europe (0.67)
North America > United States > Oregon (0.14)
North America > Canada > Alberta (0.14)

Genre:

Research Report (0.82)
Workflow (0.66)

Industry:

Media (0.47)
Leisure & Entertainment (0.47)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


NExT: Teaching Large Language Models to Reason about Code Execution

Ni, Ansong, Allamanis, Miltiadis, Cohan, Arman, Deng, Yinlin, Shi, Kensen, Sutton, Charles, Yin, Pengcheng

arXiv.org Artificial Intelligence, Apr-22-2024

A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (a.k.a. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, and thus may lack a semantic understanding of how programs execute at run-time. To address this issue, we propose NExT, a method to teach LLMs to inspect the execution traces of programs (variable states of executed lines) and reason about their run-time behavior through chain-of-thought (CoT) rationales. Specifically, NExT uses self-training to bootstrap a synthetic training set of execution-aware rationales that lead to correct task solutions (e.g., fixed programs) without laborious manual annotation. Experiments on program repair tasks based on MBPP and HumanEval demonstrate that NExT improves the fix rate of a PaLM 2 model by 26.1% and 14.3% absolute, respectively, with significantly improved rationale quality as verified by automated metrics and human raters. Our model can also generalize to scenarios where program traces are absent at test-time.
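To make "execution traces of programs (variable states of executed lines)" concrete, here is a minimal Python sketch of how such a trace could be collected with the standard `sys.settrace` hook. It is an illustration only and not the tracing format used by NExT; the names `trace_execution` and `running_total` are hypothetical.

```python
import sys


def trace_execution(func, *args):
    """Run func(*args) and record (relative line number, locals) for each line event.

    Assumption: this only illustrates what an execution trace might contain;
    it is not the representation used in the NExT paper.
    """
    trace = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace the frame that belongs to the function under inspection.
        if frame.f_code is not code:
            return None
        if event == "line":
            # A "line" event fires just before the line runs, so the snapshot
            # holds the variable state on entry to that line.
            relative_line = frame.f_lineno - code.co_firstlineno
            trace.append((relative_line, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Always restore whatever trace hook was active before.
        sys.settrace(previous)
    return result, trace


def running_total(xs):
    total = 0
    for x in xs:
        total += x
    return total


if __name__ == "__main__":
    result, trace = trace_execution(running_total, [1, 2, 3])
    for relative_line, state in trace:
        print(relative_line, state)
```

Each recorded entry pairs a line offset inside the function with a snapshot of its local variables, which is the kind of information a model could be asked to reason over when explaining or repairing a program.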

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2404.14662

Country:

North America > Canada (0.14)
Europe > Italy (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Training Chain-of-Thought via Latent-Variable Inference

Phan, Du, Hoffman, Matthew D., Dohan, David, Douglas, Sholto, Le, Tuan Anh, Parisi, Aaron, Sountsov, Pavel, Sutton, Charles, Vikram, Sharad, Saurous, Rif A.

arXiv.org Artificial Intelligence, Nov-28-2023

Large language models (LLMs) solve problems more accurately and interpretably when instructed to work out the answer step by step using a "chain-of-thought" (CoT) prompt. One can also improve LLMs' performance on a specific task by supervised fine-tuning, i.e., by using gradient ascent on some tunable parameters to maximize the average log-likelihood of correct answers from a labeled training set. Naively combining CoT with supervised tuning requires supervision not just of the correct answers, but also of detailed rationales that lead to those answers; these rationales are expensive to produce by hand. Instead, we propose a fine-tuning strategy that tries to maximize the marginal log-likelihood of generating a correct answer using CoT prompting, approximately averaging over all possible rationales. The core challenge is sampling from the posterior over rationales conditioned on the correct answer; we address it using a simple Markov-chain Monte Carlo (MCMC) expectation-maximization (EM) algorithm inspired by the self-taught reasoner (STaR), memoized wake-sleep, Markovian score climbing, and persistent contrastive divergence. This algorithm also admits a novel control-variate technique that drives the variance of our gradient estimates to zero as the model improves. Applying our technique to GSM8K and the tasks in BIG-Bench Hard, we find that this MCMC-EM fine-tuning technique typically improves the model's accuracy on held-out examples more than STaR or prompt-tuning with or without CoT.
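The marginal log-likelihood objective described above can be written out as follows. The notation is an assumption made here for illustration (x for the question, z for the latent rationale, y for the correct answer, θ for the tunable parameters) and may differ from the paper's own symbols.

```latex
% Marginal log-likelihood of the correct answer, summing over rationales z
\log p_\theta(y \mid x) = \log \sum_{z} p_\theta(z \mid x)\, p_\theta(y \mid x, z)

% Its gradient is an expectation under the posterior over rationales
\nabla_\theta \log p_\theta(y \mid x)
  = \mathbb{E}_{z \sim p_\theta(z \mid x, y)}
    \left[ \nabla_\theta \log p_\theta(z, y \mid x) \right]
```

The second line is a standard identity for latent-variable models. It shows why sampling rationales from the posterior conditioned on the correct answer is the central difficulty: the gradient can be estimated by averaging over such samples.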

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2312.02179

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Universal Self-Consistency for Large Language Model Generation

Chen, Xinyun, Aksitov, Renat, Alon, Uri, Ren, Jie, Xiao, Kefan, Yin, Pengcheng, Prakash, Sushant, Sutton, Charles, Wang, Xuezhi, Zhou, Denny

arXiv.org Artificial Intelligence, Nov-28-2023

Self-consistency with chain-of-thought prompting (CoT) has demonstrated remarkable performance gains on various challenging tasks, by utilizing multiple reasoning paths sampled from large language models (LLMs). However, self-consistency relies on the answer extraction process to aggregate multiple solutions, which is not applicable to free-form answers. In this work, we propose Universal Self-Consistency (USC), which leverages LLMs themselves to select the most consistent answer among multiple candidates. We evaluate USC on a variety of benchmarks, including mathematical reasoning, code generation, long-context summarization, and open-ended question answering. On open-ended generation tasks where the original self-consistency method is not applicable, USC effectively utilizes multiple samples and improves the performance. For mathematical reasoning, USC matches the standard self-consistency performance without requiring the answer formats to be similar. Finally, without access to execution results, USC also matches the execution-based voting performance on code generation.
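For context, the standard self-consistency baseline mentioned above can be sketched as majority voting over extracted final answers. The sketch below is a simplified illustration, not code from the paper; the `Answer:` marker convention and the function names are assumptions. It also shows why the baseline depends on answer extraction: free-form responses without an extractable answer give it nothing to vote on.

```python
from collections import Counter


def extract_final_answer(response):
    """Hypothetical extractor: return the text after the last 'Answer:' marker."""
    marker = "Answer:"
    if marker not in response:
        return None
    return response.rsplit(marker, 1)[1].strip()


def self_consistency_vote(responses):
    """Return the most frequent extracted answer, or None if none can be extracted."""
    answers = []
    for response in responses:
        answer = extract_final_answer(response)
        if answer is not None:
            answers.append(answer)
    if not answers:
        # Nothing could be extracted, so majority voting is not applicable.
        return None
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    sampled = [
        "6 * 7 is 42. Answer: 42",
        "Adding 7 six times gives 42. Answer: 42",
        "I think it is 41. Answer: 41",
    ]
    print(self_consistency_vote(sampled))
```

USC, as described in the abstract, replaces the extraction-and-count step by asking the LLM itself to select the most consistent response among the candidates, which is why it extends to free-form outputs such as summaries.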

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2311.17311

Country: Asia (0.69)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


LambdaBeam: Neural Program Search with Higher-Order Functions and Lambdas

Shi, Kensen, Dai, Hanjun, Li, Wen-Ding, Ellis, Kevin, Sutton, Charles

arXiv.org Artificial Intelligence, Oct-28-2023

Search is an important technique in program synthesis that allows for adaptive strategies such as focusing on particular search directions based on execution results. Several prior works have demonstrated that neural models are effective at guiding program synthesis searches. However, a common drawback of those approaches is the inability to handle iterative loops, higher-order functions, or lambda functions, thus limiting prior neural searches from synthesizing longer and more general programs. We address this gap by designing a search algorithm called LambdaBeam that can construct arbitrary lambda functions that compose operations within a given DSL. We create semantic vector representations of the execution behavior of the lambda functions and train a neural policy network to choose which lambdas to construct during search, and pass them as arguments to higher-order functions to perform looping computations. Our experiments show that LambdaBeam outperforms neural, symbolic, and LLM-based techniques in an integer list manipulation domain.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.02049

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)


ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

Shi, Kensen, Hong, Joey, Zaheer, Manzil, Yin, Pengcheng, Sutton, Charles

arXiv.org Artificial Intelligence, Jul-25-2023

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, we can measure whether they compositionally generalize, that is, whether a model that has been trained on the simpler subtasks is subsequently able to solve more complex tasks. In this paper, we characterize several different forms of compositional generalization that are desirable in program synthesis, forming a meta-benchmark which we use to create generalization tasks for two popular datasets, RobustFill and DeepCoder. We then propose ExeDec, a novel decomposition-based synthesis strategy that predicts execution subgoals to solve problems step-by-step informed by program execution at each step. ExeDec has better synthesis performance and greatly improved compositional generalization ability compared to baselines.

artificial intelligence, machine learning, operation, (19 more...)

arXiv.org Artificial Intelligence

2307.13883

Country: North America > United States > Texas (0.14)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


A Probabilistic Framework for Modular Continual Learning

Valkov, Lazar, Srivastava, Akash, Chaudhuri, Swarat, Sutton, Charles

arXiv.org Artificial Intelligence, Jun-10-2023

Modular approaches, which use a different composition of modules for each problem and avoid forgetting by design, have been shown to be a promising direction in continual learning (CL). However, searching through the large, discrete space of possible module compositions is a challenge because evaluating a composition's performance requires a round of neural network training. To address this challenge, we develop a modular CL framework, called PICLE, that accelerates search by using a probabilistic model to cheaply compute the fitness of each composition. The model combines prior knowledge about good module compositions with dataset-specific information. Its use is complemented by splitting up the search space into subsets, such as perceptual and latent subsets. We show that PICLE is the first modular CL algorithm to achieve different types of transfer while scaling to large search spaces. We evaluate it on two benchmark suites designed to capture different desiderata of CL techniques. On these benchmarks, PICLE offers significantly better performance than state-of-the-art CL baselines.

artificial intelligence, machine learning, sequence, (19 more...)

arXiv.org Artificial Intelligence

2306.06545

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Natural Language to Code Generation in Interactive Data Science Notebooks

Yin, Pengcheng, Li, Wen-Ding, Xiao, Kefan, Rao, Abhishek, Wen, Yeming, Shi, Kensen, Howland, Joshua, Bailey, Paige, Catasta, Michele, Michalewski, Henryk, Polozov, Alex, Sutton, Charles

arXiv.org Artificial Intelligence, Dec-19-2022

Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2212.09248

Country:

Asia > India (0.46)
North America > United States (0.46)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (0.61)


Language Model Cascades

Dohan, David, Xu, Winnie, Lewkowycz, Aitor, Austin, Jacob, Bieber, David, Lopes, Raphael Gontijo, Wu, Yuhuai, Michalewski, Henryk, Saurous, Rif A., Sohl-dickstein, Jascha, Murphy, Kevin, Sutton, Charles

arXiv.org Artificial Intelligence, Jul-28-2022

Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate model structures and inference strategies in a unified language. We formalize several …

In this position paper, we argue that a useful unifying framework for understanding and extending this disparate body of work is in terms of probabilistic programming languages (PPL) extended to work with strings, instead of more atomic data types like integers and floats. That is, we use a PPL to define a joint probability model on string-valued random variables, parameterized using LMs, and then condition this model on string-valued observations in order to compute a posterior over string-valued unknowns, which we can then infer. We call such a probabilistic program a language model cascade. We show that this framework captures many recent approaches, and also allows us to tackle more complex multi-step reasoning problems.

large language model, logic & formal reasoning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2207.10342

Country: North America > United States (0.68)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)


Compositional Generalization and Decomposition in Neural Program Synthesis

Shi, Kensen, Hong, Joey, Zaheer, Manzil, Yin, Pengcheng, Sutton, Charles

arXiv.org Machine Learning, Apr-7-2022

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what we can measure is whether they compositionally generalize, that is, whether a model that has been trained on the simpler subtasks is subsequently able to solve more complex tasks. In this paper, we focus on measuring the ability of learned program synthesizers to compositionally generalize. We first characterize several different axes along which program synthesis methods would be desired to generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data. Based on this characterization, we introduce a benchmark suite of tasks to assess these abilities based on two popular existing datasets, SCAN and RobustFill. Finally, we make first attempts to improve the compositional generalization ability of Transformer models along these axes through novel attention mechanisms that draw inspiration from a human-like decomposition strategy. Empirically, we find our modified Transformer models generally perform better than natural baselines, but the tasks remain challenging.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2204.03758

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
