AITopics | Fisher, Zachary

STRUM-LLM: Attributed and Structured Contrastive Summarization

Gunel, Beliz, Wendt, James B., Xie, Jing, Zhou, Yichao, Vo, Nguyen, Fisher, Zachary, Tata, Sandeep

arXiv.org Artificial Intelligence, Mar-25-2024

Users often struggle with decision-making between two options (A vs B), as it usually requires time-consuming research across multiple web pages. We propose STRUM-LLM that addresses this challenge by generating attributed, structured, and helpful contrastive summaries that highlight key differences between the two options. STRUM-LLM identifies helpful contrast: the specific attributes along which the two options differ significantly and which are most likely to influence the user's decision. Our technique is domain-agnostic, and does not require any human-labeled data or fixed attribute list as supervision. STRUM-LLM attributes all extractions back to the input sources along with textual evidence, and it does not have a limit on the length of input sources that it can process. STRUM-LLM Distilled has 100x more throughput than the models with comparable performance while being 10x smaller. In this paper, we provide extensive evaluations for our method and lay out future directions for our currently deployed system.

artificial intelligence, large language model, natural language

2403.19710

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Aksitov, Renat, Miryoosefi, Sobhan, Li, Zonglin, Li, Daliang, Babayan, Sheila, Kopparapu, Kavya, Fisher, Zachary, Guo, Ruiqi, Prakash, Sushant, Srinivasan, Pranesh, Zaheer, Manzil, Yu, Felix, Kumar, Sanjiv

arXiv.org Artificial Intelligence, Dec-15-2023

Answering complex natural language questions often necessitates multi-step reasoning and integrating external information. Several systems have combined knowledge retrieval with a large language model (LLM) to answer such questions. These systems, however, suffer from various failure cases, and we cannot directly train them end-to-end to fix such failures, as interaction with external knowledge is non-differentiable. To address these deficiencies, we define a ReAct-style LLM agent with the ability to reason and act upon external knowledge. We further refine the agent through a ReST-like method that iteratively trains on previous trajectories, employing growing-batch reinforcement learning with AI feedback for continuous self-improvement and self-distillation. Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model that achieves comparable performance on challenging compositional question-answering benchmarks with two orders of magnitude fewer parameters.

answer, large language model, machine learning

2312.10003

Country: North America > United States > Ohio (0.14)

Genre: Research Report (0.64)

Industry:

Energy > Renewable > Solar (0.68)
Health & Medicine (0.68)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)

Making Transformers Solve Compositional Tasks

Ontañón, Santiago, Ainslie, Joshua, Cvicek, Vaclav, Fisher, Zachary

arXiv.org Artificial Intelligence, Aug-9-2021

Several studies have reported the inability of Transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space of Transformer models showing that the inductive biases given to the model by several design decisions significantly impact compositional generalization. Through this exploration, we identified Transformer configurations that generalize compositionally significantly better than previously reported in the literature in a diverse set of compositional tasks, and that achieve state-of-the-art results in a semantic parsing compositional generalization benchmark (COGS), and a string edit operation composition benchmark (PCFG).

deep learning, generalization, neural network

2108.04378

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)