AITopics | otpot qa

Collaborating Authors

otpot qa

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards

Wu, Shirley, Sarthi, Parth, Zhao, Shiyu, Lee, Aaron, Shandilya, Herumb, Grobelnik, Adrian Mladenic, Choudhary, Nurendra, Huang, Eddie, Subbian, Karthik, Zhang, Linjun, Yang, Diyi, Zou, James, Leskovec, Jure

arXiv.org Artificial IntelligenceOct-7-2025

Compound AI systems integrating multiple components, such as Large Language Models, specialized tools, and traditional machine learning models, are increasingly deployed to solve complex real-world tasks. However, optimizing compound systems remains challenging due to their non-differentiable structures and diverse configuration types across components, including prompts, hyperparameters, and model parameters. To address this challenge, we propose Optimas, a unified framework for effective optimization of compound systems. The core idea of Optimas is to maintain one Local Reward Function (LRF) per component, each satisfying a local-global alignment property, i.e., each component's local reward correlates with the global system performance. In each iteration, Optimas efficiently adapts the LRFs to maintain this property while simultaneously maximizing each component's local reward. This approach enables independent updates of heterogeneous configurations using the designated optimization method, while ensuring that local improvements consistently lead to performance gains. We present extensive evaluations across five real-world compound systems to demonstrate that Optimas outperforms strong baselines by an average improvement of 11.92%, offering a general and effective approach for improving compound systems. Our website is at https://optimas.stanford.edu.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.03041

Country: North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

METAREFLECTION: Learning Instructions for Language Agents using Past Reflections

Gupta, Priyanshu, Kirtania, Shashank, Singha, Ananya, Gulwani, Sumit, Radhakrishna, Arjun, Shi, Sherry, Soares, Gustavo

arXiv.org Artificial IntelligenceMay-13-2024

Despite the popularity of Large Language Models (LLMs), crafting specific prompts for LLMs to perform particular tasks remains challenging. Users often engage in multiple conversational turns with an LLM-based agent to accomplish their intended task. Recent studies have demonstrated that linguistic feedback, in the form of self-reflections generated by the model, can work as reinforcement during these conversations, thus enabling quicker convergence to the desired outcome. Motivated by these findings, we introduce METAREFLECTION, a novel technique that learns general prompt instructions for a specific domain from individual self-reflections gathered during a training phase. We evaluate our technique in two domains: Infrastructure as Code (IAC) vulnerability detection and question-answering (QA) using REACT and COT. Our results demonstrate a notable improvement, with METARELECTION outperforming GPT-4 by 16.82% (IAC), 31.33% (COT), and 15.42% (REACT), underscoring the potential of METAREFLECTION as a viable method for enhancing the efficiency of LLMs.

agent, eflection, instruction, (15 more...)

arXiv.org Artificial Intelligence

2405.13009

Country:

Asia > Singapore (0.04)
Africa > Nigeria (0.04)
North America > United States > New York > New York County > New York City (0.04)
Africa > Ghana (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.94)
Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering

Zhu, Wang, Thomason, Jesse, Jia, Robin

arXiv.org Artificial IntelligenceDec-23-2023

We train a language model (LM) to robustly answer multistep questions by generating and answering sub-questions. We propose Chain-of-Questions, a framework that trains a model to generate sub-questions and sub-answers one at a time by leveraging human annotated question decomposition meaning representation (QDMR). The key technical challenge is that QDMR only contains sub-questions but not answers to those sub-questions, so we treat sub-answers as latent variables and optimize them using a novel dynamic mixture of Hard-EM and MAPO. Chain-of-Questions greatly outperforms strong neuro-symbolic methods by 9.0 F1 on DROP contrast set, and outperforms GPT-3.5 by 24.3 F1 on HOTPOTQA adversarial set, thus demonstrating the effectiveness and robustness of our framework.

otpot qa, qdmr, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2305.14901

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Africa > Togo (0.05)
North America > United States > Montana (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry:

Government (0.47)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Rissanen Data Analysis: Examining Dataset Characteristics via Description Length

Perez, Ethan, Kiela, Douwe, Cho, Kyunghyun

arXiv.org Artificial IntelligenceMar-5-2021

We introduce a method to determine if a certain capability helps to achieve an accurate model of given data. We view labels as being generated from the inputs by a program composed of subroutines with different capabilities, and we posit that a subroutine is useful if and only if the minimal program that invokes it is shorter than the one that does not. Since minimum program length is uncomputable, we instead estimate the labels' minimum description length (MDL) as a proxy, giving us a theoretically-grounded method for analyzing dataset characteristics. We call the method Rissanen Data Analysis (RDA) after the father of MDL, and we showcase its applicability on a wide variety of settings in NLP, ranging from evaluating the utility of generating subquestions before answering a question, to analyzing the value of rationales and explanations, to investigating the importance of different parts of speech, and uncovering dataset gender bias.

dataset, mdl, subquestion, (16 more...)

arXiv.org Artificial Intelligence

2103.03872

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(16 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.66)

Add feedback

Semantics Altering Modifications for Evaluating Comprehension in Machine Reading

Schlegel, Viktor, Nenadic, Goran, Batista-Navarro, Riza

arXiv.org Artificial IntelligenceDec-7-2020

Advances in NLP have yielded impressive results for the task of machine reading comprehension (MRC), with approaches having been reported to achieve performance comparable to that of humans. In this paper, we investigate whether state-of-the-art MRC models are able to correctly process Semantics Altering Modifications (SAM): linguistically-motivated phenomena that alter the semantics of a sentence while preserving most of its lexical surface form. We present a method to automatically generate and align challenge sets featuring original and altered examples. We further propose a novel evaluation methodology to correctly assess the capability of MRC systems to process these examples independent of the data they were optimised on, by discounting for effects introduced by domain shift. In a large-scale empirical study, we apply the methodology in order to evaluate extractive MRC models with regard to their capability to correctly process SAM-enriched data. We comprehensively cover 12 different state-of-the-art neural architecture configurations and four training datasets and find that -- despite their well-known remarkable performance -- optimised models consistently struggle to correctly process semantically altered data.

baseline, dataset, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2012.04056

Country: Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.89)
Leisure & Entertainment > Sports > Soccer (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback