Kwon, Minae
Evaluating Human-Language Model Interaction
Lee, Mina, Srivastava, Megha, Hardy, Amelia, Thickstun, John, Durmus, Esin, Paranjape, Ashwin, Gerard-Ursin, Ines, Li, Xiang Lisa, Ladhak, Faisal, Rong, Frieda, Wang, Rose E., Kwon, Minae, Park, Joon Sung, Cao, Hancheng, Lee, Tony, Bommasani, Rishi, Bernstein, Michael, Liang, Percy
Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation.
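To make the distinction concrete, here is a minimal Python sketch (not the released HALIE code; all field and metric names are illustrative) contrasting a non-interactive record, which keeps only the final output and a third-party score, with an interactive trace that also logs the process and first-person ratings such as enjoyment and ownership.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class NonInteractiveRecord:
    """Standard evaluation: only the final output and a third-party score."""
    prompt: str
    final_output: str
    third_party_quality: float

@dataclass
class InteractionEvent:
    timestamp: float
    actor: str   # "user" or "system"
    action: str  # e.g. "query", "accept_suggestion", "edit"
    text: str

@dataclass
class InteractiveTrace:
    """Interactive evaluation: the process, the output, and first-person ratings."""
    task: str                                                # e.g. "crossword", "summarization"
    events: List[InteractionEvent] = field(default_factory=list)
    final_output: str = ""
    survey: Dict[str, float] = field(default_factory=dict)   # e.g. {"enjoyment": 4, "ownership": 3}

def process_metrics(trace: InteractiveTrace) -> Dict[str, float]:
    """Process-level metrics computed over the interaction, not just the output."""
    queries = sum(1 for e in trace.events if e.action == "query")
    edits = sum(1 for e in trace.events if e.action == "edit")
    return {"num_queries": float(queries), "num_edits": float(edits)}
```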
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections
Zha, Lihan, Cui, Yuchen, Lin, Li-Heng, Kwon, Minae, Arenas, Montserrat Gonzalez, Zeng, Andy, Xia, Fei, Sadigh, Dorsa
Today's robot policies exhibit subpar performance when faced with the challenge of generalizing to novel environments. Human corrective feedback is a crucial form of guidance to enable such generalization. However, adapting to and learning from online human corrections is a non-trivial endeavor: not only do robots need to remember human feedback over time to retrieve the right information in new settings and reduce the intervention rate, but they also need to respond to feedback that ranges from arbitrary corrections about high-level human preferences to low-level adjustments to skill parameters. In this work, we present Distillation and Retrieval of Online Corrections (DROC), a large language model (LLM)-based system that can respond to arbitrary forms of language feedback, distill generalizable knowledge from corrections, and retrieve relevant past experiences based on textual and visual similarity to improve performance in novel settings. DROC is able to respond to a sequence of online language corrections that address failures in both high-level task plans and low-level skill primitives. We demonstrate that DROC effectively distills the relevant information from the sequence of online corrections into a knowledge base and retrieves that knowledge in settings with new task or object instances. DROC outperforms other techniques that directly generate robot code via LLMs, requiring only half as many total corrections in the first round and little to no corrections after two iterations. We show further results, videos, prompts, and code at https://sites.google.com/stanford.edu/droc.
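A minimal sketch of the distill-and-retrieve pattern described above (this is not the released DROC code; it only covers the textual side of retrieval, and `embed_text` is a hypothetical stand-in for whatever text or visual encoder a real system would use):

```python
import numpy as np

def embed_text(text: str) -> np.ndarray:
    """Placeholder encoder: a real system would call a text or image embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

class CorrectionKnowledgeBase:
    """Stores knowledge distilled from online language corrections."""
    def __init__(self):
        self.entries = []  # each entry: {"knowledge": str, "key": np.ndarray}

    def distill(self, correction: str, task_context: str) -> None:
        # In DROC an LLM decides which part of a correction is generalizable
        # (e.g. a lasting preference vs. a one-off pose adjustment); here we
        # simply store the raw correction keyed by its context embedding.
        knowledge = f"In context '{task_context}': {correction}"
        self.entries.append({"knowledge": knowledge,
                             "key": embed_text(task_context + " " + correction)})

    def retrieve(self, new_context: str, k: int = 3) -> list:
        """Return the k pieces of past knowledge most similar to the new setting."""
        query = embed_text(new_context)
        scored = sorted(self.entries,
                        key=lambda e: float(query @ e["key"]), reverse=True)
        return [e["knowledge"] for e in scored[:k]]
```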
Toward Grounded Social Reasoning
Kwon, Minae, Hu, Hengyuan, Myers, Vivek, Karamcheti, Siddharth, Dragan, Anca, Sadigh, Dorsa
Consider a robot tasked with tidying a desk with a meticulously constructed Lego sports car. A human may recognize that it is not socially appropriate to disassemble the sports car and put it away as part of the "tidying". How can a robot reach that conclusion? Although large language models (LLMs) have recently been used to enable social reasoning, grounding this reasoning in the real world has been challenging. To reason in the real world, robots must go beyond passively querying LLMs and *actively gather information from the environment* that is required to make the right decision. For instance, after detecting that there is an occluded car, the robot may need to actively perceive the car to know whether it is an advanced model car made out of Legos or a toy car built by a toddler. We propose an approach that leverages an LLM and vision language model (VLM) to help a robot actively perceive its environment to perform grounded social reasoning. To evaluate our framework at scale, we release the MessySurfaces dataset which contains images of 70 real-world surfaces that need to be cleaned. We additionally illustrate our approach with a robot on 2 carefully designed surfaces. We find an average 12.9% improvement on the MessySurfaces benchmark and an average 15% improvement on the robot experiments over baselines that do not use active perception. The dataset, code, and videos of our approach can be found at https://minaek.github.io/groundedsocialreasoning.
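As a rough illustration of the active-perception loop sketched above (a hedged sketch, not the released code; `call_llm`, `call_vlm`, and `take_closeup_image` are hypothetical stubs standing in for an LLM API, a VLM API, and the robot's camera):

```python
def call_llm(prompt: str) -> str:
    """Stub for an LLM API call; replace with a real model query."""
    return "Is the object a carefully built model or a scattered toy?"

def call_vlm(image, question: str) -> str:
    """Stub for a VLM API call over an image; replace with a real model query."""
    return "It looks like a carefully built Lego model."

def take_closeup_image(object_name: str):
    """Stub for the robot's camera; returns a placeholder image handle."""
    return f"<closeup of {object_name}>"

def grounded_cleanup_action(scene_description: str, object_name: str) -> str:
    # 1. The LLM proposes a question whose answer it needs before acting.
    question = call_llm(
        f"Scene: {scene_description}\n"
        f"Before tidying the '{object_name}', what single question about it "
        f"would help decide the socially appropriate action?")
    # 2. Active perception: take a close-up image and ask the VLM the question.
    image = take_closeup_image(object_name)
    answer = call_vlm(image, question)
    # 3. The LLM grounds its decision in the observed answer.
    return call_llm(
        f"Scene: {scene_description}\nObject: {object_name}\n"
        f"Q: {question}\nA: {answer}\n"
        f"Choose one action: leave as is, relocate, or put away.")
```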
Reward Design with Language Models
Kwon, Minae, Xie, Sang Michael, Bullard, Kalesha, Sadigh, Dorsa
Reward design in reinforcement learning (RL) is challenging since human notions of desired behavior can be difficult to specify via reward functions or may require many expert demonstrations. Can we instead cheaply design rewards using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function, where the user provides a textual prompt containing a few examples (few-shot) or a description (zero-shot) of the desired behavior. Our approach leverages this proxy reward function in an RL framework. Specifically, users specify a prompt once at the beginning of training. During training, the LLM evaluates an RL agent's behavior against the desired behavior described by the prompt and outputs a corresponding reward signal. The RL agent then uses this reward to update its behavior. We evaluate whether our approach can train agents aligned with user objectives in the Ultimatum Game, matrix games, and the DealOrNoDeal negotiation task. In all three tasks, we show that RL agents trained with our framework are well aligned with the user's objectives and outperform RL agents trained with reward functions learned via supervised learning.
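A minimal sketch of the LLM-as-proxy-reward idea described above (the prompt wording and the `query_llm` stub are illustrative, not the paper's exact prompts or code):

```python
def query_llm(prompt: str) -> str:
    """Stub for a call to an LLM such as GPT-3; replace with a real API call."""
    return "Yes"

# The user writes this once, before training (a zero-shot description here;
# few-shot examples could be appended instead).
USER_PROMPT = ("The user wants an agent that negotiates fairly: it should accept "
               "roughly even splits of the resources and reject very unequal ones.\n")

def llm_proxy_reward(episode_summary: str) -> float:
    """Return 1.0 if the LLM judges the episode aligned with the user's prompt, else 0.0."""
    prompt = (USER_PROMPT
              + f"Episode: {episode_summary}\n"
              + "Did the agent behave as the user intended? Answer Yes or No: ")
    answer = query_llm(prompt).strip().lower()
    return 1.0 if answer.startswith("yes") else 0.0

# During RL training, this scalar would replace a hand-designed reward, e.g.:
# reward = llm_proxy_reward(describe(episode)); agent.update(episode, reward)
```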
Targeted Data Acquisition for Evolving Negotiation Agents
Kwon, Minae, Karamcheti, Siddharth, Cuellar, Mariano-Florentino, Sadigh, Dorsa
Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Yet current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on, limiting their capacity to fashion an adaptive response balancing self-interest and cooperation. For this reason, we find that these agents can achieve either high utility or cooperation, but not both. To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle.
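A hedged sketch of the targeted data acquisition idea (not the paper's implementation: the novelty-based selection criterion, the `expert_oracle` interface, and the representation of episodes as lists of utterance strings are all assumptions made for illustration):

```python
def novelty(episode, seen_utterances) -> float:
    """Fraction of the episode's utterances not seen in earlier training data."""
    return sum(1 for u in episode if u not in seen_utterances) / max(len(episode), 1)

def targeted_acquisition(candidate_episodes, expert_oracle, budget, seen_utterances):
    """Ask the expert oracle to annotate only the `budget` most novel episodes."""
    ranked = sorted(candidate_episodes,
                    key=lambda ep: novelty(ep, seen_utterances), reverse=True)
    return [(ep, expert_oracle(ep)) for ep in ranked[:budget]]

# Usage sketch:
# annotations = targeted_acquisition(rollouts, expert, budget=10, seen_utterances=seen)
# The annotated episodes are then added to the data used to update the negotiation agent.
```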