Li, Ella
A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
Feng, Xiachong, Dou, Longxu, Li, Ella, Wang, Qinghao, Wang, Haochuan, Guo, Yu, Ma, Chang, Kong, Lingpeng
Game-theoretic scenarios have become pivotal in evaluating the social intelligence of Large Language Model (LLM)-based social agents. While numerous studies have explored these agents in such settings, a comprehensive survey summarizing the current progress is still lacking. To address this gap, we systematically review existing research on LLM-based social agents within game-theoretic scenarios. Our survey organizes the findings into three core components: Game Framework, Social Agent, and Evaluation Protocol. The game framework encompasses diverse game scenarios, ranging from choice-focusing to communication-focusing games. The social agent component examines agents' preferences, beliefs, and reasoning abilities. The evaluation protocol covers both game-agnostic and game-specific metrics for assessing agent performance. By reflecting on current research and identifying future research directions, this survey provides insights to advance the development and evaluation of social agents in game-theoretic scenarios.
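A minimal illustration (not taken from the survey) of the kind of "choice-focusing" game used to probe such agents: a one-shot Prisoner's Dilemma, where an agent's cooperation rate is a typical game-specific metric.

```python
# Two-player Prisoner's Dilemma payoff matrix.
# Payoffs are (row player, column player); "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def best_response(opponent_action: str) -> str:
    """Return the row player's payoff-maximizing action against a fixed opponent action."""
    return max(["C", "D"], key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Defection dominates: it is the best response to either opponent action,
# which is why an LLM agent's willingness to cooperate anyway is informative.
print(best_response("C"), best_response("D"))
```

The interesting question for social agents is not the equilibrium itself but whether, and under what prompts and beliefs about the opponent, an LLM deviates from it.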
Distilling an End-to-End Voice Assistant Without Instruction Training Data
Held, William, Li, Ella, Ryan, Michael, Shi, Weiyan, Zhang, Yanzhe, Yang, Diyi
Voice assistants, such as Siri and Google Assistant, typically model audio and text separately, resulting in lost speech information and increased complexity. Recent efforts to address this with end-to-end Speech Large Language Models (LLMs) trained with supervised finetuning (SFT) have led to models "forgetting" capabilities from text-only LLMs. Our work proposes an alternative paradigm for training Speech LLMs without instruction data, using the response of a text-only LLM to transcripts as self-supervision. Importantly, this process can be performed without annotated responses. We show that our Distilled Voice Assistant (DiVA) generalizes to Spoken Question Answering, Classification, and Translation. Furthermore, we show that DiVA better meets user preferences, achieving a 72% win rate compared with state-of-the-art models like Qwen 2 Audio, despite using >100x less training compute.
(Note: all authors besides first and last are sorted alphabetically.)
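The self-supervision idea above can be sketched as a distillation objective: the frozen text-only LLM's output distribution on the *transcript* serves as the target for the trainable speech model on the *audio*. This is a minimal sketch, not the authors' implementation; `teacher_logits` and `student_logits` are hypothetical stand-ins for real model outputs.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits: np.ndarray, teacher_logits: np.ndarray) -> float:
    """Mean per-position KL(teacher || student) — one plausible distillation
    objective. The teacher is the frozen text-only LLM conditioned on the
    transcript; the student is the speech model conditioned on raw audio.
    No human-annotated responses are needed."""
    p = softmax(teacher_logits)               # frozen text-LLM targets
    log_q = np.log(softmax(student_logits))   # trainable speech-LLM predictions
    return float((p * (np.log(p) - log_q)).sum(axis=-1).mean())

# Sanity check: when the student exactly matches the teacher, the loss is zero.
t = np.random.default_rng(0).normal(size=(4, 10))
print(distill_loss(t, t))
```

Because the teacher's own responses define the target, the text LLM's existing instruction-following behavior is preserved rather than overwritten, which is the failure mode the abstract attributes to SFT-only training.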