AITopics | Rosenbaum, Andy

Rosenbaum, Andy

CALICO: Conversational Agent Localization via Synthetic Data Generation

Rosenbaum, Andy, Kharazmi, Pegah, Banijamali, Ershad, Zeng, Lu, DiPersio, Christopher, Wei, Pan, Oz, Gokmen, Chung, Clement, Owczarzak, Karolina, Triefenbach, Fabian, Hamza, Wael

arXiv.org Artificial Intelligence, Dec-6-2024

We present CALICO, a method to fine-tune Large Language Models (LLMs) to localize conversational agent training data from one language to another. For slots (named entities), CALICO supports three operations: verbatim copy, literal translation, and localization, i.e., generating slot values more appropriate in the target language, such as city and airport names located in countries where the language is spoken. Furthermore, we design an iterative filtering mechanism to discard noisy generated samples, which we show boosts the performance of the downstream conversational agent. To demonstrate the effectiveness of CALICO, we build and release a new human-localized (HL) version of the MultiATIS++ travel information test set in 8 languages. Compared to the original human-translated (HT) version of the test set, we show that our new HL version is more challenging. We also show that CALICO outperforms the state-of-the-art LINGUIST (which relies on literal slot translation out of context) both on the HT case, where CALICO generates more accurate slot translations, and on the HL case, where CALICO generates localized slots that are closer to the HL test set.

computational linguistic, large language model, natural language

arXiv:2412.05388

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.40)

Industry: Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.90)

GeMQuAD: Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning

Namboori, Amani, Mangale, Shivam, Rosenbaum, Andy, Soltan, Saleh

arXiv.org Artificial Intelligence, Apr-14-2024

The emergence of Large Language Models (LLMs) with capabilities like In-Context Learning (ICL) has ushered in new possibilities for data generation across various domains while minimizing the need for extensive data collection and modeling techniques. Researchers have explored ways to use this generated synthetic data to optimize smaller student models for reduced deployment costs and lower latency in downstream tasks. However, ICL-generated data often suffers from low quality because task specificity is limited when only a few examples are used in ICL. In this paper, we propose GeMQuAD, a semi-supervised learning approach that extends the WeakDAP framework and is applied to a dataset generated through ICL with just one example in the target language using the AlexaTM 20B Seq2Seq LLM. Through our approach, we iteratively identify high-quality data to enhance model performance, especially in low-resource multilingual settings for the Extractive Question Answering task. Our framework outperforms the machine translation-augmented model by 0.22/1.68 F1/EM (Exact Match) points for Hindi and 0.82/1.37 F1/EM points for Spanish on the MLQA dataset, and it surpasses a model trained on an English-only dataset by 5.05/6.50 F1/EM points for Hindi and 3.81/3.69 F1/EM points for Spanish on the same dataset. Notably, our approach uses a pre-trained LLM for generation with no fine-tuning (FT), using just a single annotated example in ICL to generate data, which provides a cost-effective development process.
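For readers unfamiliar with the F1/EM figures quoted above, the following is a minimal sketch of how these two extractive QA metrics are commonly computed for a single prediction. It is not the evaluation code used by the authors: the function names `exact_match` and `token_f1` are hypothetical, and the normalization here (lowercasing and whitespace tokenization) is an assumption that is simpler than the language-specific normalization an official MLQA evaluation would apply.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the answers match after simple normalization, else 0.0.

    Assumption: normalization is only lowercasing and trimming whitespace.
    """
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer span.

    Assumption: tokens are obtained by whitespace splitting.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the average of these per-example values, expressed as percentages, so a gain of "0.22 F1 points" corresponds to a 0.22 percentage-point difference in that average.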

artificial intelligence, large language model, natural language

arXiv:2404.09163

Country:

Europe (0.46)
North America > United States (0.28)

Genre:

Research Report (0.82)
Workflow (0.68)

Industry: Education (0.37)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

Soltan, Saleh, Rosenbaum, Andy, Falke, Tobias, Lu, Qin, Rumshisky, Anna, Hamza, Wael

arXiv.org Artificial Intelligence, Jun-14-2023

Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each have advantages; however, training both model types from scratch is computationally expensive. We explore recipes to improve pre-training efficiency by initializing one model from the other. (1) Extracting the encoder from a seq2seq model, we show that it underperforms a Masked Language Modeling (MLM) encoder, particularly on sequence labeling tasks. Variations of masking during seq2seq training, reducing the decoder size, and continuing with a small amount of MLM training do not close the gap. (2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match the task performance of a from-scratch seq2seq model. Overall, this two-stage approach is an efficient recipe to obtain both a multilingual encoder and a seq2seq model, matching the performance of training each model from scratch while reducing the total compute cost by 27%.

artificial intelligence, machine learning, natural language

arXiv:2306.08756

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

PLACES: Prompting Language Models for Social Conversation Synthesis

Chen, Maximillian, Papangelis, Alexandros, Tao, Chenyang, Kim, Seokhwan, Rosenbaum, Andy, Liu, Yang, Yu, Zhou, Hakkani-Tur, Dilek

arXiv.org Artificial Intelligence, Feb-16-2023

Collecting high-quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompting large language models. In this work, we use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting. We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations. These include human evaluation of various dimensions of conversation quality directly on the synthesized conversations, and interactive human evaluation of chatbots fine-tuned on the synthetically generated dataset. We additionally demonstrate that this prompting approach generalizes to multi-party conversations, providing the potential to create new synthetic data for multi-party tasks. Our synthetic multi-party conversations were rated more favorably across all measured dimensions compared to conversation excerpts sampled from a human-collected multi-party dataset.
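The idea of using a few expert-written conversations as in-context examples can be illustrated with a small prompt-assembly sketch. This is only an illustration under stated assumptions, not the prompt format used in the paper: the `Topic:` header, the speaker names, the blank-line separator, and the helper names `format_conversation` and `build_prompt` are all hypothetical.

```python
from typing import List, Tuple

# A conversation is a list of (speaker, utterance) pairs.
Conversation = List[Tuple[str, str]]


def format_conversation(topic: str, turns: Conversation) -> str:
    """Render one example conversation as plain text (hypothetical layout)."""
    lines = [f"Topic: {topic}"]
    lines += [f"{speaker}: {utterance}" for speaker, utterance in turns]
    return "\n".join(lines)


def build_prompt(
    examples: List[Tuple[str, Conversation]],
    new_topic: str,
    first_speaker: str = "Alice",
) -> str:
    """Concatenate in-context examples, then open a new conversation.

    The prompt ends with the first speaker's label so that a language
    model would continue by writing that speaker's first utterance.
    """
    blocks = [format_conversation(topic, turns) for topic, turns in examples]
    blocks.append(f"Topic: {new_topic}\n{first_speaker}:")
    return "\n\n".join(blocks)


# Usage: one expert-written example, then a new topic to synthesize.
examples = [
    ("pets", [("Alice", "Do you have a dog?"), ("Bob", "Yes, two.")]),
]
prompt = build_prompt(examples, "travel")
```

The resulting string would be sent to a language model of the reader's choice; the model call itself is deliberately left out because it depends on the provider.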

artificial intelligence, chatbot, natural language

arXiv:2302.03269

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Sports (0.93)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Chen, Maximillian, Papangelis, Alexandros, Tao, Chenyang, Rosenbaum, Andy, Kim, Seokhwan, Liu, Yang, Yu, Zhou, Hakkani-Tur, Dilek

arXiv.org Artificial Intelligence, Nov-2-2022

Dialogue understanding tasks often require abundant annotated data to achieve good performance, which presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by prompting large pre-trained language models and present a novel approach that iterates on augmentation quality by applying weakly-supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. Models fine-tuned on our augmented data mixed with few-shot ground truth data are able to approach or surpass existing state-of-the-art performance on both datasets. For DailyDialog specifically, using 10% of the ground truth data, we outperform the current state-of-the-art model, which uses 100% of the data.

artificial intelligence, machine learning, natural language

arXiv:2210.14169

Country: Asia (0.28)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)
