Lin, Zhaojiang
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
Li, Zekun, Chen, Zhiyu Zoey, Ross, Mike, Huber, Patrick, Moon, Seungwhan, Lin, Zhaojiang, Dong, Xin Luna, Sagar, Adithya, Yan, Xifeng, Crook, Paul A.
Large language models (LLMs) are increasingly prevalent in conversational systems due to their advanced understanding and generative capabilities in general contexts. However, their effectiveness in task-oriented dialogues (TOD), which require not only response generation but also effective dialogue state tracking (DST) within specific tasks and domains, remains unsatisfactory. In this work, we propose FnCTOD, a novel approach for solving DST with LLMs through function calling. This method improves zero-shot DST, allowing adaptation to diverse domains without extensive data collection or model tuning. Our experimental results demonstrate that our approach achieves exceptional performance with both modestly sized open-source and proprietary LLMs: with in-context prompting it enables various 7B or 13B parameter models to surpass the previous state-of-the-art (SOTA) achieved by ChatGPT, and it improves ChatGPT's performance, beating the SOTA by 5.6% Avg. JGA. Individual model results for GPT-3.5 and GPT-4 are boosted by 4.8% and 14%, respectively. We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip modestly sized models, specifically a 13B parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT while maintaining their chat capabilities. We plan to open-source our experimental code and models.
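To make the function-calling framing concrete, here is a minimal sketch of how a dialogue state can be read off a model's function-call arguments. The domain schema, slot names, and prompt format below are illustrative assumptions, not the paper's exact specification:

```python
import json

# Hypothetical function schema for one domain; in a FnCTOD-style setup,
# each domain's slots are exposed to the LLM as function parameters.
RESTAURANT_FN = {
    "name": "find_restaurant",
    "description": "Track the user's restaurant booking constraints.",
    "parameters": {
        "type": "object",
        "properties": {
            "area": {"type": "string", "description": "Part of town"},
            "food": {"type": "string", "description": "Cuisine type"},
            "pricerange": {"type": "string", "enum": ["cheap", "moderate", "expensive"]},
        },
    },
}

def build_prompt(dialogue_history: list[str]) -> list[dict]:
    """Assemble a chat prompt that asks the model to 'call' the domain
    function; the generated arguments are the predicted dialogue state."""
    system = (
        "You are a task-oriented assistant. When the user states a constraint, "
        f"call the function defined here: {json.dumps(RESTAURANT_FN)}"
    )
    messages = [{"role": "system", "content": system}]
    for i, turn in enumerate(dialogue_history):
        messages.append({"role": "user" if i % 2 == 0 else "assistant", "content": turn})
    return messages

def parse_state(function_call_arguments: str) -> dict:
    # e.g. '{"area": "centre", "food": "thai"}' maps directly onto
    # slot-value pairs; empty values are dropped.
    return {k: v for k, v in json.loads(function_call_arguments).items() if v}
```

The key design choice this illustrates is that each domain's slots become function parameters, so state tracking reduces to parsing the model's generated arguments.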
Continual Dialogue State Tracking via Example-Guided Question Answering
Cho, Hyundong, Madotto, Andrea, Lin, Zhaojiang, Chandu, Khyathi Raghavi, Kottur, Satwik, Xu, Jing, May, Jonathan, Sankar, Chinnadhurai
Dialogue systems are frequently updated to accommodate new services, but naively updating them by continually training on data for new services results in diminishing performance on previously learnt services. Motivated by the insight that dialogue state tracking (DST), a crucial component of dialogue systems that estimates the user's goal as a conversation proceeds, is a simple natural language understanding task, we propose reformulating it as a bundle of granular example-guided question answering tasks to minimize the task shift between services and thus benefit continual learning. Our approach alleviates service-specific memorization and teaches a model to contextualize the given question and example to extract the necessary information from the conversation. We find that a model with just 60M parameters can achieve a significant boost by learning to learn from in-context examples retrieved by a retriever trained to identify turns with similar dialogue state changes. Combined with dialogue-level memory replay, our approach attains state-of-the-art performance on DST continual learning metrics without relying on any complex regularization or parameter expansion methods.
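A minimal sketch of the example-guided QA reformulation, assuming a simple slot-question template and one retrieved in-context example (the prompt format is hypothetical, not the paper's exact template):

```python
# Each slot becomes one granular question, optionally preceded by a
# retrieved example turn with a similar dialogue state change.
def make_qa_prompt(dialogue: str, slot: str, example: tuple[str, str, str] | None) -> str:
    prompt = ""
    if example is not None:
        ex_dialogue, ex_slot, ex_answer = example
        prompt += f"Dialogue: {ex_dialogue}\nQ: What is the value of '{ex_slot}'?\nA: {ex_answer}\n\n"
    prompt += f"Dialogue: {dialogue}\nQ: What is the value of '{slot}'?\nA:"
    return prompt

example = ("User: I want a cheap hotel in the north.", "hotel-pricerange", "cheap")
print(make_qa_prompt("User: Book me an expensive restaurant downtown.",
                     "restaurant-pricerange", example))
```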
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Moon, Seungwhan, Madotto, Andrea, Lin, Zhaojiang, Nagarajan, Tushar, Smith, Matt, Jain, Shashank, Yeh, Chun-Fu, Murugesan, Prakash, Heidari, Peyman, Liu, Yue, Srinet, Kavya, Damavandi, Babak, Kumar, Anuj
We present Any-Modality Augmented Language Model (AnyMAL), a unified model that reasons over diverse input modality signals (i.e. text, image, video, audio, IMU motion sensor), and generates textual responses. AnyMAL inherits the powerful text-based reasoning abilities of the state-of-the-art LLMs including LLaMA-2 (70B), and converts modality-specific signals to the joint textual space through a pre-trained aligner module. To further strengthen the multimodal LLM's capabilities, we fine-tune the model with a multimodal instruction set manually collected to cover diverse topics and tasks beyond simple QAs. We conduct comprehensive empirical analysis comprising both human and automatic evaluations, and demonstrate state-of-the-art performance on various multimodal tasks.
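A toy PyTorch sketch of the aligner idea: modality-encoder features are resampled into a fixed number of "soft tokens" and projected into the LLM's embedding space. The attention-resampler design and all dimensions are illustrative assumptions, not AnyMAL's released architecture:

```python
import torch
import torch.nn as nn

class ModalityAligner(nn.Module):
    """Stand-in for a pre-trained aligner module: maps frozen
    modality-encoder features to a short sequence of vectors in the
    LLM's token-embedding space. Sizes are illustrative."""
    def __init__(self, feat_dim=1024, llm_dim=4096, num_tokens=32):
        super().__init__()
        self.query = nn.Parameter(torch.randn(num_tokens, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.proj = nn.Linear(feat_dim, llm_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, seq, feat_dim) from e.g. an image or audio encoder
        q = self.query.unsqueeze(0).expand(feats.size(0), -1, -1)
        pooled, _ = self.attn(q, feats, feats)   # resample to num_tokens
        return self.proj(pooled)                 # (batch, num_tokens, llm_dim)

soft_tokens = ModalityAligner()(torch.randn(2, 196, 1024))  # -> (2, 32, 4096)
```

The resulting soft tokens can be prepended to the LLM's text embeddings, which is what lets a frozen text-only LLM condition on non-text signals.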
Introducing Semantics into Speech Encoders
Xu, Derek, Dong, Shuyan, Wang, Changhan, Kim, Suyoun, Lin, Zhaojiang, Shrivastava, Akshat, Li, Shang-Wen, Tseng, Liang-Hsuan, Baevski, Alexei, Lin, Guan-Ting, Lee, Hung-yi, Sun, Yizhou, Wang, Wei
Recent studies find that existing self-supervised speech encoders contain primarily acoustic rather than semantic information. As a result, pipelined systems that feed supervised automatic speech recognition (ASR) output into a large language model (LLM) achieve state-of-the-art results on semantic spoken language tasks by utilizing the LLM's rich semantic representations. These systems come at the cost of labeled audio transcriptions, which are expensive and time-consuming to obtain. We propose a task-agnostic, unsupervised way of incorporating semantic information from LLMs into self-supervised speech encoders without labeled audio transcriptions. By introducing semantics, we improve the spoken language understanding performance of existing speech encoders by over 10% on intent classification, with modest gains in named entity resolution and slot filling, and improve spoken question answering FF1 score by over 2%. Our unsupervised approach achieves performance similar to that of supervised methods trained on over 100 hours of labeled audio transcripts, demonstrating the feasibility of unsupervised semantic augmentation of existing speech encoders.
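One plausible way to realize such semantic distillation, sketched below under the assumption of a cosine-similarity objective between pooled speech representations and LLM sentence embeddings of unsupervised transcripts (the paper's exact objective may differ):

```python
import torch
import torch.nn.functional as F

def semantic_distill_loss(speech_repr: torch.Tensor, llm_repr: torch.Tensor) -> torch.Tensor:
    """Assumed training signal: pull the pooled speech-encoder output
    toward an LLM sentence embedding obtained without human labels,
    e.g. from an unsupervised-ASR transcript of the same utterance."""
    speech_vec = speech_repr.mean(dim=1)  # (batch, d), pooled over time
    return 1.0 - F.cosine_similarity(speech_vec, llm_repr, dim=-1).mean()

loss = semantic_distill_loss(torch.randn(4, 50, 768), torch.randn(4, 768))
```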
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
Madotto, Andrea, Lin, Zhaojiang, Winata, Genta Indra, Fung, Pascale
Learning to converse using only a few examples is a great challenge in conversational AI. The current best conversational models, which are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL), are language models (LMs) fine-tuned on large conversational datasets. Training these models is expensive, both in terms of computational resources and time, and it is hard to keep them up to date with new conversational skills. A simple yet unexplored solution is prompt-based few-shot learning (Brown et al. 2020), which does not require gradient-based fine-tuning but instead uses a few examples in the LM context as the only source of learning. In this paper, we explore prompt-based few-shot learning in dialogue tasks. We benchmark LMs of different sizes on nine response generation tasks, which include four knowledge-grounded tasks, a task-oriented generation task, three open-chat tasks, and controlled stylistic generation, and on five conversational parsing tasks, which include dialogue state tracking, graph path generation, persona information extraction, document retrieval, and internet query generation. The largest currently released LM (GPT-J-6B), using prompt-based few-shot learning and thus requiring no training, achieves performance competitive with fully trained state-of-the-art models. Moreover, we propose a novel prompt-based few-shot classifier, which also requires no fine-tuning, to select the most appropriate prompt given a dialogue history. Finally, by combining the power of prompt-based few-shot learning and a Skill Selector, we create an end-to-end chatbot named the Few-Shot Bot (FSB), which automatically selects the most appropriate conversational skill, queries different knowledge bases or the internet, and uses the retrieved knowledge to generate a human-like response, all using only a few dialogue examples per skill.
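A minimal sketch of the few-shot prompting format, where a handful of serialized dialogues per skill form the LM context and the model continues the final, incomplete dialogue (the "User:"/"Bot:" shot format here is illustrative, not the paper's exact template):

```python
def fsb_prompt(shots: list[list[str]], history: list[str]) -> str:
    """Serialize example dialogues plus the current history into one
    prompt; the LM's continuation after the final 'Bot:' is the response."""
    def render(turns):
        return "\n".join(f"{'User' if i % 2 == 0 else 'Bot'}: {t}" for i, t in enumerate(turns))
    blocks = [render(d) for d in shots] + [render(history) + "\nBot:"]
    return "\n\n".join(blocks)

shots = [["Hi!", "Hello, how can I help?"], ["Any good sci-fi books?", "Try 'Dune'."]]
print(fsb_prompt(shots, ["What should I read next?"]))
```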
Language Models are Few-shot Multilingual Learners
Winata, Genta Indra, Madotto, Andrea, Lin, Zhaojiang, Liu, Rosanne, Yosinski, Jason, Fung, Pascale
General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find that the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive with existing state-of-the-art cross-lingual models.
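A small sketch of the cross-lingual in-context setup: English-labelled examples in the context, followed by a non-English test sample to classify (the prompt template and intent labels are assumptions for illustration):

```python
def crosslingual_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """English (sentence, label) shots followed by an unlabelled query;
    the LM is expected to continue with the label."""
    lines = [f"Sentence: {text}\nIntent: {label}" for text, label in examples]
    lines.append(f"Sentence: {query}\nIntent:")
    return "\n\n".join(lines)

english_shots = [("Play some jazz music.", "play_music"),
                 ("What's the weather tomorrow?", "weather_query")]
print(crosslingual_prompt(english_shots, "Mets de la musique classique."))  # French query
```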
CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System
Ishii, Etsuko, Xu, Yan, Winata, Genta Indra, Lin, Zhaojiang, Madotto, Andrea, Liu, Zihan, Xu, Peng, Fung, Pascale
Information-seeking dialogue systems, including knowledge identification and response generation, aim to respond to users with fluent, coherent, and informative responses based on users' needs, which remains a challenging task. To tackle this challenge, we utilize data augmentation methods and several training techniques with pre-trained language models to learn the general pattern of the task and thus achieve promising performance. In the DialDoc21 competition, our system achieved a 74.95 F1 score and a 60.74 Exact Match score in subtask 1, and a 37.72 SacreBLEU score in subtask 2. Empirical analysis is provided to explain the effectiveness of our approaches.
Continual Learning in Task-Oriented Dialogue Systems
Madotto, Andrea, Lin, Zhaojiang, Zhou, Zhenpeng, Moon, Seungwhan, Crook, Paul, Liu, Bing, Yu, Zhou, Cho, Eunjoon, Wang, Zhiguang
Continual learning in task-oriented dialogue systems allows us to add new domains and functionalities over time without incurring the high cost of retraining the whole system. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings: intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well, but both achieve inferior performance to the multi-task learning baseline, in which all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.
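A minimal PyTorch sketch of a residual adapter in the spirit of the proposed architectural method; the bottleneck design and dimensions are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Bottleneck adapter: a small task-specific residual block inserted
    into a frozen backbone, so each new domain adds a few parameters
    instead of overwriting previously learned ones."""
    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))  # residual connection

# One adapter per domain; the shared transformer stays frozen.
adapters = nn.ModuleDict({"hotel": ResidualAdapter(), "taxi": ResidualAdapter()})
out = adapters["hotel"](torch.randn(2, 16, 768))
```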
Plug-and-Play Conversational Models
Madotto, Andrea, Ishii, Etsuko, Lin, Zhaojiang, Dathathri, Sumanth, Fung, Pascale
There has been considerable progress towards conversational models that generate coherent and fluent responses; however, this often involves training large language models on large dialogue datasets, such as Reddit. These large conversational models provide little control over the generated responses, and this control is further limited by the absence of annotated conversational datasets for attribute-specific generation that could be used for fine-tuning the model. In this paper, we first propose and evaluate plug-and-play methods for controllable response generation, which do not require dialogue-specific datasets and do not rely on fine-tuning a large model. While effective, the decoding procedure induces considerable computational overhead, rendering the conversational model unsuitable for interactive usage. To overcome this, we introduce an approach that requires no further computation at decoding time and no fine-tuning of a large language model. We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes, while remaining fluent.
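The decoding-time-free recipe can be sketched as an offline distillation pipeline; the function names and data flow below are assumptions meant only to illustrate the idea of moving the expensive plug-and-play decoding offline:

```python
# Assumed pipeline: 1) use a costly plug-and-play decoder offline to
# synthesize attribute-specific responses, 2) fit a small model (e.g. an
# adapter) on that synthetic data, 3) decode normally at inference time.
def distill_attribute(pplm_generate, train_adapter, prompts, attribute):
    synthetic = [(p, pplm_generate(p, attribute)) for p in prompts]  # offline, slow
    return train_adapter(synthetic)                                  # cheap at inference

adapter = distill_attribute(
    pplm_generate=lambda p, a: f"[{a}] response to: {p}",  # stand-in generator
    train_adapter=lambda data: {"n_examples": len(data)},  # stand-in trainer
    prompts=["How was your day?"], attribute="positive",
)
```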
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Lin, Zhaojiang, Madotto, Andrea, Winata, Genta Indra, Fung, Pascale
In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework that allows us to plug-and-play pre-trained seq2seq models and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to "carry over" the old dialogue state to the new one, we introduce Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length. We instantiate our learning framework with two pre-trained backbones, T5 and BART, and evaluate them on MultiWOZ. Extensive experiments demonstrate that: 1) our systems establish new state-of-the-art results on end-to-end response generation, 2) MinTL-based systems are more robust than baseline methods in the low-resource setting, achieving competitive results with only 20% of the training data, and 3) Lev greatly improves inference efficiency.
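A minimal sketch of the Lev idea: the model generates only the change to the belief state, which is then applied to the previous state. The update format, with None marking deletion, is an illustrative assumption rather than the paper's exact serialization:

```python
def apply_lev(prev_state: dict, lev_update: dict) -> dict:
    """Apply a generated state *delta* to the previous belief state,
    avoiding regeneration of unchanged slots."""
    new_state = dict(prev_state)
    for slot, value in lev_update.items():
        if value is None:          # slot removed this turn
            new_state.pop(slot, None)
        else:                      # slot added or overwritten
            new_state[slot] = value
    return new_state

prev = {"restaurant-food": "thai", "restaurant-area": "centre"}
print(apply_lev(prev, {"restaurant-area": "north", "restaurant-pricerange": "cheap"}))
# {'restaurant-food': 'thai', 'restaurant-area': 'north', 'restaurant-pricerange': 'cheap'}
```

Generating only the delta keeps the target sequence short regardless of how many slots are already filled, which is where the inference-efficiency gain comes from.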