AITopics

doi: 10.1145/3539618.3592092

2306.01579

Country:

Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.15)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Arakelyan, Erik, Arora, Arnav, Augenstein, Isabelle

Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection

arXiv.org Machine Learning Jun-1-2023

Stance Detection is concerned with identifying the attitudes expressed by an author towards a target of interest. This task spans a variety of domains ranging from social media opinion identification to detecting the stance for a legal claim. However, the framing of the task varies within these domains, in terms of the data collection protocol, the label dictionary and the number of available annotations. Furthermore, these stance annotations are significantly imbalanced on a per-topic and inter-topic basis. These factors make multi-domain stance detection a challenging task, requiring standardization and domain adaptation. To overcome this challenge, we propose Topic Efficient StancE Detection (TESTED), consisting of a topic-guided diversity sampling technique and a contrastive objective that is used for fine-tuning a stance classifier. We evaluate the method on an existing benchmark of 16 datasets with in-domain (all topics seen) and out-of-domain (unseen topics) experiments. The results show that our method outperforms the state-of-the-art with an average increase of 3.5 F1 points in-domain, and is more generalizable with an average increase of 10.2 F1 points on out-of-domain evaluation, while using ≤10% of the training data. We show that our sampling technique mitigates both inter- and per-topic class imbalances. Finally, our analysis demonstrates that the contrastive learning objective allows the model a more pronounced segmentation of samples with varying labels.

computational linguistic, machine learning, natural language, (14 more...)

arXiv.org Machine Learning

doi: 10.18653/v1/2023.acl-long.752

2306.00765

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(28 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Government > Regional Government > North America Government > United States Government (0.69)
Law (0.67)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)

UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis

Hu, Dou, Wei, Lingwei, Liu, Yaxin, Zhou, Wei, Hu, Songlin

This paper describes our system designed for SemEval-2023 Task 12: Sentiment analysis for African languages. The challenge faced by this task is the scarcity of labeled data and linguistic resources in low-resource settings. To alleviate these issues, we propose a generalized multilingual system, SACL-XLMR, for sentiment analysis on low-resource languages. Specifically, we design a lexicon-based multilingual BERT to facilitate language adaptation and sentiment-aware representation learning. In addition, we apply a supervised adversarial contrastive learning technique to learn sentiment-spread structured representations and enhance model generalization. Our system achieved competitive results, largely outperforming baselines on both multilingual and zero-shot sentiment classification subtasks. Notably, the system obtained the 1st rank on the zero-shot classification subtask in the official ranking. Extensive experiments demonstrate the effectiveness of our system.

artificial intelligence, natural language, subtask, (16 more...)

2306.01093

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Africa > North Africa (0.14)
(15 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Keizer, Simon, Dockes, Caroline, Braunschweiler, Norbert, Stoyanchev, Svetlana, Doddipatla, Rama

Adversarial learning of neural user simulators for dialogue policy optimisation

Reinforcement learning based dialogue policies are typically trained in interaction with a user simulator. To obtain an effective and robust policy, this simulator should generate user behaviour that is both realistic and varied. Current data-driven simulators are trained to accurately model the user behaviour in a dialogue corpus. We propose an alternative method using adversarial learning, with the aim to simulate realistic user behaviour with more variation. We train and evaluate several simulators on a corpus of restaurant search dialogues, and then use them to train dialogue system policies. In policy cross-evaluation experiments we demonstrate that an adversarially trained simulator produces policies with 8.3% higher success rate than those trained with a maximum likelihood simulator. Subjective results from a crowd-sourced dialogue system user evaluation confirm the effectiveness of adversarially training user simulators.

machine learning, natural language, reinforcement learning, (17 more...)

2306.00858

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
(6 more...)

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
(2 more...)

In-Context Learning User Simulators for Task-Oriented Dialog Systems

Terragni, Silvia, Filipavicius, Modestas, Khau, Nghia, Guedes, Bruna, Manso, André, Mathis, Roland

This paper presents a novel application of large language models in user simulation for task-oriented dialog systems, specifically focusing on an in-context learning approach. By harnessing the power of these models, the proposed approach generates diverse utterances based on user goals and limited dialog examples. Unlike traditional simulators, this method eliminates the need for labor-intensive rule definition or extensive annotated data, making it more efficient and accessible. Additionally, an error analysis of the interaction between the user simulator and dialog system uncovers common mistakes, providing valuable insights into areas that require improvement. Our implementation is available at https://github.com/telepathylabsai/prompt-based-user-simulator.

large language model, machine learning, natural language, (20 more...)

2306.00774

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Consumer Products & Services > Restaurants (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

Wang, Qingyue, Ding, Liang, Cao, Yanan, Zhan, Yibing, Lin, Zheng, Wang, Shi, Tao, Dacheng, Guo, Li

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data. Existing works mainly study common data- or model-level augmentation methods to enhance the generalization but fail to effectively decouple the semantics of samples, limiting the zero-shot performance of DST. In this paper, we present a simple and effective "divide, conquer and combine" solution, which explicitly disentangles the semantics of seen data, and leverages the performance and robustness with the mixture-of-experts mechanism. Specifically, we divide the seen data into semantically independent subsets and train corresponding experts; the newly unseen samples are then mapped and inferred with the mixture of experts using our designed ensemble inference. Extensive experiments on MultiWOZ2.1 upon the T5-Adapter show our schema significantly and consistently improves the zero-shot performance, achieving the SOTA on settings without external knowledge, with only 10M trainable parameters.

large language model, machine learning, natural language, (19 more...)

2306.00434

Country:

Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence May-31-2023

AoM: Detecting Aspect-oriented Information for Multimodal Aspect-Based Sentiment Analysis

Zhou, Ru, Guo, Wenya, Liu, Xumeng, Yu, Shenglong, Zhang, Ying, Yuan, Xiaojie

Multimodal aspect-based sentiment analysis (MABSA) aims to extract aspects from text-image pairs and recognize their sentiments. Existing methods make great efforts to align the whole image to corresponding aspects. However, different regions of the image may relate to different aspects in the same sentence, and coarsely establishing image-aspect alignment will introduce noise to aspect-based sentiment analysis (i.e., visual noise). Besides, the sentiment of a specific aspect can also be interfered with by descriptions of other aspects (i.e., textual noise). Considering these two kinds of noise, this paper proposes an Aspect-oriented Method (AoM) to detect aspect-relevant semantic and sentiment information. Specifically, an aspect-aware attention module is designed to simultaneously select textual tokens and image blocks that are semantically related to the aspects. To accurately aggregate sentiment information, we explicitly introduce sentiment embedding into AoM, and use a graph convolutional network to model the vision-text and text-text interaction. Extensive experiments demonstrate the superiority of AoM to existing methods. The source code is publicly released at https://github.com/SilyRab/AoM.

information, machine learning, natural language, (18 more...)

2306.01004

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.08)
North America > United States > Oklahoma > Oklahoma County > Oklahoma City (0.05)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
(5 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Bang, Namo, Lee, Jeehyun, Koo, Myoung-Wan

Task-Optimized Adapters for an End-to-End Task-Oriented Dialogue System

arXiv.org Artificial Intelligence May-31-2023

Task-Oriented Dialogue (TOD) systems are designed to carry out specific tasks by tracking dialogue states and generating appropriate responses to help users achieve defined goals. Recently, end-to-end dialogue models pre-trained on large datasets have shown promising performance in conversational systems. However, they share the same parameters across the tasks of the dialogue system (NLU, DST, NLG), so debugging each task is challenging. They also require considerable effort to fine-tune a large number of parameters to create a task-oriented chatbot, making them difficult for non-experts to handle. Therefore, we intend to train models that are relatively lightweight and fast compared to PLMs. In this paper, we propose an end-to-end TOD system with Task-Optimized Adapters, which learn independently per task, adding only a small number of parameters after fixed layers of the pre-trained network. We also enhance the performance of the DST and NLG modules through reinforcement learning, overcoming the learning curve that has lacked at the adapter learning and enabling natural and consistent response generation that is appropriate for the goal. Our method is model-agnostic and does not require prompt-tuning, as it takes only input data without a prompt. In experiments, our method shows competitive performance on the MultiWOZ benchmark compared to existing end-to-end models. In particular, we attain state-of-the-art performance on the DST task of the MultiWOZ 2.2 dataset.

computational linguistic, machine learning, reinforcement learning, (14 more...)

2305.02468

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(19 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Ganguly, Srinjoy, Morapakula, Sai Nandan, Coronado, Luis Miguel Pozo

Quantum Natural Language Processing based Sentiment Analysis using lambeq Toolkit

arXiv.org Artificial Intelligence May-30-2023

Sentiment classification is one of the best use cases of classical natural language processing (NLP), where we can witness its power in various daily life domains such as the banking, business and marketing industries. We already know how classical AI and machine learning can change and improve technology. Quantum natural language processing (QNLP) is a young and gradually emerging technology which has the potential to provide quantum advantage for NLP tasks. In this paper we show the first application of QNLP for sentiment analysis and achieve perfect test set accuracy for three different kinds of simulations and a decent accuracy for experiments run on a noisy quantum device. We utilize the lambeq QNLP toolkit and t|ket⟩ by Cambridge Quantum (Quantinuum) to obtain the results.

artificial intelligence, natural language, string diagram, (12 more...)

2305.19383

Country:

Europe > Spain > Galicia > Madrid (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.92)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)

arXiv.org Artificial Intelligence May-29-2023

Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

Ma, Mingyu Derek, Kao, Jiun-Yu, Gao, Shuyang, Gupta, Arpit, Jin, Di, Chung, Tagyoung, Peng, Nanyun

Dialogue state tracking (DST) that extracts structured conversation progress in a list of slot-value pairs from unstructured dialogue utterances is an essential component of a dialogue system (Wang and Lemon, 2013). Unlike classification-based models that pick the slot value from given candidate (Ye et al., 2021; Chen et al., 2020), recent works formulate DST as a conditional generation task (Gao et al., 2019; Lin et al., 2020), where the concatenation of dialogue history and a slot-specific prompt are fed to generative models and the text generation output are decoded to predicted slot values (Ham et al., 2020; Hosseini-Asl et al., 2020). This formulation enjoys the benefit of generalizability to unseen domains and slot types beyond a defined dialogue ontology (Li et al., 2021; Peng et al., 2021). General prompting methods use a textual prompt to provide task information to the LM (Liu et al., 2021; Ma et al., 2023b). Prior works have variations that update different parameter combinations such as both LM and prompt token embeddings

The computing and data resource-hungry issues are more severe in the real-world deployment where LMs tuned for different domains and tasks need to be trained and hosted, and a typical dialogue system has to serve dozens of such LMs (Maronikolakis and Schütze, 2021; Strubell et al., 2019; Lacoste et al., 2019). This leads to a high cost of the development and service of dialogue systems and constrains offline deployment. In addition, limited data is available for a new domain or task. We propose a parameter-efficient and data-efficient DST model for low-resource settings, which only needs to update 0.08% of parameters compared with the previous best model, by keeping LM parameters frozen and introducing soft prompt tokens to represent task properties of different slots. Figure 1 gives an overview of our model. The only prior work we are aware of that only updates prompt token embeddings and thus parameter-efficient is Zhu et al. (2022), but it focuses on continual domain adaptation and with a significant amount of training data.

Work done while at Amazon.

artificial intelligence, computational linguistic, natural language, (18 more...)

2301.10915

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre:

Overview (0.88)
Research Report (0.64)

Industry:

Consumer Products & Services (0.96)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)