AITopics

2305.14007

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Estonia (0.14)
(16 more...)

Genre:

Financial News (1.00)
Research Report > New Finding (0.86)

Industry:

Government (1.00)
Banking & Finance > Trading (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
(2 more...)

Saito, Yuki, Iimori, Eiji, Takamichi, Shinnosuke, Tachibana, Kentaro, Saruwatari, Hiroshi

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center

arXiv.org Artificial IntelligenceMay-23-2023

We present CALLS, a Japanese speech corpus that considers phone calls in a customer center as a new domain of empathetic spoken dialogue. The existing STUDIES corpus covers only empathetic dialogue between a teacher and student in a school. To extend the application range of empathetic dialogue speech synthesis (EDSS), we designed our corpus to include the same female speaker as the STUDIES teacher, acting as an operator in simulated phone calls. We describe a corpus construction methodology and analyze the recorded speech. We also conduct EDSS experiments using the CALLS and STUDIES corpora to investigate the effect of domain differences. The results show that mixing the two corpora during training causes biased improvements in the quality of synthetic speech due to the different degrees of expressiveness. Our project page of the corpus is http://sython.org/Corpus/STUDIES-2.

corpus, machine learning, natural language, (18 more...)

2305.13713

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
Europe > Austria > Vienna (0.14)
Asia > South Korea > Incheon > Incheon (0.05)
(6 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.37)

Hung, Chia-Chien, Lange, Lukas, Strötgen, Jannik

TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

Intermediate training of pre-trained transformer-based language models on domain-specific data leads to substantial gains for downstream tasks. To increase efficiency and prevent catastrophic forgetting alleviated from full domain-adaptive pre-training, approaches such as adapters have been developed. However, these require additional parameters for each layer, and are criticized for their limited expressiveness. In this work, we introduce TADA, a novel task-agnostic domain adaptation method which is modular, parameter-efficient, and thus, data-efficient. Within TADA, we retrain the embeddings to learn domain-aware input representations and tokenizers for the transformer encoder, while freezing all other parameters of the model. Then, task-specific fine-tuning is performed. We further conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model per task in multi-domain use cases. Our broad evaluation in 4 downstream tasks for 14 domains across single- and multi-domain setups and high- and low-resource scenarios reveals that TADA is an effective and efficient alternative to full domain-adaptive pre-training and adapters for domain adaptation, while not introducing additional parameters or complex training steps.

computational linguistic, large language model, machine learning, (19 more...)

2305.12717

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Oceania > Australia (0.04)
(14 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Interactive Natural Language Processing

Wang, Zekun, Zhang, Ge, Yang, Kexin, Shi, Ning, Zhou, Wangchunshu, Hao, Shaochun, Xiong, Guangzheng, Li, Yizhi, Sim, Mong Yuan, Chen, Xiuying, Zhu, Qingqing, Yang, Zhenzhu, Nik, Adam, Liu, Qi, Lin, Chenghua, Wang, Shi, Liu, Ruibo, Chen, Wenhu, Xu, Ke, Liu, Dayiheng, Guo, Yike, Fu, Jie

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP.

large language model, machine learning, reinforcement learning, (25 more...)

2305.13246

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.27)
North America > United States > Washington > King County > Seattle (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.13)
(43 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.67)
Research Report > Promising Solution (0.45)

Industry:

Media (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Education > Curriculum (0.92)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(8 more...)

DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis

Li, Bobo, Fei, Hao, Li, Fei, Wu, Yuhan, Zhang, Jinsong, Wu, Shengqiong, Li, Jingye, Liu, Yijiang, Liao, Lizi, Chua, Tat-Seng, Ji, Donghong

In this paper, we consider filling the gap of ple are scattered around the whole conversation dialogue-level ABSA. We follow the line of recent due to the complex replying structure, which requires quadruple ABSA and present a task of conversational the model to do cross-utterance extraction. DiaASQ sets the goal to detect DiaASQ framework. The task aims to extract the DiaASQ data indicate that our model shows significant three quadruples over the dialog: ('Xiaomi 11', superiority than several strong baselines. 'WiFi module', 'bad design', 'negative'), ('Xiaomi To sum up, this work contributes in threefold: 11', 'battery life', 'not well', 'negative') and ('Xiaomi We pioneer the research of dialogue-level 6', 'screen quality', 'very nice', 'positive'). Specifically, To benchmark the task, we manually annotate a we introduce a conversational aspect-based large-scale DiaASQ dataset.

extraction, machine learning, natural language, (21 more...)

2211.05705

Country: Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Goldfarb-Tarrant, Seraphina, Ross, Björn, Lopez, Adam

Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis

Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with transfer learning using pre-trained models, including multilingual models trained on other languages. In some cases, even supervision data comes from other languages. Does cross-lingual transfer also import new biases? To answer this question, we use counterfactual evaluation to test whether gender or racial biases are imported when using cross-lingual transfer, compared to a monolingual transfer setting. Across five languages, we find that systems using cross-lingual transfer usually become more biased than their monolingual counterparts. We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study, and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.

computational linguistic, machine learning, natural language, (17 more...)

2305.12709

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Hong Kong (0.04)
South America > Chile > Maule Region > Talca Province > Talca (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.71)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)

Improving Classifier Robustness through Active Generation of Pairwise Counterfactuals

Balashankar, Ananth, Wang, Xuezhi, Qin, Yao, Packer, Ben, Thain, Nithum, Chen, Jilin, Chi, Ed H., Beutel, Alex

Counterfactual Data Augmentation (CDA) is a commonly used technique for improving robustness in natural language classifiers. However, one fundamental challenge is how to discover meaningful counterfactuals and efficiently label them, with minimal human labeling cost. Most existing methods either completely rely on human-annotated labels, an expensive process which limits the scale of counterfactual data, or implicitly assume label invariance, which may mislead the model with incorrect labels. In this paper, we present a novel framework that utilizes counterfactual generative models to generate a large number of diverse counterfactuals by actively sampling from regions of uncertainty, and then automatically label them with a learned pairwise classifier. Our key insight is that we can more correctly label the generated counterfactuals by training a pairwise classifier that interpolates the relationship between the original example and the counterfactual. We demonstrate that with a small amount of human-annotated counterfactual data (10%), we can generate a counterfactual augmentation dataset with learned labels, that provides an 18-20% improvement in robustness and a 14-21% reduction in errors on 6 out-of-domain datasets, comparable to that of a fully human-annotated counterfactual dataset for both sentiment classification and question paraphrase tasks.

machine learning, natural language, text classification, (18 more...)

2305.13535

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)

Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision

Cai, Yucheng, Liu, Hong, Ou, Zhijian, Huang, Yi, Feng, Junlan

Most existing task-oriented dialog (TOD) systems track dialog states in terms of slots and values and use them to query a database to get relevant knowledge to generate responses. In real-life applications, user utterances are noisier, and thus it is more difficult to accurately track dialog states and correctly secure relevant knowledge. Recently, a progress in question answering and document-grounded dialog systems is retrieval-augmented methods with a knowledge retriever. Inspired by such progress, we propose a retrieval-based method to enhance knowledge selection in TOD systems, which significantly outperforms the traditional database query method for real-life dialogs. Further, we develop latent variable model based semi-supervised learning, which can work with the knowledge retriever to leverage both labeled and unlabeled dialog data. Joint Stochastic Approximation (JSA) algorithm is employed for semi-supervised model training, and the whole system is referred to as that JSA-KRTOD. Experiments are conducted on a real-life dataset from China Mobile Custom-Service, called MobileCS, and show that JSA-KRTOD achieves superior performances in both labeled-only and semi-supervised settings.

artificial intelligence, machine learning, natural language, (18 more...)

2305.13199

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report (1.00)

Industry: Telecommunications (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceMay-21-2023

EM Pre-training for Multi-party Dialogue Response Generation

Li, Yiyang, Zhao, Hai

Dialogue response generation requires an agent to generate a response according to the current dialogue history, in terms of which two-party dialogues have been well studied, but leaving a great gap for multi-party dialogues at the same time. Different from two-party dialogues where each response is a direct reply to its previous utterance, the addressee of a response utterance should be specified before it is generated in the multi-party scenario. Thanks to the huge amount of two-party conversational data, various pre-trained language models for two-party dialogue response generation have been proposed. However, due to the lack of annotated addressee labels in multi-party dialogue datasets, it is hard to use them to pre-train a response generation model for multi-party dialogues. To tackle this obstacle, we propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels, and the maximization steps to optimize a response generation model. Theoretical analyses and extensive experiments have justified the feasibility and effectiveness of our proposed method.

computational linguistic, large language model, machine learning, (16 more...)

2305.12412

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)
(10 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.37)

arXiv.org Artificial IntelligenceMay-21-2023

Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis

Sun, Licai, Lian, Zheng, Liu, Bin, Tao, Jianhua

With the proliferation of user-generated online videos, Multimodal Sentiment Analysis (MSA) has attracted increasing attention recently. Despite significant progress, there are still two major challenges on the way towards robust MSA: 1) inefficiency when modeling cross-modal interactions in unaligned multimodal data; and 2) vulnerability to random modality feature missing which typically occurs in realistic settings. In this paper, we propose a generic and unified framework to address them, named Efficient Multimodal Transformer with Dual-Level Feature Restoration (EMT-DLFR). Concretely, EMT employs utterance-level representations from each modality as the global multimodal context to interact with local unimodal features and mutually promote each other. It not only avoids the quadratic scaling cost of previous local-local cross-modal interaction methods but also leads to better performance. To improve model robustness in the incomplete modality setting, on the one hand, DLFR performs low-level feature reconstruction to implicitly encourage the model to learn semantic information from incomplete data. On the other hand, it innovatively regards complete and incomplete data as two different views of one sample and utilizes siamese representation learning to explicitly attract their high-level representations. Comprehensive experiments on three popular datasets demonstrate that our method achieves superior performance in both complete and incomplete modality settings.

artificial intelligence, natural language, robust multimodal sentiment analysis, (2 more...)

doi: 10.1109/TAFFC.2023.3274829

2208.07589

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.60)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.60)