AITopics | Carenini, Giuseppe

Plotting

Carenini, Giuseppe

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Visual Analytics for Generative Transformer Models

Li, Raymond, Yang, Ruixin, Xiao, Wen, AbuRaed, Ahmed, Murray, Gabriel, Carenini, Giuseppe

arXiv.org Artificial IntelligenceNov-21-2023

While transformer-based models have achieved state-of-the-art results in a variety of classification and generation tasks, their black-box nature makes them challenging for interpretability. In this work, we present a novel visual analytical framework to support the analysis of transformer-based generative networks. In contrast to previous work, which has mainly focused on encoder-based models, our framework is one of the first dedicated to supporting the analysis of transformer-based encoder-decoder models and decoder-only models for generative and classification tasks. Hence, we offer an intuitive overview that allows the user to explore different facets of the model through interactive visualization. To demonstrate the feasibility and usefulness of our framework, we present three detailed case studies based on real-world NLP research problems.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2311.12418

Country:

Europe (1.00)
North America > Canada > British Columbia (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models

Li, Raymond, Murray, Gabriel, Carenini, Giuseppe

arXiv.org Artificial IntelligenceOct-24-2023

In this work, we propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models in the parameter-efficient fine-tuning (PEFT) setting. In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture, where Gumbel-Softmax gates are used to determine the importance of these modules at each layer of the model. To reduce the number of parameters, we first train the model for a fixed small number of steps before pruning the experts based on their importance scores. Our experiment results with three different pre-trained models show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters. In addition, we provide additional analysis to examine the experts selected by each model at each layer to provide insights for future studies.

interpreting pre-trained language model, machine learning, natural language, (2 more...)

arXiv.org Artificial Intelligence

2310.1624

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues

Li, Chuyuan, Huber, Patrick, Xiao, Wen, Amblard, Maxime, Braud, Chloé, Carenini, Giuseppe

arXiv.org Artificial IntelligenceJun-25-2023

Discourse processing suffers from data sparsity, especially for dialogues. As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs). We investigate multiple tasks for fine-tuning and show that the dialogue-tailored Sentence Ordering task performs best. To locate and exploit discourse information in PLMs, we propose an unsupervised and a semi-supervised method. Our proposals achieve encouraging results on the STAC corpus, with F1 scores of 57.2 and 59.3 for unsupervised and semi-supervised methods, respectively. When restricted to projective trees, our scores improved to 63.3 and 68.1.

computational linguistic, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2302.05895

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Diversity-Aware Coherence Loss for Improving Neural Topic Models

Li, Raymond, González-Pizarro, Felipe, Xing, Linzi, Murray, Gabriel, Carenini, Giuseppe

arXiv.org Artificial IntelligenceMay-26-2023

The standard approach for neural topic modeling uses a variational autoencoder (VAE) framework that jointly minimizes the KL divergence between the estimated posterior and prior, in addition to the reconstruction loss. Since neural topic models are trained by recreating individual input documents, they do not explicitly capture the coherence between topic words on the corpus level. In this work, we propose a novel diversity-aware coherence loss that encourages the model to learn corpus-level coherence scores while maintaining a high diversity between topics. Experimental results on multiple datasets show that our method significantly improves the performance of neural topic models without requiring any pretraining or additional parameters.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2305.16199

Country:

Europe (1.00)
Asia (0.94)
North America > United States > California (0.28)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

ChatGPT-steered Editing Instructor for Customization of Abstractive Summarization

Xiao, Wen, Xie, Yujia, Carenini, Giuseppe, He, Pengcheng

arXiv.org Artificial IntelligenceMay-3-2023

Tailoring outputs of large language models, such as ChatGPT, to specific user needs remains a challenge despite their impressive generation quality. In this paper, we propose a tri-agent generation pipeline consisting of a generator, an instructor, and an editor to enhance the customization of generated outputs. The generator produces an initial output, the user-specific instructor generates editing instructions, and the editor generates a revised output aligned with user preferences. The inference-only large language model (ChatGPT) serves as both the generator and the editor, while a smaller model acts as the user-specific instructor to guide the generation process toward user needs. The instructor is trained using editor-steered reinforcement learning, leveraging feedback from the large-scale editor model to optimize instruction generation. Experimental results on two abstractive summarization datasets demonstrate the effectiveness of our approach in generating outputs that better fulfill user expectations.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.02483

Country:

North America > United States > Utah (0.14)
North America > Canada > British Columbia (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

NL4Opt Competition: Formulating Optimization Problems Based on Their Natural Language Descriptions

Ramamonjison, Rindranirina, Yu, Timothy T., Li, Raymond, Li, Haley, Carenini, Giuseppe, Ghaddar, Bissan, He, Shiqi, Mostajabdaveh, Mahdi, Banitalebi-Dehkordi, Amin, Zhou, Zirui, Zhang, Yong

arXiv.org Artificial IntelligenceMar-26-2023

The Natural Language for Optimization (NL4Opt) Competition was created to investigate methods of extracting the meaning and formulation of an optimization problem based on its text description. Specifically, the goal of the competition is to increase the accessibility and usability of optimization solvers by allowing non-experts to interface with them using natural language. We separate this challenging goal into two sub-tasks: (1) recognize and label the semantic entities that correspond to the components of the optimization problem; (2) generate a meaning representation (i.e., a logical form) of the problem from its detected problem entities. The first task aims to reduce ambiguity by detecting and tagging the entities of the optimization problems. The second task creates an intermediate representation of the linear programming (LP) problem that is converted into a format that can be used by commercial solvers. In this report, we present the LP word problem dataset and shared tasks for the NeurIPS 2022 competition. Furthermore, we investigate and compare the performance of the ChatGPT large language model against the winning solutions. Through this competition, we hope to bring interest towards the development of novel machine learning applications and datasets for optimization modeling.

artificial intelligence, natural language, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2303.08233

Country:

Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.14)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Attend to the Right Context: A Plug-and-Play Module for Content-Controllable Summarization

Xiao, Wen, Miculicich, Lesly, Liu, Yang, He, Pengcheng, Carenini, Giuseppe

arXiv.org Artificial IntelligenceDec-21-2022

Content-Controllable Summarization generates summaries focused on the given controlling signals. Due to the lack of large-scale training corpora for the task, we propose a plug-and-play module RelAttn to adapt any general summarizers to the content-controllable summarization task. RelAttn first identifies the relevant content in the source documents, and then makes the model attend to the right context by directly steering the attention weight. We further apply an unsupervised online adaptive parameter searching algorithm to determine the degree of control in the zero-shot setting, while such parameters are learned in the few-shot setting. By applying the module to three backbone summarization models, experiments show that our method effectively improves all the summarizers, and outperforms the prefix-based method and a widely used plug-and-play model in both zero- and few-shot settings. Tellingly, more benefit is observed in the scenarios when more control is needed.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2212.10819

Country:

Europe (0.94)
North America > United States > Maryland (0.14)
North America > Canada > British Columbia (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)

Add feedback

Large Discourse Treebanks from Scalable Distant Supervision

Huber, Patrick, Carenini, Giuseppe

arXiv.org Artificial IntelligenceOct-17-2022

Discourse parsing is an essential upstream task in Natural Language Processing with strong implications for many real-world applications. Despite its widely recognized role, most recent discourse parsers (and consequently downstream tasks) still rely on small-scale human-annotated discourse treebanks, trying to infer general-purpose discourse structures from very limited data in a few narrow domains. To overcome this dire situation and allow discourse parsers to be trained on larger, more diverse and domain-independent datasets, we propose a framework to generate "silver-standard" discourse trees from distant supervision on the auxiliary task of sentiment analysis.

artificial intelligence, discourse, natural language, (13 more...)

arXiv.org Artificial Intelligence

2212.06038

Country: North America > Canada (0.29)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.94)

Add feedback

Predicting Above-Sentence Discourse Structure using Distant Supervision from Topic Segmentation

Huber, Patrick, Xing, Linzi, Carenini, Giuseppe

arXiv.org Artificial IntelligenceDec-12-2021

RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents. Despite its importance, one of the most prevailing limitations in modern day discourse parsing is the lack of large-scale datasets. To overcome the data sparsity issue, distantly supervised approaches from tasks like sentiment analysis and summarization have been recently proposed. Here, we extend this line of research by exploiting distant supervision from topic segmentation, which can arguably provide a strong and oftentimes complementary signal for high-level discourse structures. Experiments on two human-annotated discourse treebanks confirm that our proposal generates accurate tree structures on sentence and paragraph level, consistently outperforming previous distantly supervised models on the sentence-to-document task and occasionally reaching even higher scores on the sentence-to-paragraph level.

artificial intelligence, computational linguistic, natural language, (14 more...)

arXiv.org Artificial Intelligence

2112.06196

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.93)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

Human Interpretation and Exploitation of Self-attention Patterns in Transformers: A Case Study in Extractive Summarization

Li, Raymond, Xiao, Wen, Wang, Lanjun, Carenini, Giuseppe

arXiv.org Artificial IntelligenceDec-10-2021

The transformer multi-head self-attention mechanism has been thoroughly investigated recently. On one hand, researchers are interested in understanding why and how transformers work. On the other hand, they propose new attention augmentation methods to make transformers more accurate, efficient and interpretable. In this paper, we synergize these two lines of research in a human-in-the-loop pipeline to first find important task-specific attention patterns. Then those patterns are applied, not only to the original model, but also to smaller models, as a human-guided knowledge distillation process. The benefits of our pipeline are demonstrated in a case study with the extractive summarization task. After finding three meaningful attention patterns in the popular BERTSum model, experiments indicate that when we inject such patterns, both the original and the smaller model show improvements in performance and arguably interpretability.

computational linguistic, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2112.05364

Country:

Europe (0.94)
Asia > China (0.28)
North America > Canada > British Columbia (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback