Tetreault, Joel
CEHA: A Dataset of Conflict Events in the Horn of Africa
Bai, Rui, Lu, Di, Ran, Shihao, Olson, Elizabeth, Lamba, Hemank, Cahill, Aoife, Tetreault, Joel, Jaimes, Alex
Natural Language Processing (NLP) of news articles can play an important role in understanding the dynamics and causes of violent conflict. Despite the availability of datasets categorizing various conflict events, the existing labels often do not cover all of the fine-grained violent conflict event types relevant to areas like the Horn of Africa. In this paper, we introduce a new benchmark dataset, Conflict Events in the Horn of Africa (CEHA), and propose a new task for identifying violent conflict events from online resources using this dataset. The dataset consists of 500 English descriptions of conflict events in the Horn of Africa region, with fine-grained event-type definitions that emphasize the cause of the conflict. It categorizes the key types of conflict risk according to the specific areas required by stakeholders in the Humanitarian-Peace-Development Nexus. Additionally, we conduct extensive experiments on two tasks supported by this dataset: Event-relevance Classification and Event-type Classification. Our baseline models demonstrate the challenging nature of these tasks and the usefulness of our dataset for model evaluation in low-resource settings with limited amounts of training data.
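Both tasks reduce to text classification over short event descriptions, so a lightweight baseline is easy to picture. The sketch below is purely illustrative: the label names and example texts are assumptions, not the released CEHA schema, and the second (event-type) classifier would be set up the same way with the fine-grained labels.

```python
# Minimal sketch of an Event-relevance Classification baseline on CEHA-style
# event descriptions. Labels and example texts are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

relevance_clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word unigrams and bigrams
    LogisticRegression(max_iter=1000),
)

train_texts = [
    "Clashes broke out between herders and farmers over grazing land.",
    "A new road was inaugurated in the regional capital.",
]
train_labels = ["relevant", "not_relevant"]   # hypothetical label set
relevance_clf.fit(train_texts, train_labels)

# An Event-type classifier would replace the labels with the fine-grained,
# cause-oriented event types described in the paper.
print(relevance_clf.predict(["Protests over water access turned violent."]))
```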
HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid
Lamba, Hemank, Abilov, Anton, Zhang, Ke, Olson, Elizabeth M., Dambanemuya, Henry K., Bárcia, João C., Batista, David S., Wille, Christina, Cahill, Aoife, Tetreault, Joel, Jaimes, Alex
Humanitarian organizations can enhance their effectiveness by analyzing data to discover trends, gather aggregated insights, manage their security risks, support decision-making, and inform advocacy and funding proposals. However, data about violent incidents with direct impact and relevance for humanitarian aid operations is not readily available. An automatic data collection and NLP-backed classification framework aligned with humanitarian perspectives can help bridge this gap. In this paper, we present HumVI - a dataset comprising news articles in three languages (English, French, Arabic) containing instances of different types of violent incidents categorized by the humanitarian sector they impact, e.g., aid security, education, food security, health, and protection. Reliable labels were obtained for the dataset by partnering with a data-backed humanitarian organization, Insecurity Insight. We provide multiple benchmarks for the dataset, employing various deep learning architectures and techniques, including data augmentation and mask loss, to address different task-related challenges, e.g., domain expansion. The dataset is publicly available at https://github.com/dataminr-ai/humvi-dataset.
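The abstract mentions a "mask loss" among the techniques benchmarked. One plausible reading is a masked multi-label loss in which sector labels that are unobserved for a given article are excluded from the gradient; the snippet below is an illustrative reconstruction under that assumption, not the paper's exact formulation.

```python
# Sketch of a masked binary cross-entropy loss for multi-label sector
# classification (aid security, education, food security, health, protection).
# mask = 1 where a label is actually observed for that article.
import torch
import torch.nn.functional as F

def masked_bce_loss(logits: torch.Tensor,
                    targets: torch.Tensor,
                    label_mask: torch.Tensor) -> torch.Tensor:
    """logits, targets, label_mask: shape (batch, num_sectors)."""
    per_label = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    masked = per_label * label_mask
    # Average only over the observed labels.
    return masked.sum() / label_mask.sum().clamp(min=1)

# Toy example: 2 articles, 5 sectors; the second article is only partially labeled.
logits = torch.randn(2, 5)
targets = torch.tensor([[1., 0., 0., 1., 0.], [0., 0., 1., 0., 0.]])
label_mask = torch.tensor([[1., 1., 1., 1., 1.], [1., 0., 1., 1., 0.]])
print(masked_bce_loss(logits, targets, label_mask))
```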
Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task
Kotonya, Neema, Krishnasamy, Saran, Tetreault, Joel, Jaimes, Alejandro
This paper describes and analyzes our participation in the 2023 Eval4NLP shared task, which focuses on assessing the effectiveness of prompt-based techniques for empowering Large Language Models to handle quality estimation, particularly in the context of evaluating machine translations and summaries. We conducted systematic experiments with various prompting techniques, including standard prompting, prompts informed by annotator instructions, and chain-of-thought prompting. In addition, we integrated these approaches with zero-shot and one-shot learning to maximize the efficacy of our evaluation procedures. Our work reveals that combining these approaches with a "small" open-source model (orca_mini_v3_7B) yields competitive results.
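To make the compared prompting styles concrete, here is a small sketch of how standard, instruction-informed, and chain-of-thought prompts for translation quality estimation might be assembled, optionally with a one-shot demonstration. The wording is hypothetical and does not reproduce the prompts used in the submission.

```python
# Illustrative prompt construction for LLM-based quality estimation.
# Prompt wording and the 0-100 scale are assumptions for this sketch.
def build_prompt(source: str, translation: str,
                 style: str = "standard", one_shot: bool = False) -> str:
    header = "Rate the quality of the translation on a scale from 0 (worst) to 100 (best).\n"
    if style == "instructions":
        header += "Follow the annotator guidelines: penalize mistranslations, omissions, and fluency errors.\n"
    elif style == "cot":
        header += "First list the errors you find, reason about their severity, then give a final score.\n"

    demo = ""
    if one_shot:
        # Hypothetical one-shot demonstration.
        demo = ("Source: Das Haus ist rot.\n"
                "Translation: The house is red.\n"
                "Score: 100\n\n")

    return f"{header}\n{demo}Source: {source}\nTranslation: {translation}\nScore:"

print(build_prompt("Der Hund schläft.", "The dog is sleeping.",
                   style="cot", one_shot=True))
```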
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
Belz, Anya, Thomson, Craig, Reiter, Ehud, Abercrombie, Gavin, Alonso-Moral, Jose M., Arvan, Mohammad, Braggaar, Anouck, Cieliebak, Mark, Clark, Elizabeth, van Deemter, Kees, Dinkar, Tanvi, Dušek, Ondřej, Eger, Steffen, Fang, Qixiang, Gao, Mingqi, Gatt, Albert, Gkatzia, Dimitra, González-Corbelle, Javier, Hovy, Dirk, Hürlimann, Manuela, Ito, Takumi, Kelleher, John D., Klubička, Filip, Krahmer, Emiel, Lai, Huiyuan, van der Lee, Chris, Li, Yiru, Mahamood, Saad, Mieskes, Margot, van Miltenburg, Emiel, Mosteiro, Pablo, Nissim, Malvina, Parde, Natalie, Plátek, Ondřej, Rieser, Verena, Ruan, Jie, Tetreault, Joel, Toral, Antonio, Wan, Xiaojun, Wanner, Leo, Watson, Lewis, Yang, Diyi
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more or less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, and that all but one of the experiments we selected for reproduction were found to have flaws that made the meaningfulness of conducting a reproduction questionable. As a result, we had to change our coordinated study design from a reproduce approach to a standardise-then-reproduce-twice approach. Our overall (negative) finding, that the great majority of human evaluations in NLP are not repeatable and/or not reproducible and/or too flawed to justify reproduction, paints a dire picture, but also presents an opportunity to rethink how human evaluations in NLP are designed and reported.
Event Extraction as Question Generation and Answering
Lu, Di, Ran, Shihao, Tetreault, Joel, Jaimes, Alejandro
Recent work on Event Extraction has reframed the task as Question Answering (QA), with promising results. The advantage of this approach is that it addresses the error propagation issue found in traditional token-based classification approaches by directly predicting event arguments without first extracting candidates. However, the questions are typically based on fixed templates, and they rarely leverage contextual information such as relevant arguments. In addition, prior QA-based approaches have difficulty handling cases where there are multiple arguments for the same role. In this paper, we propose QGA-EE, which enables a Question Generation (QG) model to generate questions that incorporate rich contextual information instead of using fixed templates. We also propose dynamic templates to assist the training of the QG model. Experiments show that QGA-EE outperforms all prior single-task-based models on the ACE05 English dataset.
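To illustrate the core idea of dynamic, context-enriched questions versus fixed templates, here is a small sketch; the template wording, role names, and example event are hypothetical and only indicate the kind of question a QG model is trained to produce before a QA model answers it.

```python
# Fixed template (context-free) vs. a dynamic question that folds in
# arguments already known from context. All wording here is hypothetical.
FIXED_TEMPLATE = "Who was the attacker?"

def dynamic_question(role: str, trigger: str, known_args: dict) -> str:
    """Build a role question enriched with already-identified arguments."""
    context_bits = ", ".join(f"{k}: {v}" for k, v in known_args.items())
    return f"Who was the {role} in the '{trigger}' event ({context_bits})?"

question = dynamic_question(
    role="attacker",
    trigger="bombing",
    known_args={"target": "the embassy", "place": "Nairobi"},
)
# A QA model then answers this question over the source sentence; for roles
# with several arguments, multiple answer spans can be returned.
print(FIXED_TEMPLATE)
print(question)
```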
A New Task and Dataset on Detecting Attacks on Human Rights Defenders
Ran, Shihao, Lu, Di, Tetreault, Joel, Cahill, Aoife, Jaimes, Alejandro
The ability to conduct retrospective analyses of attacks on human rights defenders over time and by location is important for humanitarian organizations to better understand historical or ongoing human rights violations and thus better manage the global impact of such events. We hypothesize that NLP can support such efforts by quickly processing large collections of news articles to detect and summarize the characteristics of attacks on human rights defenders. To that end, we propose a new dataset for detecting Attacks on Human Rights Defenders (HRDsAttack) consisting of crowdsourced annotations on 500 online news articles. The annotations include fine-grained information about the type and location of the attacks, as well as information about the victim(s). We demonstrate the usefulness of the dataset by using it to train and evaluate baseline models on several sub-tasks to predict the annotated characteristics.
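As a rough picture of the annotated characteristics, the data structure below sketches one way an article-level annotation could be represented; the field names and example values are assumptions for illustration, not the released annotation schema.

```python
# Hypothetical representation of an HRDsAttack-style annotation.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Victim:
    victim_type: Optional[str] = None   # e.g., journalist, lawyer (hypothetical values)
    name: Optional[str] = None

@dataclass
class HRDAttackAnnotation:
    article_url: str
    attack_type: Optional[str] = None   # e.g., arrest, killing (hypothetical label set)
    city: Optional[str] = None
    country: Optional[str] = None
    victims: List[Victim] = field(default_factory=list)

example = HRDAttackAnnotation(
    article_url="https://example.com/news/123",
    attack_type="arbitrary detention",
    country="Kenya",
    victims=[Victim(victim_type="journalist")],
)
print(example)
```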
BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics
Ma, Liang, Cao, Shuyang, Logan, Robert L. IV, Lu, Di, Ran, Shihao, Zhang, Ke, Tetreault, Joel, Jaimes, Alejandro
The proliferation of automatic faithfulness metrics for summarization has produced a need for benchmarks to evaluate them. While existing benchmarks measure the correlation with human judgements of faithfulness on model-generated summaries, they are insufficient for diagnosing whether metrics are: 1) consistent, i.e., indicate lower faithfulness as errors are introduced into a summary, 2) effective on human-written texts, and 3) sensitive to different error types (as summaries can contain multiple errors). To address these needs, we present a benchmark of unfaithful minimal pairs (BUMP), a dataset of 889 human-written, minimally different summary pairs, where a single error is introduced to a summary from the CNN/DailyMail dataset to produce an unfaithful summary. We find BUMP complements existing benchmarks in a number of ways: 1) the summaries in BUMP are harder to discriminate and less probable under SOTA summarization models, 2) unlike non-pair-based datasets, BUMP can be used to measure the consistency of metrics, and reveals that the most discriminative metrics tend not to be the most consistent, and 3) unlike datasets containing generated summaries with multiple errors, BUMP enables the measurement of metrics' performance on individual error types.
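The pair-based design makes the "consistency" property easy to operationalize: a metric is consistent on a pair if it scores the unfaithful (single-error) summary lower than its faithful counterpart. The sketch below shows that computation with placeholder data and a deliberately toy metric; it is an illustration of the idea, not the paper's evaluation code.

```python
# Consistency of a faithfulness metric over minimally different summary pairs.
from typing import Callable, List, Tuple

def consistency(metric: Callable[[str, str], float],
                pairs: List[Tuple[str, str, str]]) -> float:
    """pairs: (source_document, faithful_summary, unfaithful_summary)."""
    hits = sum(
        metric(doc, faithful) > metric(doc, unfaithful)
        for doc, faithful, unfaithful in pairs
    )
    return hits / len(pairs)

# Toy example with a trivially bad "metric" (summary length), for illustration only.
toy_metric = lambda doc, summary: float(len(summary))
toy_pairs = [("Doc text.", "A faithful summary.", "A subtly wrong summary!")]
print(consistency(toy_metric, toy_pairs))
```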
XFORMAL: A Benchmark for Multilingual Formality Style Transfer
Briakou, Eleftheria, Lu, Di, Zhang, Ke, Tetreault, Joel
We take the first step towards multilingual style transfer by creating and releasing XFORMAL, a benchmark of multiple formal reformulations of informal text in Brazilian Portuguese, French, and Italian. Results on XFORMAL suggest that state-of-the-art style transfer approaches perform close to simple baselines, indicating that style transfer becomes even more challenging in multilingual settings.
Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest
Radev, Dragomir, Stent, Amanda, Tetreault, Joel, Pappu, Aasish, Iliakopoulou, Aikaterini, Chanfreau, Agustin, de Juan, Paloma, Vallmitjana, Jordi, Jaimes, Alejandro, Jha, Rahul, Mankoff, Bob
The New Yorker publishes a weekly captionless cartoon. More than 5,000 readers submit captions for it. The editors select three of them and ask the readers to pick the funniest one. We describe an experiment that compares a dozen automatic methods for selecting the funniest caption. We show that negative sentiment, human-centeredness, and lexical centrality most strongly match the funniest captions, followed by positive sentiment. These results are useful for understanding humor and for designing more engaging conversational agents in text and multimodal (vision+text) systems. As part of this work, a large set of cartoons and captions is being made available to the community.
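Lexical centrality, one of the strongest signals reported, can be approximated with a simple similarity-based score: captions that are lexically close to many other submissions rank higher. The sketch below is a simplified stand-in for the graph-based centrality used in the paper, using TF-IDF cosine similarity over a few made-up captions.

```python
# Simplified lexical-centrality scoring over a pool of submitted captions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

captions = [
    "I told you the corner office had a view.",
    "The view from the corner office is not what I expected.",
    "My lawyer said this was technically legal.",
]
tfidf = TfidfVectorizer().fit_transform(captions)
sim = cosine_similarity(tfidf)
# Average similarity to the other captions (exclude self-similarity of 1.0).
centrality = (sim.sum(axis=1) - 1) / (len(captions) - 1)

for caption, score in sorted(zip(captions, centrality), key=lambda x: -x[1]):
    print(f"{score:.3f}  {caption}")
```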