Litman, Diane
Discourse-Driven Evaluation: Unveiling Factual Inconsistency in Long Document Summarization
Zhong, Yang, Litman, Diane
Detecting factual inconsistency in long document summarization remains challenging, given the complex structure of the source article and the length of the summaries. In this work, we study factual inconsistency errors and connect them to discourse analysis. We find that errors are more common in complex sentences and are associated with several discourse features. We propose a framework that decomposes long texts into discourse-inspired chunks and utilizes discourse information to better aggregate sentence-level scores predicted by natural language inference models. Our approach improves performance on top of different model baselines across several evaluation benchmarks that cover rich text domains and focus on long document summarization. This underscores the importance of incorporating discourse features when developing models that score long document summaries for factual inconsistency.
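As a rough illustration of the kind of aggregation this abstract describes (not the authors' released code), the sketch below scores each summary sentence against discourse-inspired chunks of the source and averages the per-sentence maxima; the `nli_entail_prob` callable and the chunk-boundary input are hypothetical placeholders for an off-the-shelf NLI model and a discourse segmenter.

```python
# Minimal sketch, assuming a pre-segmented source and any NLI model that
# returns P(entailment) for a (premise, hypothesis) pair. Not the paper's code.
from typing import Callable, List

def chunk_by_discourse(source_sentences: List[str], boundaries: List[int]) -> List[str]:
    """Group source sentences into chunks at the given discourse boundaries
    (e.g., section or discourse-unit breaks)."""
    chunks, start = [], 0
    for end in boundaries + [len(source_sentences)]:
        chunks.append(" ".join(source_sentences[start:end]))
        start = end
    return chunks

def summary_consistency_score(
    summary_sentences: List[str],      # assumed non-empty
    source_sentences: List[str],
    boundaries: List[int],
    nli_entail_prob: Callable[[str, str], float],  # placeholder NLI scorer
) -> float:
    """Score each summary sentence by its best-supported chunk, then average."""
    chunks = chunk_by_discourse(source_sentences, boundaries)
    per_sentence = [
        max(nli_entail_prob(chunk, sent) for chunk in chunks)
        for sent in summary_sentences
    ]
    return sum(per_sentence) / len(per_sentence)
```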
Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI
Asano, Yuya, Hassan, Sabit, Sharma, Paras, Sicilia, Anthony, Atwell, Katherine, Litman, Diane, Alikhani, Malihe
General-purpose automatic speech recognition (ASR) systems do not always perform well in goal-oriented dialogue. Existing ASR correction methods rely on prior user data or named entities. We extend correction to tasks that have no prior user data and that exhibit linguistic flexibility, such as lexical and syntactic variations. We propose a novel context augmentation with a large language model and a ranking strategy that incorporates contextual information from the dialogue states of a goal-oriented conversational AI and its tasks. Our method ranks (1) n-best ASR hypotheses by their lexical and semantic similarity with context and (2) context by phonetic correspondence with ASR hypotheses. Evaluated in home improvement and cooking domains with real-world users, our method improves the recall and F1 of correction by 34% and 16%, respectively, while maintaining precision and false positive rate. Users rated the system 0.8-1 point (out of 5) higher when our correction method worked properly, with no decrease due to false positives.
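A minimal sketch of the kind of context-aware re-ranking this abstract describes, assuming context terms drawn from the dialogue state and a placeholder `phonetic_sim` function (e.g., a comparison of grapheme-to-phoneme outputs). It collapses the paper's two rankings into a single combined score for brevity and is not the authors' implementation.

```python
# Sketch only: re-rank n-best ASR hypotheses against dialogue-state context.
# Lexical similarity uses difflib; phonetic_sim is a hypothetical placeholder.
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

def lexical_sim(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rerank_hypotheses(
    nbest: List[str],            # assumed non-empty ASR n-best list
    context_terms: List[str],    # assumed non-empty terms from dialogue state
    phonetic_sim: Callable[[str, str], float],
) -> Tuple[str, str]:
    """Return the (hypothesis, context term) pair with the best combined
    lexical + phonetic score."""
    best, best_score = (nbest[0], context_terms[0]), float("-inf")
    for hyp in nbest:
        for term in context_terms:
            score = lexical_sim(hyp, term) + phonetic_sim(hyp, term)
            if score > best_score:
                best, best_score = (hyp, term), score
    return best
```

In practice the two similarity signals could be weighted or thresholded so that low-confidence matches fall back to the original ASR output.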
eRevise+RF: A Writing Evaluation System for Assessing Student Essay Revisions and Providing Formative Feedback
Liu, Zhexiong, Litman, Diane, Wang, Elaine, Li, Tianwen, Gobat, Mason, Matsumura, Lindsay Clare, Correnti, Richard
The ability to revise essays in response to feedback is important for students' writing success. An automated writing evaluation (AWE) system that supports students in revising their essays is thus essential. We present eRevise+RF, an enhanced AWE system for assessing student essay revisions (e.g., changes made to an essay to improve its quality in response to essay feedback) and providing revision feedback. We deployed the system with 6 teachers and 406 students across 3 schools in Pennsylvania and Louisiana. The results confirmed its effectiveness in (1) assessing student essays in terms of evidence usage, (2) extracting evidence and reasoning revisions across essays, and (3) determining revision success in responding to feedback. The evaluation also suggested eRevise+RF is a helpful system for young students to improve their argumentative writing skills through revision and formative feedback.
Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking
Elaraby, Mohamed, Litman, Diane, Li, Xiang Lorraine, Magooda, Ahmed
Generating free-text rationales is among the emergent capabilities of Large Language Models (LLMs). These rationales have been found to enhance LLM performance across various NLP tasks. Recently, there has been growing interest in using these rationales to provide insights for various important downstream tasks. In this paper, we analyze generated free-text rationales in tasks with subjective answers, emphasizing the importance of rationalization in such scenarios. We focus on pairwise argument ranking, a highly subjective task with significant potential for real-world applications, such as debate assistance. We evaluate the persuasiveness of rationales generated by nine LLMs to support their subjective choices. Our findings suggest that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models. Additionally, our experiments show that rationale persuasiveness can be improved by controlling persuasiveness parameters through prompting or through self-refinement.
What metrics of participation balance predict outcomes of collaborative learning with a robot?
Asano, Yuya, Litman, Diane, King-Shepard, Quentin, Maidment, Tristan, Langley, Tyree, Davison, Teresa, Nokes-Malach, Timothy, Kovashka, Adriana, Walker, Erin
One of the keys to the success of collaborative learning is balanced participation by all learners, but this does not always happen naturally. Pedagogical robots have the potential to facilitate balance. However, it remains unclear what participation balance robots should aim at; various metrics have been proposed, but it is still an open question whether we should balance human participation in human-human interactions (HHI) or human-robot interactions (HRI) and whether we should consider robots' participation in collaborative learning involving multiple humans and a robot. This paper examines collaborative learning between a pair of students and a teachable robot that acts as a peer tutee to answer the aforementioned questions. Through an exploratory study, we hypothesize which balance metrics in the literature and which portions of dialogues (including vs. excluding robots' participation and human participation in HHI vs. HRI) will better predict learning as a group. We test the hypotheses with another study and replicate them with automatically obtained units of participation to simulate the information available to robots when they adaptively fix imbalances in real time. Finally, we discuss recommendations on which metrics learning science researchers should choose when trying to understand how to facilitate collaboration.
ReflectSumm: A Benchmark for Course Reflection Summarization
Zhong, Yang, Elaraby, Mohamed, Litman, Diane, Butt, Ahmed Ashraf, Menekse, Muhsin
This paper introduces ReflectSumm, a novel summarization dataset specifically designed for summarizing students' reflective writing. The goal of ReflectSumm is to facilitate developing and evaluating novel summarization techniques tailored to real-world scenarios with little training data, with potential implications for the opinion summarization domain in general and the educational domain in particular. The dataset encompasses a diverse range of summarization tasks and includes comprehensive metadata, enabling the exploration of various research questions and supporting different applications. To showcase its utility, we conducted extensive evaluations using multiple state-of-the-art baselines. The results provide benchmarks for facilitating further research in this area.
Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community
Kennington, Casey, Alikhani, Malihe, Pon-Barry, Heather, Atwell, Katherine, Bisk, Yonatan, Fried, Daniel, Gervits, Felix, Han, Zhao, Inan, Mert, Johnston, Michael, Korpan, Raj, Litman, Diane, Marge, Matthew, Matuszek, Cynthia, Mead, Ross, Mohan, Shiwali, Mooney, Raymond, Parde, Natalie, Sinapov, Jivko, Stewart, Angela, Stone, Matthew, Tellex, Stefanie, Williams, Tom
The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces but speech interfaces, and not just with computers but with all machines, including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots. The three proposals should act as white papers for any researcher to take and build upon.
Overview of ImageArg-2023: The First Shared Task in Multimodal Argument Mining
Liu, Zhexiong, Elaraby, Mohamed, Zhong, Yang, Litman, Diane
This paper presents an overview of the ImageArg shared task, the first multimodal Argument Mining shared task, co-located with the 10th Workshop on Argument Mining at EMNLP 2023. The shared task comprises two classification subtasks: (1) Subtask-A: Argument Stance Classification and (2) Subtask-B: Image Persuasiveness Classification. The former determines the stance of a tweet containing an image and a piece of text toward a controversial topic (e.g., gun control and abortion). The latter determines whether the image makes the tweet text more persuasive. The shared task received 31 submissions for Subtask-A and 21 submissions for Subtask-B from 9 different teams across 6 countries. The top submission in Subtask-A achieved an F1-score of 0.8647, while the best submission in Subtask-B achieved an F1-score of 0.5561.
STRONG: Structure Controllable Legal Opinion Summary Generation
Zhong, Yang, Litman, Diane
We propose an approach for the structure controllable summarization of long legal opinions that considers the argument structure of the document. Our approach involves using predicted argument role information to guide the model in generating coherent summaries that follow a provided structure pattern. We demonstrate the effectiveness of our approach on a dataset of legal opinions and show that it outperforms several strong baselines with respect to ROUGE, BERTScore, and structure similarity.
Learning from Auxiliary Sources in Argumentative Revision Classification
Afrin, Tazin, Litman, Diane
We develop models to classify desirable reasoning revisions in argumentative writing. We explore two approaches, multi-task learning and transfer learning, to take advantage of auxiliary sources of revision data for similar tasks. Results of intrinsic and extrinsic evaluations show that both approaches can indeed improve classifier performance over baselines. While multi-task learning shows that training on different sources of data at the same time may improve performance, transfer learning better represents the relationship between the data.
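For illustration only, a minimal multi-task setup of the sort this abstract mentions, assuming a shared encoder with separate classification heads for the target revision task and an auxiliary revision corpus; the class and head names are hypothetical and this is not the authors' model.

```python
# Sketch of multi-task learning for revision classification: one shared
# encoder, two task-specific heads, trained by alternating batches so the
# encoder receives gradients from both the target and auxiliary losses.
import torch
import torch.nn as nn

class MultiTaskRevisionClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int,
                 n_target_labels: int, n_aux_labels: int):
        super().__init__()
        self.encoder = encoder                       # any feature encoder
        self.target_head = nn.Linear(hidden_dim, n_target_labels)
        self.aux_head = nn.Linear(hidden_dim, n_aux_labels)

    def forward(self, features: torch.Tensor, task: str) -> torch.Tensor:
        encoded = self.encoder(features)             # shared representation
        head = self.target_head if task == "target" else self.aux_head
        return head(encoded)                         # task-specific logits
```

A transfer-learning variant would instead pretrain the encoder (and a head) on the auxiliary revision data, then fine-tune on the target task alone.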