quintuple
Language Models Struggle to Achieve a Consistent Temporal Representation of Facts
Khodja, Hichem Ammar, Béchet, Frédéric, Brabant, Quentin, Nasr, Alexis, Lecorvé, Gwénolé
Language Models (LMs) have shown substantial improvements in handling factual knowledge, yet their capability to consistently represent temporal facts, which are valid only within specific timeframes, remains underexplored. To investigate this, we introduce TimeStress, a novel dataset comprising 521K statements on 2003 of the most popular temporal facts in Wikidata. Each statement contextualizes a fact with correct and incorrect dates across three precisions (Day, Month, Year). This setup allows us to evaluate LMs' ability to discern between correct and incorrect temporal statements based on their probability of being generated. We assess 18 LMs across various architectures using two metrics: the win rate, indicating how often correct dates outperform incorrect ones, and robustness, reflecting consistent performance across all dates. Our findings reveal that while some LMs achieve a win rate exceeding 80\%, robustness remains low, with the best model achieving only 6\%. Furthermore, robust knowledge at one date precision does not reliably transfer to others, highlighting a significant generalization gap. These results underscore the struggle of LMs to maintain a consistent temporal representation, supporting their limitations as reliable sources of temporal knowledge. We provide all data and code for further research.
- Europe > Austria > Vienna (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Russia (0.04)
- (24 more...)
- Government (1.00)
- Leisure & Entertainment > Sports (0.68)
Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning
Nguyen, Hai-Yen Thi, Nguyen, Cam-Van Thi
Comparative reviews are pivotal in understanding consumer preferences and influencing purchasing decisions. Comparative Quintuple Extraction (COQE) aims to identify five key components in text: the target entity, compared entities, compared aspects, opinions on these aspects, and polarity. Extracting precise comparative information from product reviews is challenging due to nuanced language and sequential task errors in traditional methods. To mitigate these problems, we propose MTP-COQE, an end-to-end model designed for COQE. Leveraging multi-perspective prompt-based learning, MTP-COQE effectively guides the generative model in comparative opinion mining tasks. Evaluation on the Camera-COQE (English) and VCOM (Vietnamese) datasets demonstrates MTP-COQE's efficacy in automating COQE, achieving superior performance with a 1.41% higher F1 score than the previous baseline models on the English dataset. Additionally, we designed a strategy to limit the generative model's creativity to ensure the output meets expectations. We also performed data augmentation to address data imbalance and to prevent the model from becoming biased towards dominant samples.
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.71)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
Large Language Models as Event Forecasters
Key elements of human events are extracted as quadruples that consist of subject, relation, object, and timestamp. This representation can be extended to a quintuple by adding a fifth element: a textual summary that briefly describes the event. These quadruples or quintuples, when organized within a specific domain, form a temporal knowledge graph (TKG). Current learning frameworks focus on a few TKG-related tasks, such as predicting an object given a subject and a relation or forecasting the occurrences of multiple types of events (i.e., relation) in the next time window. They typically rely on complex structural and sequential models like graph neural networks (GNNs) and recurrent neural networks (RNNs) to update intermediate embeddings. However, these methods often neglect the contextual information inherent in each quintuple, which can be effectively captured through concise textual descriptions. In this paper, we investigate how large language models (LLMs) can streamline the design of TKG learning frameworks while maintaining competitive accuracy in prediction and forecasting tasks. We develop multiple prompt templates to frame the object prediction (OP) task as a standard question-answering (QA) task, suitable for instruction fine-tuning with an encoder-decoder generative LLM. For multi-event forecasting (MEF), we design simple yet effective prompt templates for each TKG quintuple. This novel approach removes the need for GNNs and RNNs, instead utilizing an encoder-only LLM to generate fixed intermediate embeddings, which are subsequently processed by a prediction head with a self-attention mechanism to forecast potential future relations. Extensive experiments on multiple real-world datasets using various evaluation metrics validate the effectiveness and robustness of our approach.
- North America > United States (0.70)
- Asia > India (0.06)
- Europe > Russia (0.05)
- (7 more...)
- Research Report (0.69)
- Overview (0.48)
Overview of the VLSP 2023 -- ComOM Shared Task: A Data Challenge for Comparative Opinion Mining from Vietnamese Product Reviews
Le, Hoang-Quynh, Can, Duy-Cat, Nguyen, Khanh-Vinh, Tran, Mai-Vu
This paper presents a comprehensive overview of the Comparative Opinion Mining from Vietnamese Product Reviews shared task (ComOM), held as part of the 10$^{th}$ International Workshop on Vietnamese Language and Speech Processing (VLSP 2023). The primary objective of this shared task is to advance the field of natural language processing by developing techniques that proficiently extract comparative opinions from Vietnamese product reviews. Participants are challenged to propose models that adeptly extract a comparative "quintuple" from a comparative sentence, encompassing Subject, Object, Aspect, Predicate, and Comparison Type Label. We construct a human-annotated dataset comprising $120$ documents, encompassing $7427$ non-comparative sentences and $2468$ comparisons within $1798$ sentences. Participating models undergo evaluation and ranking based on the Exact match macro-averaged quintuple F1 score.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > India > Karnataka > Bengaluru (0.04)
- Overview (0.68)
- Research Report (0.50)
Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework
Le, Ha, Tran, Bao, Le, Phuong, Nguyen, Tan, Nguyen, Dac, Pham, Ngoan, Huynh, Dang
Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that consists of solving three sequential sub-tasks: (i) identifying comparative sentence, i.e., if a sentence has a comparative meaning, (ii) extracting comparative elements, i.e., what are comparison subjects, objects, aspects, predicates, and (iii) classifying comparison types which contribute to a deeper comprehension of user sentiments in Vietnamese product reviews. Our method is ranked fifth at the Vietnamese Language and Speech Processing (VLSP) 2023 challenge on Comparative Opinion Mining (ComOM) from Vietnamese Product Reviews.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Vietnam (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
ComOM at VLSP 2023: A Dual-Stage Framework with BERTology and Unified Multi-Task Instruction Tuning Model for Vietnamese Comparative Opinion Mining
Van Thin, Dang, Hao, Duong Ngoc, Nguyen, Ngan Luu-Thuy
The ComOM shared task aims to extract comparative opinions from product reviews in Vietnamese language. There are two sub-tasks, including (1) Comparative Sentence Identification (CSI) and (2) Comparative Element Extraction (CEE). The first task is to identify whether the input is a comparative review, and the purpose of the second task is to extract the quintuplets mentioned in the comparative review. To address this task, our team proposes a two-stage system based on fine-tuning a BERTology model for the CSI task and unified multi-task instruction tuning for the CEE task. Besides, we apply the simple data augmentation technique to increase the size of the dataset for training our model in the second stage. Experimental results show that our approach outperforms the other competitors and has achieved the top score on the official private test.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
Pre-training Language Models for Comparative Reasoning
Yu, Mengxia, Zhang, Zhihan, Yu, Wenhao, Jiang, Meng
Comparative reasoning is a process of comparing objects, concepts, or entities to draw conclusions, which constitutes a fundamental cognitive ability. In this paper, we propose a novel framework to pre-train language models for enhancing their abilities of comparative reasoning over texts. While there have been approaches for NLP tasks that require comparative reasoning, they suffer from costly manual data labeling and limited generalizability to different tasks. Our approach introduces a novel method of collecting scalable data for text-based entity comparison, which leverages both structured and unstructured data. Moreover, we present a framework of pre-training language models via three novel objectives on comparative reasoning. Evaluation on downstream tasks including comparative question answering, question generation, and summarization shows that our pre-training framework significantly improves the comparative reasoning abilities of language models, especially under low-resource conditions. This work also releases the first integrated benchmark for comparative reasoning.
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- (6 more...)
Formal Verification of Safety Architectures for Automated Driving
Eberhart, Clovis, Dubut, Jérémy, Haydon, James, Hasuo, Ichiro
Safety architectures play a crucial role in the safety assurance of automated driving vehicles (ADVs). They can be used as safety envelopes of black-box ADV controllers, and for graceful degradation from one ODD to another. Building on our previous work on the formalization of responsibility-sensitive safety (RSS), we introduce a novel program logic that accommodates assume-guarantee reasoning and fallback-like constructs. This allows us to formally define and prove the safety of existing and novel safety architectures. We apply the logic to a pull over scenario and experimentally evaluate the resulting safety architecture.
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.84)
- Information Technology > Robotics & Automation (0.84)
Exploring and Verbalizing Academic Ideas by Concept Co-occurrence
Xu, Yi, Sheng, Shuqian, Xue, Bo, Fu, Luoyi, Wang, Xinbing, Zhou, Chenghu
Researchers usually come up with new ideas only after thoroughly comprehending vast quantities of literature. The difficulty of this procedure is exacerbated by the fact that the number of academic publications is growing exponentially. In this study, we devise a framework based on concept co-occurrence for academic idea inspiration, which has been integrated into a research assistant system. From our perspective, the fusion of two concepts that co-occur in an academic paper can be regarded as an important way of the emergence of a new idea. We construct evolving concept graphs according to the co-occurrence relationship of concepts from 20 disciplines or topics. Then we design a temporal link prediction method based on masked language model to explore potential connections between different concepts. To verbalize the newly discovered connections, we also utilize the pretrained language model to generate a description of an idea based on a new data structure called co-occurrence citation quintuple. We evaluate our proposed system using both automatic metrics and human assessment. The results demonstrate that our system has broad prospects and can assist researchers in expediting the process of discovering new ideas.
- Asia > China (0.46)
- North America > United States > Texas (0.14)
- Europe > United Kingdom (0.14)
- Africa (0.14)
- Materials > Chemicals (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
- (16 more...)