Kielce
Exploring Explanations Improves the Robustness of In-Context Learning
In-context learning (ICL) has emerged as a successful paradigm for leveraging large language models (LLMs). However, it often struggles to generalize beyond the distribution of the provided demonstrations. A recent advancement in enhancing robustness is ICL with explanations (X-ICL), which improves prediction reliability by guiding LLMs to understand and articulate the reasoning behind correct labels. Building on this approach, we introduce X$^2$-ICL, an advanced framework that extends X-ICL by systematically exploring explanations for all possible labels, thereby enabling more comprehensive and robust decision-making. Experimental results on multiple natural language understanding datasets validate the effectiveness of X$^2$-ICL, demonstrating significantly improved robustness to out-of-distribution data compared to existing ICL approaches.
Beacon: A Naturalistic Driving Dataset During Blackouts for Benchmarking Traffic Reconstruction and Control
Sarker, Supriya, Islam, Iftekharul, Poudel, Bibek, Li, Weizi
Extreme weather events and other vulnerabilities are causing blackouts with increasing frequency, disrupting traffic control systems and posing significant challenges to urban mobility. To address this growing concern, we introduce Beacon, a naturalistic driving dataset collected during blackouts at complex intersections. Beacon provides detailed traffic data from two unsignalized intersections in Memphis, TN, including timesteps and the origin and destination lanes of each vehicle over four hours. We analyze traffic demand, vehicle trajectories, and density across different scenarios. We also use the dataset to reconstruct unsignalized, signalized, and mixed traffic conditions, demonstrating its utility for benchmarking traffic reconstruction techniques and control methods. To the best of our knowledge, Beacon is the first publicly available traffic dataset that captures naturalistic driving behaviors at complex intersections.
Make a Choice! Knowledge Base Question Answering with In-Context Learning
Tan, Chuanyuan, Chen, Yuehe, Shao, Wenbiao, Chen, Wenliang
Question answering over knowledge bases (KBQA) aims to answer factoid questions with a given knowledge base (KB). Because KBs are so large, annotated data cannot cover all of their fact schemas, which poses a challenge to the generalization ability of methods that require a sufficient amount of annotated data. Recently, LLMs have shown strong few-shot performance in many NLP tasks. We expect that LLMs can help existing methods improve their generalization ability, especially in low-resource situations. In this paper, we present McL-KBQA, a framework that incorporates the few-shot ability of LLMs into a KBQA method via ICL-based multiple choice, thereby improving the effectiveness of QA tasks. Experimental results on two KBQA datasets demonstrate the competitive performance of McL-KBQA, with strong improvements in generalization. We hope to explore a new approach to QA tasks that combines KBQA with LLMs, namely how to generate answers normatively and correctly with strong generalization.
Keyword Extraction from Short Texts with a Text-To-Text Transfer Transformer
Pęzik, Piotr, Mikołajczyk-Bareła, Agnieszka, Wawrzyński, Adam, Nitoń, Bartłomiej, Ogrodniczuk, Maciej
The paper explores the relevance of the Text-To-Text Transfer Transformer language model (T5) for Polish (plT5) to the task of intrinsic and extrinsic keyword extraction from short text passages. The evaluation is carried out on the new Polish Open Science Metadata Corpus (POSMAC), which is released with this paper: a collection of 216,214 abstracts of scientific publications compiled in the CURLICAT project. We compare the results obtained by four different methods, i.e., plT5kw, extremeText, TermoPL, and KeyBERT, and conclude that the plT5kw model yields particularly promising results for both frequent and sparsely represented keywords. Furthermore, a plT5kw keyword generation model trained on POSMAC also seems to produce highly useful results in cross-domain text labelling scenarios. We discuss the performance of the model on news stories and phone-based dialog transcripts, which represent text genres and domains extrinsic to the dataset of scientific abstracts. Finally, we also attempt to characterize the challenges of evaluating a text-to-text model on both intrinsic and extrinsic keyword extraction.