annotation scheme
Mitigating Semantic Drift: Evaluating LLMs' Efficacy in Psychotherapy through MI Dialogue Summarization
Kumar, Vivek, Rajawat, Pushpraj Singh, Ntoutsi, Eirini
Recent advancements in large language models (LLMs) have shown their potential across both general and domain-specific tasks. However, there is a growing concern regarding their lack of sensitivity, factual incorrectness in responses, inconsistent expressions of empathy, bias, hallucinations, and overall inability to capture the depth and complexity of human understanding, especially in low-resource and sensitive domains such as psychology. To address these challenges, our study employs a mixed-methods approach to evaluate the efficacy of LLMs in psychotherapy. We use LLMs to generate precise summaries of motivational interviewing (MI) dialogues and design a two-stage annotation scheme based on key components of the Motivational Interviewing Treatment Integrity (MITI) framework, namely evocation, collaboration, autonomy, direction, empathy, and a non-judgmental attitude. Using expert-annotated MI dialogues as ground truth, we formulate multi-class classification tasks to assess model performance under progressive prompting techniques, incorporating one-shot and few-shot prompting. Our results offer insights into LLMs' capacity for understanding complex psychological constructs and highlight best practices to mitigate ``semantic drift" in therapeutic settings. Our work contributes not only to the MI community by providing a high-quality annotated dataset to address data scarcity in low-resource domains but also critical insights for using LLMs for precise contextual interpretation in complex behavioral therapy.
The InviTE Corpus: Annotating Invectives in Tudor English Texts for Computational Modeling
Spliethoff, Sophie, Hoeken, Sanne, Schwandt, Silke, Zarrieß, Sina, Alaçam, Özge
In this paper, we aim at the application of Natural Language Processing (NLP) techniques to historical research endeavors, particularly addressing the study of religious invectives in the context of the Protestant Reformation in Tudor England. We outline a workflow spanning from raw data, through pre-processing and data selection, to an iterative annotation process. As a result, we introduce the InviTE corpus -- a corpus of almost 2000 Early Modern English (EModE) sentences, which are enriched with expert annotations regarding invective language throughout 16th-century England. Subsequently, we assess and compare the performance of fine-tuned BERT-based models and zero-shot prompted instruction-tuned large language models (LLMs), which highlights the superiority of models pre-trained on historical data and fine-tuned to invective detection.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Germany (0.04)
- (5 more...)
A Joint Multitask Model for Morpho-Syntactic Parsing
Inostroza, Demian, Mistica, Mel, Vylomova, Ekaterina, Guest, Chris, Kurniawan, Kemal
We present a joint multitask model for the UniDive 2025 Morpho-Syntactic Parsing shared task, where systems predict both morphological and syntactic analyses following novel UD annotation scheme. Our system uses a shared XLM-RoBERTa encoder with three specialized decoders for content word identification, dependency parsing, and morphosyntactic feature prediction. Our model achieves the best overall performance on the shared task's leaderboard covering nine typologically diverse languages, with an average MSLAS score of 78.7 percent, LAS of 80.1 percent, and Feats F1 of 90.3 percent. Our ablation studies show that matching the task's gold tokenization and content word identification are crucial to model performance. Error analysis reveals that our model struggles with core grammatical cases (particularly Nom-Acc) and nominal features across languages.
A Decomposition-Based Approach for Evaluating and Analyzing Inter-Annotator Disagreement
We propose a novel method to conceptually decompose an existing annotation into separate levels, allowing the analysis of inter-annotators disagreement in each level separately. We suggest two distinct strategies in order to actualize this approach: a theoretically-driven one, in which the researcher defines a decomposition based on prior knowledge of the annotation task, and an exploration-based one, in which many possible decompositions are inductively computed and presented to the researcher for interpretation and evaluation. Utilizing a recently constructed dataset for narrative analysis as our use-case, we apply each of the two strategies to demonstrate the potential of our approach in testing hypotheses regarding the sources of annotation disagreements, as well as revealing latent structures and relations within the annotation task. We conclude by suggesting how to extend and generalize our approach, as well as use it for other purposes.
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.40)
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
A Systematic Comparison of Syntactic Representations of Dependency Parsing
Wisniewski, Guillaume, Lacroix, Ophélie
We compare the performance of a transition-based parser in regards to different annotation schemes. We pro-pose to convert some specific syntactic constructions observed in the universal dependency treebanks into a so-called more standard representation and to evaluate parsing performances over all the languages of the project. We show that the ``standard'' constructions do not lead systematically to better parsing performance and that the scores vary considerably according to the languages.
- Europe > Czechia > Prague (0.05)
- Europe > Sweden > Uppsala County > Uppsala (0.05)
- Europe > Denmark > Capital Region > Copenhagen (0.05)
- (9 more...)
GiesKaNe: Bridging Past and Present in Grammatical Theory and Practical Application
This article explores the requirements for corpus compilation within the GiesKaNe project (University of Giessen and Kassel, Syntactic Basic Structures of New High German). The project is defined by three central characteristics: it is a reference corpus, a historical corpus, and a syntactically deeply annotated treebank. As a historical corpus, GiesKaNe aims to establish connections with both historical and contemporary corpora, ensuring its relevance across temporal and linguistic contexts. The compilation process strikes the balance between innovation and adherence to standards, addressing both internal project goals and the broader interests of the research community. The methodological complexity of such a project is managed through a complementary interplay of human expertise and machine-assisted processes. The article discusses foundational topics such as tokenization, normalization, sentence definition, tagging, parsing, and inter-annotator agreement, alongside advanced considerations. These include comparisons between grammatical models, annotation schemas, and established de facto annotation standards as well as the integration of human and machine collaboration. Notably, a novel method for machine-assisted classification of texts along the continuum of conceptual orality and literacy is proposed, offering new perspectives on text selection. Furthermore, the article introduces an approach to deriving de facto standard annotations from existing ones, mediating between standardization and innovation. In the course of describing the workflow the article demonstrates that even ambitious projects like GiesKaNe can be effectively implemented using existing research infrastructure, requiring no specialized annotation tools. Instead, it is shown that the workflow can be based on the strategic use of a simple spreadsheet and integrates the capabilities of the existing infrastructure.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- (54 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Comparative Analysis of Extrinsic Factors for NER in French
Yang, Grace, Li, Zhiyi, Liu, Yadong, Park, Jungyeul
Named entity recognition (NER) is a crucial task that aims to identify structured information, which is often replete with complex, technical terms and a high degree of variability. Accurate and reliable NER can facilitate the extraction and analysis of important information. However, NER for other than English is challenging due to limited data availability, as the high expertise, time, and expenses are required to annotate its data. In this paper, by using the limited data, we explore various factors including model structure, corpus annotation scheme and data augmentation techniques to improve the performance of a NER model for French. Our experiments demonstrate that these approaches can significantly improve the model's F1 score from original CRF score of 62.41 to 79.39. Our findings suggest that considering different extrinsic factors and combining these techniques is a promising approach for improving NER performance where the size of data is limited.
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- (10 more...)
A State-of-the-Art Morphosyntactic Parser and Lemmatizer for Ancient Greek
This paper presents an experiment consisting in the comparison of six models to identify a state-of-the-art morphosyntactic parser and lemmatizer for Ancient Greek capable of annotating according to the Ancient Greek Dependency Treebank annotation scheme. A normalized version of the major collections of annotated texts was used to (i) train the baseline model Dithrax with randomly initialized character embeddings and (ii) fine-tune Trankit and four recent models pretrained on Ancient Greek texts, i.e., GreBERTa and PhilBERTa for morphosyntactic annotation and GreTA and PhilTa for lemmatization. A Bayesian analysis shows that Dithrax and Trankit annotate morphology practically equivalently, while syntax is best annotated by Trankit and lemmata by GreTa. The results of the experiment suggest that token embeddings are not sufficient to achieve high UAS and LAS scores unless they are coupled with a modeling strategy specifically designed to capture syntactic relationships. The dataset and best-performing models are made available online for reuse.
- Research Report > Experimental Study (0.67)
- Research Report > New Finding (0.66)
ACE: A LLM-based Negotiation Coaching System
Shea, Ryan, Kallala, Aymen, Liu, Xin Lucy, Morris, Michael W., Yu, Zhou
The growing prominence of LLMs has led to an increase in the development of AI tutoring systems. These systems are crucial in providing underrepresented populations with improved access to valuable education. One important area of education that is unavailable to many learners is strategic bargaining related to negotiation. To address this, we develop a LLM-based Assistant for Coaching nEgotiation (ACE). ACE not only serves as a negotiation partner for users but also provides them with targeted feedback for improvement. To build our system, we collect a dataset of negotiation transcripts between MBA students. These transcripts come from trained negotiators and emulate realistic bargaining scenarios. We use the dataset, along with expert consultations, to design an annotation scheme for detecting negotiation mistakes. ACE employs this scheme to identify mistakes and provide targeted feedback to users. To test the effectiveness of ACE-generated feedback, we conducted a user experiment with two consecutive trials of negotiation and found that it improves negotiation performances significantly compared to a system that doesn't provide feedback and one which uses an alternative method of providing feedback.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > California (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- (2 more...)
What distinguishes conspiracy from critical narratives? A computational analysis of oppositional discourse
Korenčić, Damir, Chulvi, Berta, Casals, Xavier Bonet, Toselli, Alejandro, Taulé, Mariona, Rosso, Paolo
The current prevalence of conspiracy theories on the internet is a significant issue, tackled by many computational approaches. However, these approaches fail to recognize the relevance of distinguishing between texts which contain a conspiracy theory and texts which are simply critical and oppose mainstream narratives. Furthermore, little attention is usually paid to the role of inter-group conflict in oppositional narratives. We contribute by proposing a novel topic-agnostic annotation scheme that differentiates between conspiracies and critical texts, and that defines span-level categories of inter-group conflict. We also contribute with the multilingual XAI-DisInfodemics corpus (English and Spanish), which contains a high-quality annotation of Telegram messages related to COVID-19 (5,000 messages per language). We also demonstrate the feasibility of an NLP-based automatization by performing a range of experiments that yield strong baseline solutions. Finally, we perform an analysis which demonstrates that the promotion of intergroup conflict and the presence of violence and anger are key aspects to distinguish between the two types of oppositional narratives, i.e., conspiracy vs. critical.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Norway > Western Norway > Vestland > Bergen (0.04)
- (11 more...)