Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation

Guimarães, Artur, Martins, Bruno, Magalhães, João

arXiv.org Artificial Intelligence 

... Language Processing (NLP) tasks, including in the assessment of textual entailment relations. However, these models are heavily susceptible to shortcut learning (Du et al., 2023), factual inconsistency (Xie et al., 2023), and performance degradation when exposed to data from specialized domains, such as in the case of medical data.

Our overall best submission to the task achieved a macro F1-score of 0.80 (1st place on the leaderboard), a consistency score of 0.72 (15th), and a faithfulness score of 0.83 (11th). Our method excels in classification accuracy, but fails at being robust to perturbations on the statements, i.e. predicting ...
