Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation

Guimarães, Artur, Martins, Bruno, Magalhães, João

Aug-6-2024–arXiv.org Artificial Intelligence

Language Processing (NLP) tasks, including in the assessment of textual entailment relations. However, these models are heavily susceptible to shortcut learning (Du et al., 2023), factual inconsistency (Xie et al., 2023), and performance degradation when exposed to data from specialized domains, such as in the case of medical data.

Our overall best submission to the task achieved a macro F1-score of 0.80 (1st place on the leaderboard), a consistency score of 0.72 (15th), and a faithfulness score of 0.83 (11th). Our method excels in classification accuracy, but fails at being robust to perturbations on the statements, i.e. predicting

clinical trial report, ctr, information, (14 more...)

Country:
- North America > United States (0.04)
- Europe > Portugal
  - Lisbon > Lisbon (0.41)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.55)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.88)
  - Machine Learning > Performance Analysis
    - Accuracy (0.34)
