Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

Ebrahimi, Seyedeh Fatemeh, Azari, Karim Akhavan, Iravani, Amirmasoud, Qazvini, Arian, Sadeghi, Pouya, Taghavi, Zeinab Sadat, Sameti, Hossein

Jul-16-2024–arXiv.org Artificial Intelligence

Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. When language models generate text, they often leave discernible traces, which can be examined using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) of SemEval-2024 Task 8, our proposed system achieves an accuracy of 78.9% on the test dataset, ranking 57th among participants. We address this task under limited hardware resources; the resulting system excels at identifying human-written texts but has difficulty accurately identifying MGTs.
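
The abstract reports one overall accuracy figure and notes that the system does better on human-written text than on machine-generated text. A per-class breakdown is one way to surface such a gap. The following is a minimal sketch, not the authors' evaluation code: the label convention (0 = human, 1 = machine), the function name `per_class_report`, and the toy labels are all assumptions made for illustration and do not reproduce any result from the paper.

```python
from collections import Counter

# Assumed label convention for illustration only (not taken from the paper).
HUMAN, MACHINE = 0, 1


def per_class_report(y_true, y_pred):
    """Return (overall accuracy, {label: recall}) for a labeled prediction set."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    total = Counter()    # number of examples per true label
    correct = Counter()  # number of correctly predicted examples per true label
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    accuracy = sum(correct.values()) / len(y_true)
    recall = {label: correct[label] / total[label] for label in total}
    return accuracy, recall


# Hypothetical toy data: every human text is recognized,
# but two of the five machine-generated texts are missed.
y_true = [HUMAN] * 5 + [MACHINE] * 5
y_pred = [HUMAN] * 5 + [MACHINE, MACHINE, MACHINE, HUMAN, HUMAN]

accuracy, recall = per_class_report(y_true, y_pred)
print(f"accuracy={accuracy:.2f}")
print(f"human recall={recall[HUMAN]:.2f}, machine recall={recall[MACHINE]:.2f}")
```

In this toy case the overall accuracy looks reasonable while recall on the machine-generated class is clearly lower, which is the kind of imbalance a single accuracy number can hide.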

large language model, machine learning, natural language

Country:
- North America > Mexico
  - Mexico City > Mexico City (0.04)
- Asia > Middle East
  - Iran
    - Tehran Province > Tehran (0.05)
    - Razavi Khorasan Province > Mashhad (0.04)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
