Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

Bibliographic Details
Title: Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text
Authors: Ebrahimi, Seyedeh Fatemeh; Azari, Karim Akhavan; Iravani, Amirmasoud; Qazvini, Arian; Sadeghi, Pouya; Taghavi, Zeinab Sadat; Sameti, Hossein
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Description: Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately discerning MGTs.
Comment: 8 pages, 3 figures, 2 tables. Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.11774
Accession Number: edsarx.2407.11774
Database: arXiv