Report
PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text?
Title: | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? |
---|---|
Authors: | Petukhova, Kseniia; Kazakov, Roman; Kochmar, Ekaterina |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computation and Language, Computer Science - Artificial Intelligence, I.2.7 |
Description: | In this paper, we present our submission to SemEval-2024 Task 8, "Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection", focusing on the detection of machine-generated texts (MGTs) in English. Specifically, our approach combines embeddings from RoBERTa-base with diversity features and uses a resampled training set. We rank 12th out of 124 for Subtask A (monolingual track), and our results show that our approach generalizes across unseen models and domains, achieving an accuracy of 0.91. Comment: 8 pages, 3 figures, 5 tables; to be published in the Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). For associated code, see https://github.com/sachertort/petkaz-semeval-m4 |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2404.05483 |
Accession Number: | edsarx.2404.05483 |
Database: | arXiv |