SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Bibliographic Details
Title: SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection
Authors: Wang, Yuxia; Mansurov, Jonibek; Ivanov, Petar; Su, Jinyan; Shelmanov, Artem; Tsvigun, Akim; Afzal, Osama Mohammed; Mahmoud, Tarek; Puccetti, Giovanni; Arnold, Thomas; Whitehouse, Chenxi; Aji, Alham Fikri; Habash, Nizar; Gurevych, Iryna; Nakov, Preslav
Source: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language
Description: We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. This subtask has two tracks: a monolingual track focused solely on English texts and a multilingual track. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.
Comment: 23 pages, 12 tables
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2404.14183
Accession Number: edsarx.2404.14183
Database: arXiv