Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian

Bibliographic details
Title:	Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian
Authors:	Shamardina, Tatiana; Mikhailov, Vladislav; Chernianskii, Daniil; Fenogenova, Alena; Saidov, Marat; Valeeva, Anastasiya; Shavrina, Tatiana; Smurov, Ivan; Tutubalina, Elena; Artemova, Ekaterina
Publication year:	2022
Collection:	Computer Science
Subject terms:	Computer Science - Computation and Language
Description:	We present the shared task on artificial text detection in Russian, organized as part of the Dialogue Evaluation initiative held in 2022. The shared task dataset includes texts from 14 text generators, i.e., one human writer and 13 text generative models fine-tuned for one or more of the following generation tasks: machine translation, paraphrase generation, text summarization, and text simplification. We also consider back-translation and zero-shot generation approaches. The human-written texts are collected from publicly available resources across multiple domains. The shared task consists of two sub-tasks: (i) to determine whether a given text is automatically generated or written by a human; (ii) to identify the author of a given text. The first sub-task is framed as a binary classification problem, and the second as a multi-class classification problem. We provide count-based and BERT-based baselines, along with a human evaluation on the first sub-task. A total of 30 and 8 systems were submitted to the binary and multi-class sub-tasks, respectively. Most teams outperform the baselines by a wide margin. We publicly release our codebase, human evaluation results, and other materials in our GitHub repository (https://github.com/dialogue-evaluation/RuATD). Comment: Accepted to Dialogue-22
Document type:	Working Paper
DOI:	10.28995/2075-7182-2022-21-497-511
Access URL:	http://arxiv.org/abs/2206.01583
Accession number:	edsarx.2206.01583
Database:	arXiv
