On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity

Bibliographic Details
Title: On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity
Authors: Rivas, Pablo; Cerny, Tomas; Perez, Alejandro Rodriguez; Turek, Javier; Giddens, Laurie; Bichler, Gisela; Petter, Stacie
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, I.2.7
Description: Our study addresses the challenges of building datasets to understand the risks associated with organized activities and human trafficking through commercial sex advertisements. These challenges include data scarcity, rapid obsolescence, and privacy concerns. Traditional approaches, which are not automated and are difficult to reproduce, fall short in addressing these issues. We have developed a reproducible and automated methodology to analyze five million advertisements. In the process, we identified further challenges in dataset creation within this sensitive domain. This paper presents a streamlined methodology to assist researchers in constructing effective datasets for combating organized crime, allowing them to focus on advancing detection technologies.
Comment: LXAI Workshop at the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2405.13348
Accession Number: edsarx.2405.13348
Database: arXiv