On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity

Rivas, Pablo, Cerny, Tomas, Perez, Alejandro Rodriguez, Turek, Javier, Giddens, Laurie, Bichler, Gisela, Petter, Stacie

arXiv.org Artificial Intelligence 

Our study addresses the challenges of building datasets to understand the risks associated with organized activities and human trafficking through commercial sex advertisements. These challenges include data scarcity, rapid obsolescence, and privacy concerns. Traditional approaches, which are not automated and are difficult to reproduce, fall short in addressing these issues. We have developed a reproducible and automated methodology to analyze five million advertisements. In the process, we identified further challenges in dataset creation within this sensitive domain. This paper presents a streamlined methodology to assist researchers in constructing effective datasets for combating organized crime, allowing them to focus on ...

Figure 1: Methodology to generate a pseudo-labeled dataset in human trafficking risk prediction and organized activity detection tasks.
