Performance in a dialectal profiling task of LLMs for varieties of Brazilian Portuguese

Freitag, Raquel Meister Ko; de Gois, Túlio Sousa

arXiv.org, Artificial Intelligence

Advances in generative AI have enabled near-human responses, a capability central to passing the Turing test Danziger [2018]. Achieving this, however, requires algorithms to replicate ethically questionable human behaviors, including biases learned by large language models (LLMs) Freitag [2021]. Biases can be explicit, consciously manipulated, or implicit, operating unconsciously through automatic associations. These biases affect generative AI in two key areas: the rules and filters applied during LLM fine-tuning, and the linguistic datasets used for training. Yet the specifics of these biases, whether in rules, filters, or dataset selection, remain unclear Bender et al. [2021].