human entity
Social Biases in Knowledge Representations of Wikidata separates Global North from Global South
Das, Paramita, Karnam, Sai Keerthana, Soni, Aditya, Mukherjee, Animesh
Knowledge Graphs have become increasingly popular due to their wide usage in various downstream applications, including information retrieval, chatbot development, language model construction, and many others. Link prediction (LP) is a crucial downstream task for knowledge graphs, as it helps to address the problem of the incompleteness of the knowledge graphs. However, previous research has shown that knowledge graphs, often created in a (semi-)automatic manner, are not free from social biases. These biases can have harmful effects on downstream applications, especially by leading to unfair behavior toward minority groups. To understand this issue in detail, we develop a framework -- AuditLP -- deploying fairness metrics to identify biased outcomes in LP, specifically how occupations are classified as either male- or female-dominated based on gender as a sensitive attribute. We have experimented with the sensitive attribute of age and observed that occupations are categorized as young-biased, old-biased, and age-neutral. We conduct our experiments on a large number of knowledge triples that belong to 21 different geographies extracted from the open-sourced knowledge graph, Wikidata. Our study shows that the variance in the biased outcomes across geographies neatly mirrors the socio-economic and cultural division of the world, resulting in a transparent partition of the Global North from the Global South.
- Europe > Germany (0.05)
- Europe > France (0.05)
- North America > Mexico (0.05)
- Leisure & Entertainment > Sports (1.00)
- Media (0.68)
- Health & Medicine (0.68)
- Banking & Finance (0.68)
The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs
Sant, Aleix, Escolano, Carlos, Mash, Audrey, Fornaciari, Francesca De Luca, Melero, Maite
This paper studies gender bias in machine translation through the lens of Large Language Models (LLMs). Four widely used test sets are employed to benchmark various base LLMs, comparing their translation quality and gender bias against state-of-the-art Neural Machine Translation (NMT) models for English to Catalan (En → Ca) and English to Spanish (En → Es) translation directions. Our findings reveal pervasive gender bias across all models, with base LLMs exhibiting a higher degree of bias compared to NMT models. To combat this bias, we explore prompt engineering techniques applied to an instruction-tuned LLM. We identify a prompt structure that significantly reduces gender bias by up to 12% on the WinoMT evaluation dataset compared to more straightforward prompts. These results significantly reduce the gender bias accuracy gap between LLMs and traditional NMT systems.
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.05)
- South America > Argentina > Pampas > Buenos Aires Province (0.04)
- Africa > Southern Africa (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Linking Named Entities in Diderot's Encyclopédie to Wikidata
Diderot's Encyclopédie is a reference work from the XVIIIth century in Europe that aimed at collecting the knowledge of its era. Wikipedia has the same ambition with a much greater scope. However, the lack of digital connection between the two encyclopedias may hinder their comparison and the study of how knowledge has evolved. A key element of Wikipedia is Wikidata, which backs the articles with a graph of structured data. In this paper, we describe the annotation of more than 10,300 of the Encyclopédie entries with Wikidata identifiers, enabling us to connect these entries to the graph. We considered geographic and human entities. The Encyclopédie does not contain biographic entries as they mostly appear as subentries of locations. We extracted all the geographic entries and we completely annotated all the entries containing a description of human entities. This represents more than 2,600 links referring to locations or human entities. In addition, we annotated more than 9,500 entries with geographic content only. We describe the annotation process as well as application examples. This resource is available at https://github.com/pnugues/encyclopedie_1751
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.07)
- Europe > Italy (0.05)
- Europe > Greece (0.05)
Towards the Human Digital Twin: Definition and Design -- A survey
Lauer-Schmaltz, Martin Wolfgang, Cash, Philip, Hansen, John Paulin, Maier, Anja
Digital Twins (DTs) are a critical technology for digitalizing physical entities in domains ranging from industry to city planning [1, 2]. DTs' ability to continuously adapt to a physical entity's state, simulate future events, and actively influence feedback and decision processes goes significantly beyond traditional digital models, which are mere representations [3]. Thus, Industry 4.0 has started using DTs--along with other cutting-edge technologies, such as the Internet of Things (IoT), Big Data, and Artificial Intelligence (AI)--to significantly increase the efficiency and safety of both products and processes [3]. Further, due to DTs' real-time monitoring and simulation capabilities, they are being increasingly adapted to domains such as healthcare to meet demands for individualized diagnostics and treatment [4].
- North America > United States (1.00)
- Europe > Denmark (0.28)
- Europe > United Kingdom > England (0.14)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.67)