AITopics | identifying bias

Identifying Algorithmic and Domain-Specific Bias in Parliamentary Debate Summarisation

Cunningham, Eoghan, Cross, James, Greene, Derek

arXiv.org Artificial Intelligence, Jul-22-2025

The automated summarisation of parliamentary debates using large language models (LLMs) offers a promising way to make complex legislative discourse more accessible to the public. However, such summaries must not only be accurate and concise but also equitably represent the views and contributions of all speakers. This paper explores the use of LLMs to summarise plenary debates from the European Parliament and investigates the algorithmic and representational biases that emerge in this context. We propose a structured, multi-stage summarisation framework that improves textual coherence and content fidelity, while enabling the systematic analysis of how speaker attributes -- such as speaking order or political affiliation -- influence the visibility and accuracy of their contributions in the final summaries. Through our experiments using both proprietary and open-weight LLMs, we find evidence of consistent positional and partisan biases, with certain speakers systematically under-represented or misattributed. Our analysis shows that these biases vary by model and summarisation strategy, with hierarchical approaches offering the greatest potential to reduce disparity. These findings underscore the need for domain-sensitive evaluation metrics and ethical oversight in the deployment of LLMs for democratic applications.
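One simple way to probe the positional bias described in this abstract is to check how often speakers at each speaking position are mentioned in a summary. The sketch below is illustrative only: the function name `mention_rate_by_position`, the input layout, and the toy data are assumptions, not the authors' framework or data.

```python
from collections import defaultdict


def mention_rate_by_position(debates):
    """Return {position: share of speakers at that position who appear in the summary}.

    `debates` is a list of (speakers_in_order, speakers_mentioned_in_summary) pairs.
    This input layout is a hypothetical one chosen for the example.
    """
    spoke = defaultdict(int)
    mentioned = defaultdict(int)
    for speakers, in_summary in debates:
        for position, speaker in enumerate(speakers):
            spoke[position] += 1
            if speaker in in_summary:
                mentioned[position] += 1
    return {pos: mentioned[pos] / spoke[pos] for pos in sorted(spoke)}


# Made-up toy data: three debates, each with the set of speakers a summary mentions.
debates = [
    (["A", "B", "C"], {"A", "C"}),
    (["D", "E", "F"], {"D"}),
    (["G", "H"], {"G", "H"}),
]
rates = mention_rate_by_position(debates)
for position, rate in rates.items():
    print(f"position {position}: mention rate {rate:.2f}")
```

A large gap between positions in real data would only be a prompt for closer analysis; this count does not check whether the mentioned contributions are attributed accurately.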

intervention, large language model, natural language, (17 more...)

2507.14221

Country: Europe (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Identifying bias in cluster quality metrics

Renedo-Mirambell, Martí, Arratia, Argimiro

arXiv.org Artificial Intelligence, Jul-8-2025

We study potential biases of popular cluster quality metrics, such as conductance or modularity. We propose a method that uses both stochastic block models and preferential attachment block models to generate networks with preset community structures, to which the quality metrics are then applied. These models also allow us to generate multi-level structures of varying strength, which show whether the metrics favour partitions into a larger or smaller number of clusters. Additionally, we propose another quality metric, the density ratio. We observed that most of the studied metrics tend to favour partitions into a smaller number of big clusters, even when their relative internal and external connectivity is the same. The metrics found to be less biased are modularity and density ratio.
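For readers unfamiliar with the two metrics named in this abstract, the sketch below computes conductance and modularity from their standard textbook definitions on a small made-up graph. It assumes a simple undirected graph given as an edge list with no self-loops, and it does not implement the authors' density ratio or their block-model generators, which the abstract does not define.

```python
def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def conductance(edges, cluster):
    """Cut edges divided by the smaller of the two volumes (sums of degrees).

    Assumes both the cluster and its complement contain at least one edge endpoint.
    """
    cluster = set(cluster)
    deg = degrees(edges)
    cut = sum(1 for u, v in edges if (u in cluster) != (v in cluster))
    vol_in = sum(d for node, d in deg.items() if node in cluster)
    vol_out = sum(deg.values()) - vol_in
    return cut / min(vol_in, vol_out)


def modularity(edges, partition):
    """Newman modularity: sum over clusters of internal/m - (volume / 2m)^2."""
    m = len(edges)
    deg = degrees(edges)
    q = 0.0
    for cluster in partition:
        cluster = set(cluster)
        internal = sum(1 for u, v in edges if u in cluster and v in cluster)
        volume = sum(deg.get(node, 0) for node in cluster)
        q += internal / m - (volume / (2 * m)) ** 2
    return q


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = [{0, 1, 2}, {3, 4, 5}]
print("conductance of {0,1,2}:", conductance(edges, {0, 1, 2}))
print("modularity of the two-triangle split:", modularity(edges, partition))
```

On this toy graph the conductance of one triangle is 1/7 (one cut edge over a volume of 7) and the modularity of the two-triangle split is 5/14.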

artificial intelligence, data mining, machine learning, (18 more...)

doi: 10.7717/peerj-cs.1523

2112.06287

Country: Europe > Spain (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

Chaudhari, Bhushan, Agarwal, Akash, Bhowmik, Tanmoy

arXiv.org Artificial Intelligence, Oct-24-2022

Machine learning models built on datasets containing discriminative instances attributed to various underlying factors result in biased and unfair outcomes. It is a well-founded and intuitive observation that existing bias mitigation strategies often sacrifice accuracy in order to ensure fairness. But when an AI engine's predictions are used for decisions that affect revenue or operational efficiency, as in credit risk modelling, the business would prefer that accuracy be reasonably preserved. This conflicting requirement of maintaining both accuracy and fairness in AI motivates our research. In this paper, we propose a fresh approach for simultaneous improvement of fairness and accuracy of ML models within a realistic paradigm. The essence of our work is a data preprocessing technique that detects instances exhibiting a specific kind of bias, which should be removed from the dataset before training, and we further show that such instance removal has no adverse impact on model accuracy. In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset, which can be identified and mitigated through our novel scheme. Our experimental evaluation on two open-source datasets demonstrates how the proposed method can mitigate bias while improving rather than degrading accuracy, and while offering a degree of control to the end user.
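The setting this abstract describes, instances with similar features but different labels that differ in a protected attribute, can be illustrated with a rough sketch. The code below is not the authors' scheme: it substitutes exact matching for "similar", and the function name, column names, and toy rows are all assumptions made for the example. It only flags candidate instances and leaves the removal decision to the reader.

```python
from collections import defaultdict


def flag_conflicting_instances(rows, protected, label):
    """Return indices of rows that share all non-protected features with another row
    but sit in a group where both the label and the protected attribute vary.

    Exact matching stands in for a real similarity measure (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(sorted((k, v) for k, v in row.items() if k not in (protected, label)))
        groups[key].append(index)

    flagged = []
    for indices in groups.values():
        labels = {rows[i][label] for i in indices}
        protected_values = {rows[i][protected] for i in indices}
        if len(labels) > 1 and len(protected_values) > 1:
            flagged.extend(indices)
    return sorted(flagged)


# Made-up credit-style rows; "gender" plays the protected attribute.
rows = [
    {"income": 50, "debt": 10, "gender": "F", "approved": 0},
    {"income": 50, "debt": 10, "gender": "M", "approved": 1},
    {"income": 80, "debt": 5, "gender": "F", "approved": 1},
    {"income": 80, "debt": 5, "gender": "M", "approved": 1},
    {"income": 30, "debt": 20, "gender": "M", "approved": 0},
]
flagged = flag_conflicting_instances(rows, protected="gender", label="approved")
print("rows flagged for review:", flagged)
```

Here only the first two rows are flagged: they match on income and debt but differ in both gender and outcome.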

artificial intelligence, dataset, machine learning, (14 more...)

2210.13182

Country:

North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.65)

Industry: Banking & Finance > Credit (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.49)

Identifying biases in legal data: An algorithmic fairness perspective

Sargent, Jackson, Weber, Melanie

arXiv.org Machine Learning, Sep-21-2021

As artificial intelligence enters the legal space, it is essential to recognize biases in legal data and ensure that they are not replicated and reinforced with legal technology [7, 13, 18]. Furthermore, understanding biases in legal data and developing discrimination-free technology could help the legal space to become fairer and more widely accessible. We typically find two types of biases in legal data: first, representation biases, i.e., certain social groups are over- or underrepresented in a data set; second, sentencing disparities, i.e., the outcome of legal proceedings for similar cases varies across social groups. Representation biases may reflect disparities in policing (arrest rates) or in offense rates.
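The two bias types named in this excerpt can each be given a simple first-pass measurement. The sketch below is an illustration under stated assumptions, not the authors' method: representation is measured as a group's share of the data divided by a reference population share, and "similar cases" is approximated by grouping on offense type alone. All names and numbers are made up.

```python
from collections import Counter, defaultdict


def representation_ratio(groups, population_share):
    """Share of each group in the data divided by its reference population share.

    Values above 1 suggest over-representation, below 1 under-representation.
    """
    counts = Counter(groups)
    total = len(groups)
    return {g: (counts[g] / total) / share for g, share in population_share.items()}


def mean_sentence_by_offense_and_group(records):
    """Average sentence per (offense, group); records are (offense, group, months)."""
    cells = defaultdict(lambda: [0.0, 0])
    for offense, group, months in records:
        cell = cells[(offense, group)]
        cell[0] += months
        cell[1] += 1
    return {key: total / count for key, (total, count) in cells.items()}


# Made-up records and reference shares, for illustration only.
records = [
    ("theft", "A", 10), ("theft", "A", 14),
    ("theft", "B", 6), ("theft", "B", 8),
    ("fraud", "A", 20), ("fraud", "B", 20),
]
ratios = representation_ratio([g for _, g, _ in records], {"A": 0.25, "B": 0.75})
means = mean_sentence_by_offense_and_group(records)
print("representation ratios:", ratios)
print("mean sentence by offense and group:", means)
```

In the toy data, group A makes up half the records against a reference share of one quarter, and receives longer average sentences for theft but not for fraud. A real analysis would need to control for far more case characteristics than offense type.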

demographic group, fairness metric, legal data, (13 more...)

2109.09946

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.94)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

Identifying Bias in Hospital Length of Stay Algorithm

#artificialintelligence, May-1-2020, 22:40:02 GMT

Recognizing the need to support shorter lengths of stay, Dr. John Fahrenbach, a data scientist at the University of Chicago Medicine (UCM), developed a machine learning model that used clinical characteristics to identify patients most suitable for discharge after 48 hours. Using this tool, the hospital could ensure the timely release of specific patients by allocating and prioritizing care management resources, including discharge planning, home health services, and clinician or patient administrative assistance. During the development process, Dr. Fahrenbach's team determined that including zip codes as a feature increased the model's predictive accuracy. After introducing zip codes into the model, however, a team member who reviewed the output raised concerns. "We know Chicago's patient population and knew something was off when stratifying the model by race," said Dr. Fahrenbach.
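Stratifying a model's output by a demographic attribute, as the team describes, can be as simple as comparing how often the model flags patients in each group. The sketch below is a hypothetical illustration: the function name, the group labels, and the predictions are invented, and nothing here reflects UCM's actual model or data.

```python
from collections import defaultdict


def flag_rate_by_group(predictions, groups):
    """Share of patients in each group that the model flags (prediction == 1)."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        total[group] += 1
        flagged[group] += int(prediction)
    return {group: flagged[group] / total[group] for group in total}


# Invented predictions (1 = flagged as suitable for early discharge) and group labels.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["X", "X", "X", "X", "Y", "Y", "Y", "Y"]
flag_rates = flag_rate_by_group(predictions, groups)
print(flag_rates)
```

A gap such as 0.75 versus 0.25 does not prove bias by itself, but it is the kind of signal that would justify a closer look at features like zip code that may act as proxies for race.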

fahrenbach, identifying bias, zip code, (10 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.49)

Industry: Health & Medicine > Health Care Providers & Services (0.94)

Technology:

Information Technology > Data Science (0.98)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
