Marginalisation Operation. Suppose V can be split as V = Ṽ ∪ Ũ, where we are interested in the causal structure over Ṽ and do not observe the variables Ũ. For each U ∈ Ũ, add an edge Z₁ → Z₂ whenever the current graph contains Z₁ → U → Z₂, and then remove U.

Figure 5: Example of step one in the marginalisation, taken from Evans [22].

The explanation for the biases is given in Appendix E. In this section we analyse the datasets presented in Le Quy et al. [43] for the three biases we present; for each bias we provide a justification of our decision. The COMPAS data have been well documented to exhibit all of these biases and more [5], and we therefore drop them from the analysis.
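As a sketch, the step described above can be read as a latent-projection pass over a directed graph. The snippet below is not from the paper; the use of networkx, the function name, and the toy graph are illustrative assumptions.

```python
# Minimal sketch of step one of the marginalisation described above
# (latent projection); names and the example graph are assumptions.
import networkx as nx

def marginalise_step_one(g: nx.DiGraph, hidden: set) -> nx.DiGraph:
    """For each unobserved U, add Z1 -> Z2 whenever Z1 -> U -> Z2, then remove U."""
    g = g.copy()
    for u in hidden:
        for z1 in list(g.predecessors(u)):
            for z2 in list(g.successors(u)):
                if z1 != z2:
                    g.add_edge(z1, z2)
        g.remove_node(u)
    return g

# Example: A -> U -> Y with U unobserved becomes A -> Y.
g = nx.DiGraph([("A", "U"), ("U", "Y")])
print(sorted(marginalise_step_one(g, {"U"}).edges()))  # [('A', 'Y')]
```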
The Fragility of Fairness: Causal Sensitivity Analysis for Fair Machine Learning
Department of Statistics, University of Oxford
Fairness metrics are a core tool in the fair machine learning literature (FairML), used to determine that ML models are, in some sense, "fair." Real-world data, however, are typically plagued by various measurement biases and other violated assumptions, which can render fairness assessments meaningless. We adapt tools from causal sensitivity analysis to the FairML context, providing a general framework which (1) accommodates effectively any combination of fairness metric and bias that can be posed in the "oblivious setting"; (2) allows researchers to investigate combinations of biases, resulting in non-linear sensitivity; and (3) enables flexible encoding of domain-specific constraints and assumptions. Employing this framework, we analyze the sensitivity of the most common parity metrics under 3 varieties of classifier across 14 canonical fairness datasets. Our analysis reveals the striking fragility of fairness assessments to even minor dataset biases. We show that causal sensitivity analysis provides a powerful and necessary toolkit for gauging the informativeness of parity metric evaluations. Our repository is available here.
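To make the object of the analysis concrete, here is a minimal sketch of a parity metric together with a crude perturbation probe of its sensitivity to group-label errors. The noise model, the error rate eps, and all function names are illustrative assumptions; this is not the paper's causal sensitivity analysis.

```python
import numpy as np

def demographic_parity_gap(y_hat, a):
    """|P(Yhat = 1 | A = 0) - P(Yhat = 1 | A = 1)| on the observed data."""
    y_hat, a = np.asarray(y_hat), np.asarray(a)
    return abs(y_hat[a == 0].mean() - y_hat[a == 1].mean())

def gap_under_label_noise(y_hat, a, eps=0.05, trials=200, seed=0):
    """Recompute the gap after randomly flipping a fraction eps of the recorded
    group labels; returns the range of gaps seen (a crude probe, not a bound)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a)
    gaps = []
    for _ in range(trials):
        flip = rng.random(a.shape[0]) < eps
        gaps.append(demographic_parity_gap(y_hat, np.where(flip, 1 - a, a)))
    return min(gaps), max(gaps)

rng = np.random.default_rng(1)
a = rng.integers(0, 2, 2000)
y_hat = (rng.random(2000) < 0.4 + 0.1 * a).astype(int)
print(demographic_parity_gap(y_hat, a), gap_under_label_noise(y_hat, a))
```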
Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions
As machine learning (ML) models are increasingly used in social domains to make consequential decisions about humans, they often have the power to reshape data distributions. Humans, as strategic agents, continuously adapt their behaviors in response to the learning system. As populations change dynamically, ML systems may need frequent updates to ensure high performance. However, acquiring high-quality human-annotated samples can be highly challenging and even infeasible in social domains. A common practice to address this issue is using the model itself to annotate unlabeled data samples.
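A minimal sketch of the self-annotation practice described above, in the spirit of pseudo-labelling: retrain on the model's own confident predictions. The classifier, confidence threshold, and loop structure are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_annotate_rounds(X_lab, y_lab, X_unlab, rounds=3, conf=0.9):
    """Repeatedly pseudo-label unlabeled points the model is confident about
    and retrain on them; a toy version of model self-annotation."""
    model = LogisticRegression().fit(X_lab, y_lab)
    X_pool, X_train, y_train = X_unlab, X_lab, y_lab
    for _ in range(rounds):
        if len(X_pool) == 0:
            break
        confident = model.predict_proba(X_pool).max(axis=1) >= conf
        if not confident.any():
            break
        X_train = np.vstack([X_train, X_pool[confident]])
        y_train = np.concatenate([y_train, model.predict(X_pool[confident])])
        X_pool = X_pool[~confident]
        model = LogisticRegression().fit(X_train, y_train)
    return model

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=300) > 0).astype(int)
model = self_annotate_rounds(X[:50], y[:50], X[50:])
print(model.score(X, y))
```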
Offline Contextual Bandits with High Probability Fairness Guarantees
Blossom Metevier, Stephen Giguere, Sarah Brockman
We present RobinHood, an offline contextual bandit algorithm designed to satisfy a broad family of fairness constraints. Our algorithm accepts multiple fairness definitions and allows users to construct their own unique fairness definitions for the problem at hand. We provide a theoretical analysis of RobinHood, which includes a proof that it will not return an unfair solution with probability greater than a user-specified threshold. We validate our algorithm on three applications: a user study with an automated tutoring system, a loan approval setting using the Statlog German credit data set, and a criminal recidivism problem using data released by ProPublica. To demonstrate the versatility of our approach, we use multiple well-known and custom definitions of fairness. In each setting, our algorithm is able to produce fair policies that achieve performance competitive with other offline and online contextual bandit algorithms.
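A minimal sketch of the kind of high-probability check this abstract describes: a candidate is returned only if a one-sided confidence bound on its unfairness stays below a user-specified tolerance. The Hoeffding bound and all names here are illustrative assumptions, not RobinHood itself.

```python
import numpy as np

def hoeffding_upper_bound(samples, delta):
    """One-sided (1 - delta) upper confidence bound on the mean of samples in [0, 1]."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    return samples.mean() + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def passes_fairness_test(unfairness_samples, tolerance=0.1, delta=0.05):
    """Accept the candidate only if, with probability at least 1 - delta,
    its unfairness does not exceed the tolerance."""
    return hoeffding_upper_bound(unfairness_samples, delta) <= tolerance

# Example: per-sample contributions to a disparity measure, assumed to lie in [0, 1].
rng = np.random.default_rng(0)
print(passes_fairness_test(rng.random(5000) * 0.1))
```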
Paradoxes in Fair Machine Learning
Paul Goelz, Anson Kahng, Ariel D. Procaccia
Equalized odds is a statistical notion of fairness in machine learning that ensures that classification algorithms do not discriminate against protected groups. We extend equalized odds to the setting of cardinality-constrained fair classification, where we have a bounded amount of a resource to distribute. This setting coincides with classic fair division problems, which allows us to apply concepts from that literature in parallel to equalized odds. In particular, we consider the axioms of resource monotonicity, consistency, and population monotonicity, all three of which relate different allocation instances to prevent paradoxes. Using a geometric characterization of equalized odds, we examine the compatibility of equalized odds with these axioms. We empirically evaluate the cost of allocation rules that satisfy both equalized odds and axioms of fair division on a dataset of FICO credit scores.
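As a concrete reference point, the sketch below computes the equalized-odds gaps (TPR and FPR differences between two groups) for a cardinality-constrained allocation that hands the bounded resource to the k highest scores. The data, scores, and function names are illustrative assumptions, not the paper's allocation rules.

```python
import numpy as np

def equalized_odds_gaps(y_true, y_hat, a):
    """Absolute TPR and FPR differences between the two groups."""
    y_true, y_hat, a = map(np.asarray, (y_true, y_hat, a))
    gaps = []
    for positive in (1, 0):  # TPR computed over y = 1, FPR over y = 0
        rates = [y_hat[(a == g) & (y_true == positive)].mean() for g in (0, 1)]
        gaps.append(abs(rates[0] - rates[1]))
    return tuple(gaps)

def allocate_top_k(scores, k):
    """Cardinality-constrained allocation: give the resource to the k highest scores."""
    y_hat = np.zeros(len(scores), dtype=int)
    y_hat[np.argsort(scores)[-k:]] = 1
    return y_hat

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 1000)
y_true = rng.integers(0, 2, 1000)
scores = y_true * 0.5 + rng.random(1000)
print(equalized_odds_gaps(y_true, allocate_top_k(scores, k=200), a))
```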
A Benchmark Task Details
The risk for lead exposure is disproportionately higher for children who are poor, non-Hispanic black, living in large metropolitan areas, or living in older housing. The CDC sets a national standard for blood lead levels in children. This value was established in 2012 to be 3.5 micrograms per deciliter (µg/dL) of blood.
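For illustration only, a binary "elevated" label could be derived from measured levels using the reference value quoted above; the threshold rule and names below are assumptions, not part of the benchmark specification.

```python
# Illustrative only: label a reading as elevated if it meets or exceeds
# the reference value quoted above (3.5 ug/dL).
BLOOD_LEAD_REFERENCE_UG_DL = 3.5

def label_elevated(readings_ug_dl):
    return [1 if r >= BLOOD_LEAD_REFERENCE_UG_DL else 0 for r in readings_ug_dl]

print(label_elevated([1.2, 3.5, 7.0]))  # [0, 1, 1]
```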
Fair Wasserstein Coresets
Freddy Lecue
Data distillation and coresets have emerged as popular approaches to generate a smaller representative set of samples for downstream learning tasks to handle large-scale datasets. At the same time, machine learning is being increasingly applied to decision-making processes at a societal level, making it imperative for modelers to address inherent biases towards subgroups present in the data. While current approaches focus on creating fair synthetic representative samples by optimizing local properties relative to the original samples, their impact on downstream learning processes has yet to be explored. In this work, we present fair Wasserstein coresets (FWC), a novel coreset approach which generates fair synthetic representative samples along with sample-level weights to be used in downstream learning tasks. FWC uses an efficient majorization-minimization algorithm to minimize the Wasserstein distance between the original dataset and the weighted synthetic samples while enforcing demographic parity. We show that an unconstrained version of FWC is equivalent to Lloyd's algorithm for k-medians and k-means clustering. Experiments conducted on both synthetic and real datasets show that FWC: (i) achieves a competitive fairness-utility tradeoff in downstream models compared to existing approaches, (ii) improves downstream fairness when added to the existing training data, and (iii) can be used to reduce biases in predictions from large language models (GPT-3.5 and GPT-4).
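A minimal sketch of the unconstrained special case mentioned in the abstract: Lloyd's algorithm (k-means) returns representative points weighted by cluster mass. The use of scikit-learn and all names are assumptions; this is not the FWC algorithm itself.

```python
import numpy as np
from sklearn.cluster import KMeans

def weighted_coreset(X, k=10, seed=0):
    """Unconstrained special case noted in the abstract: Lloyd's algorithm
    (k-means) yields k representative points, each weighted by the fraction
    of original samples assigned to its cluster."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    weights = np.bincount(km.labels_, minlength=k) / len(X)
    return km.cluster_centers_, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
centers, weights = weighted_coreset(X)
print(centers.shape, float(weights.sum()))  # (10, 3) 1.0
```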