grp
A Unifying Human-Centered AI Fairness Framework
Rahman, Munshi Mahbubur, Pan, Shimei, Foulds, James R.
The increasing use of Artificial Intelligence (AI) in critical societal domains has amplified concerns about fairness, particularly regarding unequal treatment across sensitive attributes such as race, gender, and socioeconomic status. While there has been substantial work on ensuring AI fairness, navigating trade-offs between competing notions of fairness as well as predictive accuracy remains challenging, creating barriers to the practical deployment of fair AI systems. To address this, we introduce a unifying human-centered fairness framework that systematically covers eight distinct fairness metrics, formed by combining individual and group fairness, infra-marginal and intersectional assumptions, and outcome-based and equality-of-opportunity (EOO) perspectives. This structure allows stakeholders to align fairness interventions with their values and contextual considerations. The framework uses a consistent and easy-to-understand formulation for all metrics to reduce the learning curve for non-experts. Rather than privileging a single fairness notion, the framework enables stakeholders to assign weights across multiple fairness objectives, reflecting their priorities and facilitating multi-stakeholder compromises. We apply this approach to four real-world datasets: the UCI Adult census dataset for income prediction, the COMPAS dataset for criminal recidivism, the German Credit dataset for credit risk assessment, and the MEPS dataset for healthcare utilization. We show that adjusting weights reveals nuanced trade-offs between different fairness metrics. Finally, through case studies in judicial decision-making and healthcare, we demonstrate how the framework can inform practical and value-sensitive deployment of fair AI systems.
- North America > United States > Maryland > Baltimore County (0.14)
- North America > United States > Maryland > Baltimore (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Law > Civil Rights & Constitutional Law (1.00)
- Health & Medicine (1.00)
- Government (1.00)
- Banking & Finance > Credit (0.88)
Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
Li, Xinghan, Wen, Haodong, Lyu, Kaifeng
Despite the popularity of the Adam optimizer in practice, most theoretical analyses study Stochastic Gradient Descent (SGD) as a proxy for Adam, and little is known about how the solutions found by Adam differ. In this paper, we show that Adam implicitly reduces a unique form of sharpness measure shaped by its adaptive updates, leading to qualitatively different solutions from SGD. More specifically, when the training loss is small, Adam wanders around the manifold of minimizers and takes semi-gradients to minimize this sharpness measure in an adaptive manner, a behavior we rigorously characterize through a continuous-time approximation using stochastic differential equations. We further demonstrate how this behavior differs from that of SGD in a well-studied setting: when training overparameterized models with label noise, SGD has been shown to minimize the trace of the Hessian matrix, $\tr(\mH)$, whereas we prove that Adam minimizes $\tr(\Diag(\mH)^{1/2})$ instead. In solving sparse linear regression with diagonal linear networks, this distinction enables Adam to achieve better sparsity and generalization than SGD. Finally, our analysis framework extends beyond Adam to a broad class of adaptive gradient methods, including RMSProp, Adam-mini, Adalayer and Shampoo, and provides a unified perspective on how these adaptive optimizers reduce sharpness, which we hope will offer insights for future optimizer design.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
Deep reinforcement learning reveals fewer sensors are needed for autonomous gust alleviation
Haughn, Kevin PT., Harvey, Christina, Inman, Daniel J.
Although both the public sector and defense agencies are interested in urban uncrewed aerial vehicle (UAV) mission performance, fixed winged aircraft are still incapable of adapting to the complex aerodynamics within a city environment [1, 2, 3, 4, 5, 6]. Currently, the most dynamic environments are dominated by multirotor flight vehicles; however, the highly maneuverable and responsive quadrotor design suffers from substantial weight and power constraints, limiting the operational range and on-board computational capabilities needed for autonomy [7, 8, 9, 10]. Current fixed wing UAVs have greater range but are not as maneuverable [11]. Counter to both rotorcraft and traditional fixed wing UAV design, birds can adapt their wing shape as the environment changes to achieve both efficient and maneuverable flight [12]. This ability supports birds of prey in navigating through complex environments [13], or rejecting perturbations in a gusty environment [14, 15].
- Europe > United Kingdom > England (0.28)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > District of Columbia > Washington (0.14)
- North America > United States > California > Yolo County > Davis (0.14)
- Transportation > Air (1.00)
- Energy > Oil & Gas > Upstream (1.00)
- Aerospace & Defense > Aircraft (1.00)
- (2 more...)
Why (and When) does Local SGD Generalize Better than SGD?
Gu, Xinran, Lyu, Kaifeng, Huang, Longbo, Arora, Sanjeev
Local SGD is a communication-efficient variant of SGD for large-scale training, where multiple GPUs perform SGD independently and average the model parameters periodically. It has been recently observed that Local SGD can not only achieve the design goal of reducing the communication overhead but also lead to higher test accuracy than the corresponding SGD baseline (Lin et al., 2020b), though the training regimes for this to happen are still in debate (Ortiz et al., 2021). This paper aims to understand why (and when) Local SGD generalizes better based on Stochastic Differential Equation (SDE) approximation. The main contributions of this paper include (i) the derivation of an SDE that captures the long-term behavior of Local SGD in the small learning rate regime, showing how noise drives the iterate to drift and diffuse after it has reached close to the manifold of local minima, (ii) a comparison between the SDEs of Local SGD and SGD, showing that Local SGD induces a stronger drift term that can result in a stronger effect of regularization, e.g., a faster reduction of sharpness, and (iii) empirical evidence validating that having a small learning rate and long enough training time enables the generalization improvement over SGD but removing either of the two conditions leads to no improvement.
MEDFAIR: Benchmarking Fairness for Medical Imaging
Zong, Yongshuo, Yang, Yongxin, Hospedales, Timothy
A multitude of work has shown that machine learning-based medical diagnosis systems can be biased against certain subgroups of people. This has motivated a growing number of bias mitigation algorithms that aim to address fairness issues in machine learning. However, it is difficult to compare their effectiveness in medical imaging for two reasons. First, there is little consensus on the criteria to assess fairness. Second, existing bias mitigation algorithms are developed under different settings, e.g., datasets, model selection strategies, backbones, and fairness metrics, making a direct comparison and evaluation based on existing results impossible. In this work, we introduce MEDFAIR, a framework to benchmark the fairness of machine learning models for medical imaging. MEDFAIR covers eleven algorithms from various categories, nine datasets from different imaging modalities, and three model selection criteria. Through extensive experiments, we find that the under-studied issue of model selection criterion can have a significant impact on fairness outcomes; while in contrast, state-of-the-art bias mitigation algorithms do not significantly improve fairness outcomes over empirical risk minimization (ERM) in both in-distribution and out-of-distribution settings. We evaluate fairness from various perspectives and make recommendations for different medical application scenarios that require different ethical principles. Our framework provides a reproducible and easy-to-use entry point for the development and evaluation of future bias mitigation algorithms in deep learning. Code is available at https://github.com/ys-zong/MEDFAIR.
- North America > United States > New York (0.04)
- North America > United States > California > Monterey County > Monterey (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.45)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.46)
- Government > Regional Government > North America Government > United States Government (0.45)
Privacy Preserving QoE Modeling using Collaborative Learning
Ickin, Selim, Vandikas, Konstantinos, Fiedler, Markus
Machine Learning based Quality of Experience (QoE) models potentially suffer from over-fitting due to limitations including low data volume, and limited participant profiles. This prevents models from becoming generic. Consequently, these trained models may under-perform when tested outside the experimented population. One reason for the limited datasets, which we refer in this paper as small QoE data lakes, is due to the fact that often these datasets potentially contain user sensitive information and are only collected throughout expensive user studies with special user consent. Thus, sharing of datasets amongst researchers is often not allowed. In recent years, privacy preserving machine learning models have become important and so have techniques that enable model training without sharing datasets but instead relying on secure communication protocols. Following this trend, in this paper, we present Round-Robin based Collaborative Machine Learning model training, where the model is trained in a sequential manner amongst the collaborated partner nodes. We benchmark this work using our customized Federated Learning mechanism as well as conventional Centralized and Isolated Learning methods.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
- Information Technology > Data Science > Data Mining > Big Data (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Learning Generalized Reactive Policies using Deep Neural Networks
Groshev, Edward, Goldstein, Maxwell, Tamar, Aviv, Srivastava, Siddharth, Abbeel, Pieter
We present a new approach to learning for planning, where knowledge acquired while solving a given set of planning problems is used to plan faster in related, but new problem instances. We show that a deep neural network can be used to learn and represent a \emph{generalized reactive policy} (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances. In contrast to prior efforts in this direction, our approach significantly reduces the dependence of learning on handcrafted domain knowledge or feature selection. Instead, the GRP is trained from scratch using a set of successful execution traces. We show that our approach can also be used to automatically learn a heuristic function that can be used in directed search algorithms. We evaluate our approach using an extensive suite of experiments on two challenging planning problem domains and show that our approach facilitates learning complex decision making policies and powerful heuristic functions with minimal human input. Videos of our results are available at goo.gl/Hpy4e3.
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
The p-filter: multi-layer FDR control for grouped hypotheses
Barber, Rina Foygel, Ramdas, Aaditya
In many practical applications of multiple hypothesis testing using the False Discovery Rate (FDR), the given hypotheses can be naturally partitioned into groups, and one may not only want to control the number of false discoveries (wrongly rejected null hypotheses), but also the number of falsely discovered groups of hypotheses (we say a group is falsely discovered if at least one hypothesis within that group is rejected, when in reality the group contains only nulls). In this paper, we introduce the p-filter, a procedure which unifies and generalizes the standard FDR procedure by Benjamini and Hochberg and global null testing procedure by Simes. We first prove that our proposed method can simultaneously control the overall FDR at the finest level (individual hypotheses treated separately) and the group FDR at coarser levels (when such groups are user-specified). We then generalize the p-filter procedure even further to handle multiple partitions of hypotheses, since that might be natural in many applications. For example, in neuroscience experiments, we may have a hypothesis for every (discretized) location in the brain, and at every (discretized) timepoint: does the stimulus correlate with activity in location x at time t after the stimulus was presented? In this setting, one might want to group hypotheses by location and by time. Importantly, our procedure can handle multiple partitions which are nonhierarchical (i.e. one partition may arrange p-values by voxel, and another partition arranges them by time point; neither one is nested inside the other). We prove that our procedure controls FDR simultaneously across these multiple lay- ers, under assumptions that are standard in the literature: we do not need the hypotheses to be independent, but require a nonnegative dependence condition known as PRDS.
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.41)