AITopics | exponentiated gradient

A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks

Boubekraoui, Maryam, d'Aloisio, Giordano, Di Marco, Antinisca

arXiv.org Machine LearningMar-24-2026

The widespread use of AI and ML models in sensitive areas raises significant concerns about fairness. While the research community has introduced various methods for bias mitigation in binary classification tasks, the issue remains under-explored in multi-class classification settings. To address this limitation, in this paper, we first formulate the problem of fair learning in multi-class classification as a multi-objective problem between effectiveness (i.e., prediction correctness) and multiple linear fairness constraints. Next, we propose a Generalised Exponentiated Gradient (GEG) algorithm to solve this task. GEG is an in-processing algorithm that enhances fairness in binary and multi-class classification settings under multiple fairness definitions. We conduct an extensive empirical evaluation of GEG against six baselines across seven multi-class and three binary datasets, using four widely adopted effectiveness metrics and three fairness definitions. GEG overcomes existing baselines, with fairness improvements up to 92% and a decrease in accuracy up to 14%.

artificial intelligence, constraint, machine learning, (16 more...)

arXiv.org Machine Learning

2603.21393

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Abruzzo > L'Aquila Province > L'Aquila (0.04)
South America > Peru (0.04)
(6 more...)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Consumer Health (0.67)
Health & Medicine > Therapeutic Area (0.46)
Education > Educational Setting (0.46)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Mitigating Source Bias for Fairer Weak Supervision

Neural Information Processing SystemsFeb-13-2026, 09:41:33 GMT

Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness--in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy on weak supervision baselines by as much as 32% while reducing demographic parity gap by 82.5%.

exponentiated gradient, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.92)
(4 more...)

Add feedback

Mitigating Source Bias for Fairer Weak Supervision

Neural Information Processing SystemsOct-8-2025, 20:36:25 GMT

Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness--in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy on weak supervision baselines by as much as 32% while reducing demographic parity gap by 82.5%.

exponentiated gradient, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.92)
(4 more...)

Add feedback

Equitable Length of Stay Prediction for Patients with Learning Disabilities and Multiple Long-term Conditions Using Machine Learning

Abakasanga, Emeka, Kousovista, Rania, Cosma, Georgina, Akbari, Ashley, Zaccardi, Francesco, Kaur, Navjot, Fitt, Danielle, Jun, Gyuchan Thomas, Kiani, Reza, Gangadharan, Satheesh

arXiv.org Artificial IntelligenceNov-3-2024

People with learning disabilities have a higher mortality rate and premature deaths compared to the general public, as reported in published research in the UK and other countries. This study analyses hospitalisations of 9,618 patients identified with learning disabilities and long-term conditions for the population of Wales using electronic health record (EHR) data sources from the SAIL Databank. We describe the demographic characteristics, prevalence of long-term conditions, medication history, hospital visits, and lifestyle history for our study cohort, and apply machine learning models to predict the length of hospital stays for this cohort. The random forest (RF) model achieved an Area Under the Curve (AUC) of 0.759 (males) and 0.756 (females), a false negative rate of 0.224 (males) and 0.229 (females), and a balanced accuracy of 0.690 (males) and 0.689 (females). After examining model performance across ethnic groups, two bias mitigation algorithms (threshold optimization and the reductions algorithm using an exponentiated gradient) were applied to minimise performance discrepancies. The threshold optimizer algorithm outperformed the reductions algorithm, achieving lower ranges in false positive rate and balanced accuracy for the male cohort across the ethnic groups. This study demonstrates the potential of applying machine learning models with effective bias mitigation approaches on EHR data sources to enable equitable prediction of hospital stays by addressing data imbalances across groups.

admission, arxiv template, ethnic group, (13 more...)

arXiv.org Artificial Intelligence

2411.08048

Country:

Europe > United Kingdom > Wales > Swansea (0.14)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.05)
Europe > United Kingdom > England > Leicestershire > Leicester (0.04)
North America > United States (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Mitigating Source Bias for Fairer Weak Supervision

Shin, Changho, Cromp, Sonia, Adila, Dyah, Sala, Frederic

arXiv.org Machine LearningNov-29-2023

Weak supervision enables efficient development of training sets by reducing the need for ground truth labels. However, the techniques that make weak supervision attractive -- such as integrating any source of signal to estimate unknown labels -- also entail the danger that the produced pseudolabels are highly biased. Surprisingly, given everyday use and the potential for increased bias, weak supervision has not been studied from the point of view of fairness. We begin such a study, starting with the observation that even when a fair model can be built from a dataset with access to ground-truth labels, the corresponding dataset labeled via weak supervision can be arbitrarily unfair. To address this, we propose and empirically validate a model for source unfairness in weak supervision, then introduce a simple counterfactual fairness-based technique that can mitigate these biases. Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness -- in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy on weak supervision baselines by as much as 32\% while reducing demographic parity gap by 82.5\%. A simple extension of our method aimed at maximizing performance produces state-of-the-art performance in five out of ten datasets in the WRENCH benchmark.

exponentiated gradient, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2303.17713

Country:

Asia > China (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(3 more...)

Add feedback

Competitive On-line Linear Regression

Vovk, Volodya

Neural Information Processing SystemsDec-31-1998

We apply a general algorithm for merging prediction strategies (the Aggregating Algorithm) to the problem of linear regression with the square loss; our main assumption is that the response variable is bounded. It turns out that for this particular problem the Aggregating Algorithm resembles, but is slightly different from, the wellknown ridge estimation procedure. From general results about the Aggregating Algorithm we deduce a guaranteed bound on the difference between our algorithm's performance and the best, in some sense, linear regression function's performance. We show that the AA attains the optimal constant in our bound, whereas the constant attained by the ridge regression procedure in general can be 4 times worse. 1 INTRODUCTION The usual approach to regression problems is to assume that the data are generated by some stochastic mechanism and make some, typically very restrictive, assumptions about that stochastic mechanism. In recent years, however, a different approach to this kind of problems was developed (see, e.g., DeSantis et al. [2], Littlestone and Warmuth [7]): in our context, that approach sets the goal of finding an online algorithm that performs not much worse than the best regression function found off-line; in other words, it replaces the usual statistical analyses by the competitive analysis of online algorithms. DeSantis et al. [2] performed a competitive analysis of the Bayesian merging scheme for the log-loss prediction game; later Littlestone and Warmuth [7] and Vovk [10] introduced an online algorithm (called the Weighted Majority Algorithm by the Competitive Online Linear Regression 365 former authors) for the simple binary prediction game. These two algorithms (the Bayesian merging scheme and the Weighted Majority Algorithm) are special cases of the Aggregating Algorithm (AA) proposed in [9, 11]. The AA is a member of a wide family of algorithms called "multiplicative weight" or "exponential weight" algorithms. Closer to the topic of this paper, Cesa-Bianchi et al. [1) performed a competitive analysis, under the square loss, of the standard Gradient Descent Algorithm and Kivinen and Warmuth [6] complemented it by a competitive analysis of a modification of the Gradient Descent, which they call the Exponentiated Gradient Algorithm.

algorithm, learner, ridge regression procedure, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Competitive On-line Linear Regression

Vovk, Volodya

Neural Information Processing SystemsDec-31-1998

We apply a general algorithm for merging prediction strategies (the Aggregating Algorithm) to the problem of linear regression with the square loss; our main assumption is that the response variable is bounded. It turns out that for this particular problem the Aggregating Algorithm resembles, but is slightly different from, the wellknown ridge estimation procedure. From general results about the Aggregating Algorithm we deduce a guaranteed bound on the difference between our algorithm's performance and the best, in some sense, linear regression function's performance. We show that the AA attains the optimal constant in our bound, whereas the constant attained by the ridge regression procedure in general can be 4 times worse. 1 INTRODUCTION The usual approach to regression problems is to assume that the data are generated by some stochastic mechanism and make some, typically very restrictive, assumptions about that stochastic mechanism. In recent years, however, a different approach to this kind of problems was developed (see, e.g., DeSantis et al. [2], Littlestone and Warmuth [7]): in our context, that approach sets the goal of finding an online algorithm that performs not much worse than the best regression function found off-line; in other words, it replaces the usual statistical analyses by the competitive analysis of online algorithms. DeSantis et al. [2] performed a competitive analysis of the Bayesian merging scheme for the log-loss prediction game; later Littlestone and Warmuth [7] and Vovk [10] introduced an online algorithm (called the Weighted Majority Algorithm by the Competitive Online Linear Regression 365 former authors) for the simple binary prediction game. These two algorithms (the Bayesian merging scheme and the Weighted Majority Algorithm) are special cases of the Aggregating Algorithm (AA) proposed in [9, 11]. The AA is a member of a wide family of algorithms called "multiplicative weight" or "exponential weight" algorithms. Closer to the topic of this paper, Cesa-Bianchi et al. [1) performed a competitive analysis, under the square loss, of the standard Gradient Descent Algorithm and Kivinen and Warmuth [6] complemented it by a competitive analysis of a modification of the Gradient Descent, which they call the Exponentiated Gradient Algorithm.

algorithm, learner, ridge regression procedure, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Competitive On-line Linear Regression

Vovk, Volodya

Neural Information Processing SystemsDec-31-1998

We apply a general algorithm for merging prediction strategies (the Aggregating Algorithm) to the problem of linear regression with the square loss; our main assumption is that the response variable is bounded. It turns out that for this particular problem the Aggregating Algorithmresembles, but is slightly different from, the wellknown ridgeestimation procedure. From general results about the Aggregating Algorithm we deduce a guaranteed bound on the difference betweenour algorithm's performance and the best, in some sense, linear regression function's performance. We show that the AA attains the optimal constant in our bound, whereas the constant attainedby the ridge regression procedure in general can be 4 times worse. 1 INTRODUCTION The usual approach to regression problems is to assume that the data are generated bysome stochastic mechanism and make some, typically very restrictive, assumptions about that stochastic mechanism. In recent years, however, a different approach to this kind of problems was developed (see, e.g., DeSantis et al. [2], Littlestone andWarmuth [7]): in our context, that approach sets the goal of finding an online algorithm that performs not much worse than the best regression function foundoff-line; in other words, it replaces the usual statistical analyses by the competitive analysis of online algorithms. DeSantis et al. [2] performed a competitive analysis of the Bayesian merging scheme for the log-loss prediction game; later Littlestone and Warmuth [7] and Vovk [10] introduced an online algorithm (called the Weighted Majority Algorithm by the Competitive Online Linear Regression 365 former authors) for the simple binary prediction game. These two algorithms (the Bayesian merging scheme and the Weighted Majority Algorithm) are special cases of the Aggregating Algorithm (AA) proposed in [9, 11]. The AA is a member of a wide family of algorithms called "multiplicative weight" or "exponential weight" algorithms. Closerto the topic of this paper, Cesa-Bianchi et al. [1) performed a competitive analysis, under the square loss, of the standard Gradient Descent Algorithm and Kivinen and Warmuth [6] complemented it by a competitive analysis of a modification ofthe Gradient Descent, which they call the Exponentiated Gradient Algorithm.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback