The first prominent algorithms in this line of research isADAGRAD [7,22], which uses a per-dimension learning rate based on squared pastgradients.ADAGRADachievedsignificant performance gainsincomparison toSGDwhenthe gradientsaresparse.
Counterfactual Fairness (CF), which posits that an ML model's outcome on any individual should remain unchanged if they had belonged to a different demographic