AITopics | Wang, Serena

Collaborating Authors

Wang, Serena

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Differential Privacy on Trust Graphs

Ghazi, Badih, Kumar, Ravi, Manurangsi, Pasin, Wang, Serena

arXiv.org Artificial IntelligenceOct-15-2024

We study differential privacy (DP) in a multi-party setting where each party only trusts a (known) subset of the other parties with its data. Specifically, given a trust graph where vertices correspond to parties and neighbors are mutually trusting, we give a DP algorithm for aggregation with a much better privacy-utility trade-off than in the well-studied local model of DP (where each party trusts no other party). We further study a robust variant where each party trusts all but an unknown subset of at most $t$ of its neighbors (where $t$ is a given parameter), and give an algorithm for this setting. We complement our algorithms with lower bounds, and discuss implications of our work to other tasks in private learning and analytics.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.12045

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.68)
(2 more...)

Add feedback

Score Design for Multi-Criteria Incentivization

Kabra, Anmol, Karzand, Mina, Lechner, Tosca, Srebro, Nathan, Wang, Serena

arXiv.org Artificial IntelligenceOct-8-2024

We present a framework for designing scores to summarize performance metrics. Our design has two multi-criteria objectives: (1) improving on scores should improve all performance metrics, and (2) achieving pareto-optimal scores should achieve pareto-optimal metrics. We formulate our design to minimize the dimensionality of scores while satisfying the objectives. We give algorithms to design scores, which are provably minimal under mild assumptions on the structure of performance metrics. This framework draws motivation from real-world practices in hospital rating systems, where misaligned scores and performance metrics lead to unintended consequences.

artificial intelligence, machine learning, restriction, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.4230/LIPIcs.FORC.2024.8

2410.0629

Country: North America > United States (1.00)

Genre: Research Report (0.81)

Industry: Health & Medicine > Health Care Providers & Services > Reimbursement (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Operationalizing Counterfactual Metrics: Incentives, Ranking, and Information Asymmetry

Wang, Serena, Bates, Stephen, Aronow, P. M., Jordan, Michael I.

arXiv.org Artificial IntelligenceNov-29-2023

From the social sciences to machine learning, it has been well documented that metrics to be optimized are not always aligned with social welfare. In healthcare, Dranove et al. (2003) showed that publishing surgery mortality metrics actually harmed the welfare of sicker patients by increasing provider selection behavior. We analyze the incentive misalignments that arise from such average treated outcome metrics, and show that the incentives driving treatment decisions would align with maximizing total patient welfare if the metrics (i) accounted for counterfactual untreated outcomes and (ii) considered total welfare instead of averaging over treated patients. Operationalizing this, we show how counterfactual metrics can be modified to behave reasonably in patient-facing ranking systems. Extending to realistic settings when providers observe more about patients than the regulatory agencies do, we bound the decay in performance by the degree of information asymmetry between principal and agent. In doing so, our model connects principal-agent information asymmetry with unobserved heterogeneity in causal inference.

agent, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2305.14595

Country: North America > United States > California (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Add feedback

Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence

Jerfel, Ghassen, Wang, Serena, Fannjiang, Clara, Heller, Katherine A., Ma, Yian, Jordan, Michael I.

arXiv.org Machine LearningJun-30-2021

Variational Inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is optimization over a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tail of the posterior leading to miscalibration and potential degeneracy. Importance sampling (IS), on the other hand, is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. The quality of IS crucially depends on the choice of the proposal distribution. Ideally, the proposal distribution has heavier tails than the target, which is rarely achievable by minimizing the RKL. We thus propose a novel combination of optimization and sampling techniques for approximate Bayesian inference by constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence. This approach guarantees asymptotic consistency and a fast convergence towards both the optimal IS estimator and the optimal variational approximation. We empirically demonstrate on real data that our method is competitive with variational boosting and MCMC.

artificial intelligence, forward kullback-leibler divergence, variational refinement, (1 more...)

arXiv.org Machine Learning

2106.1598

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (0.93)

Add feedback

Multi-Source Causal Inference Using Control Variates

Guo, Wenshuo, Wang, Serena, Ding, Peng, Wang, Yixin, Jordan, Michael I.

arXiv.org Machine LearningJun-5-2021

While many areas of machine learning have benefited from the increasing availability of large and varied datasets, the benefit to causal inference has been limited given the strong assumptions needed to ensure identifiability of causal effects; these are often not satisfied in real-world datasets. For example, many large observational datasets (e.g., case-control studies in epidemiology, click-through data in recommender systems) suffer from selection bias on the outcome, which makes the average treatment effect (ATE) unidentifiable. We propose a general algorithm to estimate causal effects from \emph{multiple} data sources, where the ATE may be identifiable only in some datasets but not others. The key idea is to construct control variates using the datasets in which the ATE is not identifiable. We show theoretically that this reduces the variance of the ATE estimate. We apply this framework to inference from observational data under outcome selection bias, assuming access to an auxiliary small dataset from which we can obtain a consistent estimate of the ATE. We construct a control variate by taking the difference of the odds ratio estimates from the two datasets. Across simulations and two case studies with real data, we show that this control variate can significantly reduce the variance of the ATE estimate.

artificial intelligence, health & medicine, machine learning, (17 more...)

arXiv.org Machine Learning

2103.16689

Country: North America > United States > New York > New York County > New York City (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Regularization Strategies for Quantile Regression

Narayan, Taman, Wang, Serena, Canini, Kevin, Gupta, Maya

arXiv.org Machine LearningFeb-9-2021

We investigate different methods for regularizing quantile regression when predicting either a subset of quantiles or the full inverse CDF. We show that minimizing an expected pinball loss over a continuous distribution of quantiles is a good regularizer even when only predicting a specific quantile. For predicting multiple quantiles, we propose achieving the classic goal of non-crossing quantiles by using deep lattice networks that treat the quantile as a monotonic input feature, and we discuss why monotonicity on other features is an apt regularizer for quantile regression. We show that lattice models enable regularizing the predicted distribution to a location-scale family. Lastly, we propose applying rate constraints to improve the calibration of the quantile predictions on specific subsets of interest and improve fairness metrics. We demonstrate our contributions on simulations, benchmark datasets, and real quantile regression problems.

neural network, quantile, survey article, (18 more...)

arXiv.org Machine Learning

2102.05135

Country: North America > United States > California (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Pairwise Fairness for Ranking and Regression

Narasimhan, Harikrishna, Cotter, Andrew, Gupta, Maya, Wang, Serena

arXiv.org Machine LearningJun-12-2019

We present pairwise metrics of fairness for ranking and regression models that form analogues of statistical fairness notions such as equal opportunity or equal accuracy, as well as statistical parity. Our pairwise formulation supports both discrete protected groups, and continuous protected attributes. We show that the resulting training problems can be efficiently and effectively solved using constrained optimization and robust optimization techniques based on two player game algorithms developed for fair classification. Experiments illustrate the broad applicability and trade-offs of these methods.

constraint, educational setting, optimization problem, (18 more...)

arXiv.org Machine Learning

1906.0533

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

Optimization with Non-Differentiable Constraints with Applications to Fairness, Recall, Churn, and Other Goals

Cotter, Andrew, Jiang, Heinrich, Wang, Serena, Narayan, Taman, Gupta, Maya, You, Seungil, Sridharan, Karthik

arXiv.org Machine LearningSep-11-2018

We show that many machine learning goals, such as improved fairness metrics, can be expressed as constraints on the model's predictions, which we call rate constraints. We study the problem of training non-convex models subject to these rate constraints (or any non-convex and non-differentiable constraints). In the non-convex setting, the standard approach of Lagrange multipliers may fail. Furthermore, if the constraints are non-differentiable, then one cannot optimize the Lagrangian with gradient-based methods. To solve these issues, we introduce the proxy-Lagrangian formulation. This new formulation leads to an algorithm that produces a stochastic classifier by playing a two-player non-zero-sum game solving for what we call a semi-coarse correlated equilibrium, which in turn corresponds to an approximately optimal and feasible solution to the constrained optimization problem. We then give a procedure which shrinks the randomized solution down to one that is a mixture of at most $m+1$ deterministic solutions, given $m$ constraints. This culminates in algorithms that can solve non-convex constrained optimization problems with possibly non-differentiable and non-convex constraints with theoretical guarantees. We provide extensive experimental results enforcing a wide range of policy goals including different fairness metrics, and other goals on accuracy, coverage, recall, and churn.

artificial intelligence, constraint, optimization problem, (18 more...)

arXiv.org Machine Learning

1809.04198

Country: North America > United States (0.92)

Genre: Research Report > New Finding (0.92)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Add feedback

Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints

Cotter, Andrew, Gupta, Maya, Jiang, Heinrich, Srebro, Nathan, Sridharan, Karthik, Wang, Serena, Woodworth, Blake, You, Seungil

arXiv.org Machine LearningJun-29-2018

Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals. We study the generalization performance for such constrained optimization problems, in terms of how well the constraints are satisfied at evaluation time, given that they are satisfied at training time. To improve generalization performance, we frame the problem as a two-player game where one player optimizes the model parameters on a training dataset, and the other player enforces the constraints on an independent validation dataset. We build on recent work in two-player constrained optimization to show that if one uses this two-dataset approach, then constraint generalization can be significantly improved. As we illustrate experimentally, this approach works not only in theory, but also in practice.

artificial intelligence, constraint, optimization problem, (20 more...)

arXiv.org Machine Learning

1807.00028

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Proxy Fairness

Gupta, Maya, Cotter, Andrew, Fard, Mahdi Milani, Wang, Serena

arXiv.org Machine LearningJun-28-2018

We consider the problem of improving fairness when one lacks access to a dataset labeled with protected groups, making it difficult to take advantage of strategies that can improve fairness but require protected group labels, either at training or runtime. To address this, we investigate improving fairness metrics for proxy groups, and test whether doing so results in improved fairness for the true sensitive groups. Results on benchmark and real-world datasets demonstrate that such a proxy fairness strategy can work well in practice. However, we caution that the effectiveness likely depends on the choice of fairness metric, as well as how aligned the proxy groups are with the true protected groups in terms of the constrained model parameters.

artificial intelligence, machine learning, proxy group, (17 more...)

arXiv.org Machine Learning

1806.11212

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback