AITopics | Wu, Zhiwei Steven

Collaborating Authors

Wu, Zhiwei Steven

Federated Learning as a Network Effects Game

Hu, Shengyuan, Ngo, Dung Daniel, Zheng, Shuran, Smith, Virginia, Wu, Zhiwei Steven

arXiv.org Artificial Intelligence, Feb-16-2023

Federated Learning (FL) aims to foster collaboration among a population of clients to improve the accuracy of machine learning without directly sharing local data. Although there is a rich literature on designing federated learning algorithms, most prior works implicitly assume that all clients are willing to participate in an FL scheme. In practice, clients may not benefit from joining FL, especially in light of potential costs related to issues such as privacy and computation. In this work, we study the clients' incentives in federated learning to help the service provider design better solutions and ensure clients make better decisions. We are the first to model clients' behaviors in FL as a network effects game, where each client's benefit depends on other clients who also join the network. Using this setup we analyze the dynamics of clients' participation and characterize the equilibrium, where no client has an incentive to alter their decision. Specifically, we show that dynamics in the population naturally converge to equilibrium without needing explicit interventions. Finally, we provide a cost-efficient payment scheme that incentivizes clients to reach a desired equilibrium when the initial network is empty.
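
The participation dynamics described in this abstract can be illustrated with a toy simulation. This is a minimal sketch and not the paper's model: the benefit function `benefit`, the per-client costs, and the synchronous best-response update in `best_response_dynamics` are all hypothetical choices made for illustration.

```python
from typing import Callable, List


def benefit(n_participants: int) -> float:
    """Hypothetical benefit of participating when n clients are in the network.

    Assumed to be increasing in n (a network effect); not taken from the paper.
    """
    return n_participants / (n_participants + 2.0)


def best_response_dynamics(
    costs: List[float],
    initial: List[bool],
    benefit_fn: Callable[[int], float] = benefit,
    max_rounds: int = 100,
) -> List[bool]:
    """Apply synchronous best responses until no client changes its decision.

    Returns the participation vector after convergence, or after max_rounds
    if the dynamics have not settled by then.
    """
    joined = list(initial)
    for _ in range(max_rounds):
        n = sum(joined)
        new_joined = []
        for i, cost in enumerate(costs):
            # A client counts itself among the participants when it joins.
            n_if_join = n if joined[i] else n + 1
            new_joined.append(benefit_fn(n_if_join) >= cost)
        if new_joined == joined:
            return joined  # no client wants to alter its decision
        joined = new_joined
    return joined


if __name__ == "__main__":
    costs = [0.5, 0.5, 0.5]
    print(best_response_dynamics(costs, [False] * 3))  # start from an empty network
    print(best_response_dynamics(costs, [True] * 3))   # start from a full network
```

In this toy instance the empty network and the full network are both stable, which hints at why a payment scheme might be needed to move a population away from an empty starting point.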

artificial intelligence, coalition, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2302.08533

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Confidence-Ranked Reconstruction of Census Microdata from Published Statistics

Dick, Travis, Dwork, Cynthia, Kearns, Michael, Liu, Terrance, Roth, Aaron, Vietri, Giuseppe, Wu, Zhiwei Steven

arXiv.org Artificial Intelligence, Feb-6-2023

A reconstruction attack on a private dataset $D$ takes as input some publicly accessible information about the dataset and produces a list of candidate elements of $D$. We introduce a new class of data reconstruction attacks based on randomized methods for non-convex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of $D$ from aggregate query statistics $Q(D)\in \mathbb{R}^m$, but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data, providing a signature that could be used for prioritizing reconstructed rows for further actions such as identity theft or hate crime. We also design a sequence of baselines for evaluating reconstruction attacks. Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset $D$ was sampled, demonstrating that they are exploiting information in the aggregate statistics $Q(D)$, and not simply the overall structure of the distribution. In other words, the queries $Q(D)$ are permitting reconstruction of elements of this dataset, not the distribution from which $D$ was drawn. These findings are established both on 2010 U.S. decennial Census data and queries and Census-derived American Community Survey datasets. Taken together, our methods and experiments illustrate the risks in releasing numerically precise aggregate statistics of a large dataset, and provide further motivation for the careful application of provably private techniques such as differential privacy.

artificial intelligence, baseline, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1073/pnas.2218605120

2211.03128

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.87)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Minimax Optimal Online Imitation Learning via Replay Estimation

Swamy, Gokul, Rajaraman, Nived, Peng, Matthew, Choudhury, Sanjiban, Bagnell, J. Andrew, Wu, Zhiwei Steven, Jiao, Jiantao, Ramchandran, Kannan

arXiv.org Artificial Intelligence, Jan-14-2023

Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy. However, in the finite sample regime, even if one has no optimization error, empirical variance can lead to a performance gap that scales with $H^2 / N$ for behavioral cloning and $H / \sqrt{N}$ for online moment matching, where $H$ is the horizon and $N$ is the size of the expert dataset. We introduce the technique of replay estimation to reduce this empirical variance: by repeatedly executing cached expert actions in a stochastic simulator, we compute a smoother expert visitation distribution estimate to match. In the presence of general function approximation, we prove a meta theorem reducing the performance gap of our approach to the parameter estimation error for offline classification (i.e. learning the expert policy). In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O}\left( \min\left( H^{3/2} / N,\ H / \sqrt{N} \right) \right)$ dependency, under significantly weaker assumptions compared to prior work. We implement multiple instantiations of our approach on several continuous control tasks and find that we are able to significantly improve policy performance across a variety of dataset sizes.
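
To make the three rates quoted in this abstract concrete, the following sketch evaluates them for given values of the horizon $H$ and expert dataset size $N$. It drops constants and logarithmic factors, so the numbers only indicate scaling; the function names are hypothetical and nothing here reproduces the paper's analysis.

```python
import math


def behavioral_cloning_rate(horizon: int, n_demos: int) -> float:
    """H^2 / N, the rate quoted for behavioral cloning."""
    return horizon ** 2 / n_demos


def moment_matching_rate(horizon: int, n_demos: int) -> float:
    """H / sqrt(N), the rate quoted for online moment matching."""
    return horizon / math.sqrt(n_demos)


def replay_estimation_rate(horizon: int, n_demos: int) -> float:
    """min(H^{3/2} / N, H / sqrt(N)), the rate quoted for replay estimation."""
    return min(horizon ** 1.5 / n_demos, horizon / math.sqrt(n_demos))


if __name__ == "__main__":
    for h, n in [(100, 100), (100, 10_000)]:
        print(
            h, n,
            behavioral_cloning_rate(h, n),
            moment_matching_rate(h, n),
            replay_estimation_rate(h, n),
        )
```

Since $H^{3/2}/N \le H/\sqrt{N}$ exactly when $N \ge H$, the first term inside the minimum is the active one once the expert dataset has at least as many samples as the horizon.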

artificial intelligence, machine learning, minimax optimal online imitation learning, (1 more...)

arXiv.org Artificial Intelligence

2205.15397

Genre:

Instructional Material > Online (0.60)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.89)
Information Technology > Artificial Intelligence > Robots (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)

Sequence Model Imitation Learning with Unobserved Contexts

Swamy, Gokul, Choudhury, Sanjiban, Bagnell, J. Andrew, Wu, Zhiwei Steven

arXiv.org Artificial Intelligence, Jan-14-2023

We consider imitation learning problems where the learner's ability to mimic the expert increases throughout the course of an episode as more information is revealed. One example of this is when the expert has access to privileged information: while the learner might not be able to accurately reproduce expert behavior early on in an episode, by considering the entire history of states and actions, they might be able to eventually identify the hidden context and act as the expert would. We prove that on-policy imitation learning algorithms (with or without access to a queryable expert) are better equipped to handle these sorts of asymptotically realizable problems than off-policy methods. This is because on-policy algorithms provably learn to recover from their initially suboptimal actions, while off-policy methods treat their suboptimal past actions as though they came from the expert. This often manifests as a latching behavior: a naive repetition of past actions. We conduct experiments in a toy bandit domain that show that there exist sharp phase transitions of whether off-policy approaches are able to match expert performance asymptotically, in contrast to the uniformly good performance of on-policy approaches. We demonstrate that on several continuous control tasks, on-policy approaches are able to use history to identify the context while off-policy approaches actually perform worse when given access to history.

expert system, machine learning, sequence model imitation learning, (2 more...)

arXiv.org Artificial Intelligence

2208.02225

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.53)

Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits

Deng, Wesley Hanwen, Nagireddy, Manish, Lee, Michelle Seng Ah, Singh, Jatinder, Wu, Zhiwei Steven, Holstein, Kenneth, Zhu, Haiyi

arXiv.org Artificial Intelligence, Jan-10-2023

Recent years have seen the development of many open-source ML fairness toolkits aimed at helping ML practitioners assess and address unfairness in their systems. However, there has been little research investigating how ML practitioners actually use these toolkits in practice. In this paper, we conducted the first in-depth empirical exploration of how industry practitioners (try to) work with existing fairness toolkits. In particular, we conducted think-aloud interviews to understand how participants learn about and use fairness toolkits, and explored the generality of our findings through an anonymous online survey. We identified several opportunities for fairness toolkits to better address practitioner needs and scaffold them in using toolkits effectively and responsibly. Based on these findings, we highlight implications for the design of future open-source fairness toolkits that can support practitioners in better contextualizing, communicating, and collaborating around ML fairness efforts.

artificial intelligence, machine learning, toolkit, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3531146.3533113

2205.06922

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Value Cards: An Educational Toolkit for Teaching Social Impacts of Machine Learning through Deliberation

Shen, Hong, Deng, Wesley Hanwen, Chattopadhyay, Aditi, Wu, Zhiwei Steven, Wang, Xu, Zhu, Haiyi

arXiv.org Artificial Intelligence, Jan-10-2023

Recently, there have been increasing calls for computer science curricula to complement existing technical training with topics related to Fairness, Accountability, Transparency, and Ethics. In this paper, we present Value Cards, an educational toolkit to inform students and practitioners of the social impacts of different machine learning models via deliberation. This paper presents an early use of our approach in a college-level computer science course. Through an in-class activity, we report empirical data for the initial effectiveness of our approach. Our results suggest that the use of the Value Cards toolkit can improve students' understanding of both the technical definitions and trade-offs of performance metrics and their ability to apply them in real-world contexts, help them recognize the significance of considering diverse social values in the development and deployment of algorithmic systems, and enable them to communicate, negotiate, and synthesize the perspectives of diverse stakeholders. Our study also demonstrates a number of caveats we need to consider when using the different variants of the Value Cards toolkit. Finally, we discuss the challenges as well as future applications of our approach.

artificial intelligence, machine learning, student, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3442188.3445971

2010.11411

Country:

Europe (0.93)
North America > United States > California (0.46)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Social Sector (1.00)
Education > Curriculum > Subject-Specific Education (0.70)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Private Synthetic Data for Multitask Learning and Marginal Queries

Vietri, Giuseppe, Archambeau, Cedric, Aydore, Sergul, Brown, William, Kearns, Michael, Roth, Aaron, Siva, Ankit, Tang, Shuai, Wu, Zhiwei Steven

arXiv.org Artificial Intelligence, Sep-15-2022

We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). A key innovation in our algorithm is the ability to directly handle numerical features, in contrast to a number of related prior approaches which require numerical features to be first converted into high cardinality categorical features via a binning strategy. Higher binning granularity is required for better accuracy, but this negatively impacts scalability. Eliminating the need for binning allows us to produce synthetic data preserving large numbers of statistical queries such as marginals on numerical features, and class conditional linear threshold queries. Preserving the latter means that the fraction of points of each class label above a particular half-space is roughly the same in both the real and synthetic data. This is the property that is needed to train a linear classifier in a multitask setting. Our algorithm also allows us to produce high quality synthetic data for mixed marginal queries that combine both categorical and numerical features. Our method consistently runs 2-5x faster than the best comparable techniques, and provides significant accuracy improvements in both marginal queries and linear prediction tasks for mixed-type datasets.
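
The class conditional linear threshold query mentioned in this abstract can be sketched as follows. This is an illustrative reading rather than the paper's definition: it assumes the query is the fraction of all rows that carry a given label and lie strictly above the half-space `weights . x > bias`, and the names `threshold_query` and `query_error` are hypothetical. No differential privacy mechanism is shown.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]  # (numerical features, class label)


def threshold_query(
    data: List[Row], weights: Sequence[float], bias: float, label: int
) -> float:
    """Fraction of rows with the given label whose features satisfy w . x > bias."""
    hits = 0
    for features, y in data:
        score = sum(w * x for w, x in zip(weights, features))
        if y == label and score > bias:
            hits += 1
    return hits / len(data)


def query_error(
    real: List[Row],
    synthetic: List[Row],
    weights: Sequence[float],
    bias: float,
    label: int,
) -> float:
    """Absolute gap between the query answers on real and synthetic data."""
    return abs(
        threshold_query(real, weights, bias, label)
        - threshold_query(synthetic, weights, bias, label)
    )


if __name__ == "__main__":
    real = [([1.0, 2.0], 1), ([0.0, 0.5], 1), ([3.0, 1.0], 0), ([2.0, 2.0], 0)]
    synthetic = [([1.5, 2.0], 1), ([0.0, 0.5], 1), ([3.0, 1.5], 0), ([0.5, 0.5], 0)]
    for label in (0, 1):
        print(label, query_error(real, synthetic, [1.0, 1.0], 2.5, label))
```

A synthetic dataset that keeps this error small for many choices of `weights`, `bias`, and `label` preserves the half-space statistics that the abstract ties to training a linear classifier.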

artificial intelligence, machine learning, multitask learning and marginal query, (1 more...)

arXiv.org Artificial Intelligence

2209.074

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Liu, Zuxin, Cen, Zhepeng, Isenbaev, Vladislav, Liu, Wei, Wu, Zhiwei Steven, Li, Bo, Zhao, Ding

arXiv.org Artificial Intelligence, Jan-27-2022

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying to safety-critical applications. Primal-dual, a prevalent constrained optimization framework, suffers from instability issues and lacks optimality guarantees. This paper overcomes these issues from a novel probabilistic inference perspective and proposes an Expectation-Maximization style approach to learn a safe policy. We show that the safe RL problem can be decomposed into 1) a convex optimization phase with a non-parametric variational distribution and 2) a supervised learning phase. We show the unique advantages of constrained variational policy optimization by proving its optimality and policy improvement stability. A wide range of experiments on continuous robotic tasks show that the proposed method achieves significantly better performance in terms of constraint satisfaction and sample efficiency than primal-dual baselines.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2201.11927

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Greedy Algorithm almost Dominates in Smoothed Contextual Bandits

Raghavan, Manish, Slivkins, Aleksandrs, Vaughan, Jennifer Wortman, Wu, Zhiwei Steven

arXiv.org Machine Learning, Dec-27-2021

Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that always "exploits" by choosing an action that currently looks optimal. We ask under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm in the linear contextual bandits model. We improve on prior results to show that the greedy algorithm almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold. The key technical finding is that data collected by the greedy algorithm suffices to simulate a run of any other algorithm.
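
The greedy algorithm discussed in this abstract can be sketched for a linear contextual bandit as follows. This is a minimal illustration under stated assumptions, not the paper's setting: contexts are two-dimensional and drawn i.i.d. from a standard Gaussian as a stand-in for "diverse" data, rewards are linear in a shared parameter with Gaussian noise, and the estimator is ridge regression. The names `ridge_estimate`, `greedy_choice`, and `run_greedy` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]


def ridge_estimate(history: List[Tuple[Vec, float]], lam: float = 1.0) -> Vec:
    """Closed-form 2-D ridge regression: (lam*I + sum x x^T)^{-1} sum r x."""
    a00 = a11 = lam
    a01 = 0.0
    b0 = b1 = 0.0
    for (x0, x1), reward in history:
        a00 += x0 * x0
        a01 += x0 * x1
        a11 += x1 * x1
        b0 += reward * x0
        b1 += reward * x1
    det = a00 * a11 - a01 * a01  # positive because lam > 0
    return ((a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det)


def greedy_choice(contexts: Sequence[Vec], theta_hat: Vec) -> int:
    """Pick the arm with the highest estimated reward; no exploration bonus."""
    scores = [x0 * theta_hat[0] + x1 * theta_hat[1] for x0, x1 in contexts]
    return max(range(len(contexts)), key=scores.__getitem__)


def run_greedy(theta_true: Vec, n_rounds: int, n_arms: int, seed: int = 0) -> float:
    """Run the greedy policy and return its cumulative expected regret."""
    rng = random.Random(seed)
    history: List[Tuple[Vec, float]] = []
    regret = 0.0
    for _ in range(n_rounds):
        # Assumed context distribution: i.i.d. standard Gaussian per arm.
        contexts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_arms)]
        arm = greedy_choice(contexts, ridge_estimate(history))
        expected = [c[0] * theta_true[0] + c[1] * theta_true[1] for c in contexts]
        reward = expected[arm] + rng.gauss(0, 0.1)  # noisy observed reward
        history.append((contexts[arm], reward))
        regret += max(expected) - expected[arm]
    return regret


if __name__ == "__main__":
    print(run_greedy(theta_true=(1.0, -0.5), n_rounds=500, n_arms=5))
```

The policy only ever "exploits": any information it gains about the parameter comes from the variation already present in the contexts, which is the situation the abstract asks about.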

algorithm, artificial intelligence, educational setting, (17 more...)

arXiv.org Machine Learning

2005.10624

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Education > Educational Setting (0.34)
Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Of Moments and Matching: Trade-offs and Treatments in Imitation Learning

Swamy, Gokul, Choudhury, Sanjiban, Wu, Zhiwei Steven, Bagnell, J. Andrew

arXiv.org Machine Learning, Mar-4-2021

We provide a unifying view of a large family of previous imitation learning algorithms through the lens of moment matching. At its core, our classification scheme is based on whether the learner attempts to match (1) reward or (2) action-value moments of the expert's behavior, with each option leading to differing algorithmic approaches. By considering adversarially chosen divergences between learner and expert behavior, we are able to derive bounds on policy performance that apply for all algorithms in each of these classes, the first to our knowledge. We also introduce the notion of recoverability, implicit in many previous analyses of imitation learning, which allows us to cleanly delineate how well each algorithmic family is able to mitigate compounding errors. We derive two novel algorithm templates, AdVIL and AdRIL, with strong guarantees, simple implementation, and competitive empirical performance.

algorithm, artificial intelligence, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2103.03236

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
