
Collaborating Authors: Roth, Aaron


Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis

arXiv.org Machine Learning

We design a general framework for answering adaptive statistical queries that focuses on providing explicit confidence intervals along with point estimates. Prior work in this area has either focused on providing tight confidence intervals for specific analyses, or providing general worst-case bounds for point estimates. Unfortunately, as we observe, these worst-case bounds are loose in many settings --- often not even beating simple baselines like sample splitting. Our main contribution is to design a framework for providing valid, instance-specific confidence intervals for point estimates that can be generated by heuristics. When paired with good heuristics, this method gives guarantees that are orders of magnitude better than the best worst-case bounds. We provide a Python library implementing our method.
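
To give a flavor of the "pair a heuristic guess with a validity check" idea, here is a minimal, hypothetical sketch: a heuristic supplies a point estimate and confidence width, and a holdout-based check either accepts that interval or falls back to a wider one. The function names, the specific check, and the fallback rule are all illustrative assumptions, not the paper's actual framework or Python library (which handles adaptive reuse of the holdout far more carefully).

```python
import numpy as np

rng = np.random.default_rng(0)

def answer_query(query, train, holdout, heuristic, fallback_width):
    """Hypothetical 'guess and check' sketch: accept a heuristic's
    confidence interval only if a holdout estimate falls inside it."""
    estimate, width = heuristic(query, train)      # heuristic point estimate + CI width
    holdout_answer = query(holdout)                # independent check value
    if abs(holdout_answer - estimate) <= width:    # check passes: release the guess
        return estimate, width
    # check fails: fall back to the holdout answer with a conservative width
    return holdout_answer, fallback_width

# Toy usage: a mean query answered by a naive heuristic that reuses the training split.
data = rng.normal(size=2000)
train, holdout = data[:1000], data[1000:]
naive = lambda q, d: (q(d), 0.1)
print(answer_query(lambda d: d.mean(), train, holdout, naive, fallback_width=0.2))
```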


Gaussian Differential Privacy

arXiv.org Machine Learning

Differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy in the past decade. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analyzing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation, which we term `$f$-differential privacy' ($f$-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, $f$-DP preserves the hypothesis testing interpretation. In addition, $f$-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for original DP to $f$-DP and, as an application, obtain a simple subsampling theorem for $f$-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the $f$-DP class that is referred to as `Gaussian differential privacy' (GDP), defined based on testing two shifted Gaussians. GDP is focal among the $f$-DP class because of a central limit theorem we prove. More precisely, the privacy guarantees of \emph{any} hypothesis-testing-based definition of privacy (including original DP) converge to GDP in the limit under composition. The CLT also yields a computationally inexpensive tool for analyzing the exact composition of private algorithms. Taken together, this collection of attractive properties renders $f$-DP a mathematically coherent, analytically tractable, and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved privacy analysis of noisy stochastic gradient descent.
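
To make the GDP definition concrete, the sketch below evaluates the trade-off function of testing $N(0,1)$ against $N(\mu,1)$, $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$, which is how $\mu$-GDP is defined, along with its clean composition rule (composing $\mu_i$-GDP mechanisms yields $\sqrt{\sum_i \mu_i^2}$-GDP). The function names are illustrative assumptions, not part of any tooling accompanying the paper.

```python
import math
from scipy.stats import norm

def gdp_tradeoff(alpha, mu):
    """Type II error of the optimal test between N(0,1) and N(mu,1)
    at type I error alpha: G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu)."""
    return norm.cdf(norm.ppf(1 - alpha) - mu)

# mu-GDP composes cleanly: composing mu_1-, ..., mu_n-GDP mechanisms
# yields sqrt(mu_1^2 + ... + mu_n^2)-GDP.
mus = [0.5, 0.5, 1.0]
mu_total = math.sqrt(sum(m * m for m in mus))
print(gdp_tradeoff(0.05, mu_total))   # type II error of the best test at alpha = 0.05
```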


Eliciting and Enforcing Subjective Individual Fairness

arXiv.org Machine Learning

We revisit the notion of individual fairness first proposed by Dwork et al. [2012], which asks that "similar individuals should be treated similarly". A primary difficulty with this definition is that it assumes a completely specified fairness metric for the task at hand. In contrast, we consider a framework for fairness elicitation, in which fairness is indirectly specified only via a sample of pairs of individuals who should be treated (approximately) equally on the task. We make no assumption that these pairs are consistent with any metric. We provide a provably convergent oracle-efficient algorithm for minimizing error subject to the fairness constraints, and prove generalization theorems for both accuracy and fairness. Since the constrained pairs could be elicited either from a panel of judges, or from particular individuals, our framework provides a means for algorithmically enforcing subjective notions of fairness. We report on preliminary findings of a behavioral study of subjective fairness using human-subject fairness constraints elicited on the COMPAS criminal recidivism dataset.
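
As a rough sketch of what "constraints from elicited pairs" can look like, the code below fits a linear classifier with a penalty that pushes each elicited pair of individuals toward approximately equal scores. This penalty method is only a stand-in chosen for brevity; the paper's algorithm is an oracle-efficient constrained optimization with convergence and generalization guarantees, and all names here are assumptions.

```python
import numpy as np

def fit_with_pairs(X, y, pairs, lam=1.0, lr=0.1, steps=2000):
    """Hypothetical penalty-method stand-in for fairness elicitation:
    minimize logistic loss plus a penalty pushing each elicited pair
    of individuals toward (approximately) equal scores w.x_i ~ w.x_j."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        scores = X @ w
        # gradient of the average logistic loss
        grad = X.T @ (1.0 / (1.0 + np.exp(-scores)) - y) / n
        # gradient of the pairwise "treat similarly" penalty
        for i, j in pairs:
            diff = X[i] - X[j]
            grad += 2.0 * lam * (diff @ w) * diff / max(len(pairs), 1)
        w -= lr * grad
    return w

# Toy usage with random data and a few elicited pairs.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
print(np.round(fit_with_pairs(X, y, pairs=[(0, 1), (2, 3)]), 2))
```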


Average Individual Fairness: Algorithms, Generalization and Experiments

arXiv.org Machine Learning

We propose a new family of fairness definitions for classification problems that combine some of the best properties of both statistical and individual notions of fairness. We posit not only a distribution over individuals, but also a distribution over (or collection of) classification tasks. We then ask that standard statistics (such as error or false positive/negative rates) be (approximately) equalized across individuals, where the rate is defined as an expectation over the classification tasks. Because we are no longer averaging over coarse groups (such as race or gender), this is a semantically meaningful individual-level constraint. Given a sample of individuals and classification problems, we design an oracle-efficient algorithm (i.e. one that is given access to any standard, fairness-free learning heuristic) for the fair empirical risk minimization task. We also show that given sufficiently many samples, the ERM solution generalizes in two directions: both to new individuals, and to new classification tasks, drawn from their corresponding distributions. Finally we implement our algorithm and empirically verify its effectiveness.
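
The quantity being constrained is easy to state in code: each individual's error rate, averaged over the classification tasks, should be close to the overall error rate. The small sketch below computes exactly that check; it illustrates the constraint only, not the paper's oracle-efficient ERM algorithm, and the names are assumptions.

```python
import numpy as np

def individual_rates(predictions, labels):
    """errors[i, t] = 1 if individual i is misclassified on task t.
    The average-individual-fairness constraint asks that each row mean
    (an individual's error rate averaged over tasks) be close to the
    overall error rate."""
    errors = (predictions != labels).astype(float)   # shape (n_individuals, n_tasks)
    return errors.mean(axis=1), errors.mean()

def max_violation(per_individual, overall):
    return np.max(np.abs(per_individual - overall))

# Toy usage: 100 individuals, 20 tasks, random predictions vs. labels.
rng = np.random.default_rng(2)
preds = rng.integers(0, 2, size=(100, 20))
labels = rng.integers(0, 2, size=(100, 20))
per_ind, overall = individual_rates(preds, labels)
print("overall error:", overall, "max individual deviation:", max_violation(per_ind, overall))
```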


The Role of Interactivity in Local Differential Privacy

arXiv.org Machine Learning

We study the power of interactivity in local differential privacy. First, we focus on the difference between fully interactive and sequentially interactive protocols. Sequentially interactive protocols may query users adaptively in sequence, but they cannot return to previously queried users. The vast majority of existing lower bounds for local differential privacy apply only to sequentially interactive protocols, and before this paper it was not known whether fully interactive protocols were more powerful. We resolve this question. First, we classify locally private protocols by their compositionality, the multiplicative factor $k \geq 1$ by which the sum of a protocol's single-round privacy parameters exceeds its overall privacy guarantee. We then show how to efficiently transform any fully interactive $k$-compositional protocol into an equivalent sequentially interactive protocol with an $O(k)$ blowup in sample complexity. Next, we show that our reduction is tight by exhibiting a family of problems such that for any $k$, there is a fully interactive $k$-compositional protocol which solves the problem, while no sequentially interactive protocol can solve the problem without at least an $\tilde \Omega(k)$ factor more examples. We then turn our attention to hypothesis testing problems. We show that for a large class of compound hypothesis testing problems --- which include all simple hypothesis testing problems as a special case --- a simple noninteractive test is optimal among the class of all (possibly fully interactive) tests.
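
Since compositionality is the central quantity in this reduction, a tiny numerical illustration may help; the function name is an assumption introduced only for this sketch.

```python
def compositionality(round_epsilons, overall_epsilon):
    """k >= 1: the factor by which the sum of a protocol's single-round
    privacy parameters exceeds its overall privacy guarantee, as defined
    in the abstract above."""
    return sum(round_epsilons) / overall_epsilon

# A protocol that splits an overall budget of 1.0 evenly across 10 rounds is
# 1-compositional; a fully interactive protocol whose 10 rounds each use
# epsilon = 1.0 while its overall guarantee is still 1.0 is 10-compositional,
# and the reduction described above pays an O(k) blowup in sample complexity.
print(compositionality([0.1] * 10, 1.0))   # -> 1.0
print(compositionality([1.0] * 10, 1.0))   # -> 10.0
```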


Equal Opportunity in Online Classification with Partial Feedback

arXiv.org Machine Learning

We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates --- introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle-efficient algorithm that achieves the upper bound.
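
The sketch below is only meant to make the partial-feedback observation structure concrete: the true label is revealed only on positive decisions, so the learner can track the fraction of its positive decisions that turn out to be false positives, per group. The policy and bookkeeping names are illustrative assumptions, not the paper's oracle-efficient algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)

def run_round(x, group, true_label, policy, stats):
    """One round of the partial-feedback loop: the label is revealed only
    when the decision is positive, and we track per-group counts of
    positive decisions and of those that were false positives."""
    decision = policy(x, group)
    if decision == 1:                       # label observed only on positive decisions
        stats[group]["positives"] += 1
        if true_label == 0:
            stats[group]["false_positives"] += 1
    return decision

stats = {g: {"positives": 0, "false_positives": 0} for g in (0, 1)}
threshold_policy = lambda x, g: int(x > 0.0)        # same threshold for both groups
for _ in range(10000):
    g = rng.integers(0, 2)
    x = rng.normal()
    y = int(x + 0.5 * rng.normal() > 0)
    run_round(x, g, y, threshold_policy, stats)
for g in (0, 1):
    s = stats[g]
    # fraction of this group's positive decisions that were false positives
    print(g, s["false_positives"] / max(s["positives"], 1))
```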


Online Learning with an Unknown Fairness Metric

Neural Information Processing Systems

We consider the problem of online learning in the linear contextual bandits setting, but in which there are also strong individual fairness constraints governed by an unknown similarity metric. These constraints demand that we select similar actions or individuals with approximately equal probability [DHPRZ12], which may be at odds with optimizing reward, thus modeling settings where profit and social policy are in tension. We assume we learn about an unknown Mahalanobis similarity metric from only weak feedback that identifies fairness violations, but does not quantify their extent. This is intended to represent the interventions of a regulator who "knows unfairness when he sees it" but nevertheless cannot enunciate a quantitative fairness metric over individuals. Our main result is an algorithm in the adversarial context setting that has a number of fairness violations that depends only logarithmically on $T$, while obtaining an optimal $O(\sqrt{T})$ regret bound to the best fair policy.
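
To illustrate the kind of weak feedback involved, the sketch below flags a violation whenever two individuals' selection probabilities differ by more than their Mahalanobis distance under a hidden metric, without revealing by how much. The names and the exact form of the flag are assumptions made for this illustration, not the paper's protocol.

```python
import numpy as np

def mahalanobis_distance(x1, x2, A):
    """d(x1, x2) = sqrt((x1 - x2)^T A (x1 - x2)) for a PSD matrix A."""
    diff = x1 - x2
    return float(np.sqrt(diff @ A @ diff))

def weak_feedback(p1, p2, x1, x2, A_true):
    """Regulator-style weak feedback: flag a violation when two individuals'
    selection probabilities differ by more than their similarity distance
    under the metric A_true, which is unknown to the learner."""
    return abs(p1 - p2) > mahalanobis_distance(x1, x2, A_true)

A_true = np.diag([1.0, 0.25])                    # hidden similarity metric
x1, x2 = np.array([0.1, 0.0]), np.array([0.15, 0.0])
print(weak_feedback(0.9, 0.2, x1, x2, A_true))   # True: flagged, but extent not revealed
```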


A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem

Neural Information Processing Systems

Bandit learning is characterized by the tension between long-term exploration and short-term exploitation. However, as has recently been noted, in settings in which the choices of the learning algorithm correspond to important decisions about individual people (such as criminal recidivism prediction, lending, and sequential drug trials), exploration corresponds to explicitly sacrificing the well-being of one individual for the potential future benefit of others. In such settings, one might like to run a ``greedy'' algorithm, which always makes the optimal decision for the individuals at hand --- but doing this can result in a catastrophic failure to learn. In this paper, we consider the linear contextual bandit problem and revisit the performance of the greedy algorithm. We give a smoothed analysis, showing that even when contexts may be chosen by an adversary, small perturbations of the adversary's choices suffice for the algorithm to achieve ``no regret'', perhaps (depending on the specifics of the setting) with a constant amount of initial training data. This suggests that in slightly perturbed environments, exploration and exploitation need not be in conflict in the linear setting.
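
For concreteness, here is a minimal sketch of the greedy rule being analyzed: fit a per-arm ridge estimate from that arm's own history and always pull the arm with the highest predicted reward, with no exploration bonus of any kind. The toy data uses Gaussian contexts as a stand-in for the smoothed (perturbed) adversary; everything else about the setup is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def greedy_linear_bandit(contexts, rewards, n_arms, lam=1.0):
    """Purely greedy linear contextual bandit: ridge-estimate each arm's
    parameter from its own history and always pick the best predicted arm."""
    d = contexts.shape[1]
    A = [lam * np.eye(d) for _ in range(n_arms)]   # per-arm ridge Gram matrices
    b = [np.zeros(d) for _ in range(n_arms)]
    total = 0.0
    for t, x in enumerate(contexts):
        theta_hat = [np.linalg.solve(A[a], b[a]) for a in range(n_arms)]
        arm = int(np.argmax([x @ th for th in theta_hat]))   # greedy choice
        r = rewards[t, arm]
        A[arm] += np.outer(x, x)
        b[arm] += r * x
        total += r
    return total

# Toy usage: perturbed contexts (playing the role of smoothing) and two linear arms.
T, d = 2000, 3
theta = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
contexts = rng.normal(size=(T, d))
rewards = np.stack([contexts @ th + 0.1 * rng.normal(size=T) for th in theta], axis=1)
print(greedy_linear_bandit(contexts, rewards, n_arms=2))
```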


Local Differential Privacy for Evolving Data

Neural Information Processing Systems

There are now several large scale deployments of differential privacy used to collect statistical information about users. However, these deployments periodically recollect the data and recompute the statistics using algorithms designed for a single use. As a result, these systems do not provide meaningful privacy guarantees over long time scales. Moreover, existing techniques to mitigate this effect do not apply in the ``local model'' of differential privacy that these systems use. In this paper, we introduce a new technique for local differential privacy that makes it possible to maintain up-to-date statistics over time, with privacy guarantees that degrade only in the number of changes in the underlying distribution rather than the number of collection periods. We use our technique for tracking a changing statistic in the setting where users are partitioned into an unknown collection of groups, and at every time period each user draws a single bit from a common (but changing) group-specific distribution. We also provide an application to frequency and heavy-hitter estimation.
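
For readers unfamiliar with the local model, the sketch below shows its standard single-bit primitive, randomized response, together with the usual debiasing step. This is only the basic building block such deployments rely on, not the paper's evolving-data protocol, which additionally controls how privacy degrades as the underlying distribution changes; the function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def randomized_response(bit, epsilon):
    """Basic epsilon-locally-private report of a single bit:
    tell the truth with probability e^eps / (1 + e^eps)."""
    p_truth = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the true mean of the users' bits from noisy reports."""
    p = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return (np.mean(reports) - (1.0 - p)) / (2.0 * p - 1.0)

bits = rng.integers(0, 2, size=50000)
reports = [randomized_response(b, epsilon=1.0) for b in bits]
print(np.mean(bits), round(debiased_mean(reports, 1.0), 3))
```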

