Goto

Collaborating Authors

 Roth, Aaron


Gait Design of a Novel Arboreal Concertina Locomotion for Snake-like Robots

arXiv.org Artificial Intelligence

Abstract--In this paper, we propose a novel strategy for a snake robot to move straight up a cylindrical surface. Prior works on pole-climbing for a snake robot mainly utilized a rolling helix gait, and although proven to be efficient, it does not reassemble movements made by a natural snake. We take inspiration from nature and seek to imitate the Arboreal Concertina Locomotion (ACL) from real-life serpents. In order to represent the 3D curves that make up the key motion patterns of ACL, we establish a set of parametric equations that identify periodic functions, which produce a sequence of backbone curves. We then build up the gait equation using the curvature integration method, and finally, we propose a simple motion estimation strategy using virtual chassis and non-slip model assumptions.


Oracle Efficient Online Multicalibration and Omniprediction

arXiv.org Artificial Intelligence

A recent line of work has shown a surprising connection between multicalibration, a multi-group fairness notion, and omniprediction, a learning paradigm that provides simultaneous loss minimization guarantees for a large family of loss functions. Prior work studies omniprediction in the batch setting. We initiate the study of omniprediction in the online adversarial setting. Although there exist algorithms for obtaining notions of multicalibration in the online adversarial setting, unlike batch algorithms, they work only for small finite classes of benchmark functions $F$, because they require enumerating every function $f \in F$ at every round. In contrast, omniprediction is most interesting for learning theoretic hypothesis classes $F$, which are generally continuously large. We develop a new online multicalibration algorithm that is well defined for infinite benchmark classes $F$, and is oracle efficient (i.e. for any class $F$, the algorithm has the form of an efficient reduction to a no-regret learning algorithm for $F$). The result is the first efficient online omnipredictor -- an oracle efficient prediction algorithm that can be used to simultaneously obtain no regret guarantees to all Lipschitz convex loss functions. For the class $F$ of linear functions, we show how to make our algorithm efficient in the worst case. Also, we show upper and lower bounds on the extent to which our rates can be improved: our oracle efficient algorithm actually promises a stronger guarantee called swap-omniprediction, and we prove a lower bound showing that obtaining $O(\sqrt{T})$ bounds for swap-omniprediction is impossible in the online setting. On the other hand, we give a (non-oracle efficient) algorithm which can obtain the optimal $O(\sqrt{T})$ omniprediction bounds without going through multicalibration, giving an information theoretic separation between these two solution concepts.


Scalable Membership Inference Attacks via Quantile Regression

arXiv.org Artificial Intelligence

The basic goal of privacy-preserving machine learning is to find models that are predictive on some underlying data distribution, without being disclosive of the particular data points on which they were trained. The simplest kind of attack that can be launched on a trained model--falsifying privacy guarantees--is a membership inference attack. A membership inference attack, informally, is a statistical test that is able to reliably determine whether a particular data point was included in the training set used to train the model or not. Almost all membership inference attacks are based on the observation that models tend to overfit their training sets in different ways. In particular, they tend to systematically predict higher confidence in the true labels of data points from their training set, compared to points drawn from the same distribution not in their training set. The confidence that a model places on the true label of a data-point is thus a natural test statistic to build a membership-inference hypothesis test around. A variety of recent methods [Shokri et al., 2017, Long et al., 2020, Sablayrolles et al., 2019, Song and Mittal, 2021, Carlini et al., 2022] are based around this idea, and aim to estimate the distribution of the test statistic (the confidence assigned to the true label of a datapoint) over the distribution of datapoints that were not used in training (and sometimes, Martin and Shuai are lead authors; all other authors are listed in alphabetical order.


Balanced Filtering via Non-Disclosive Proxies

arXiv.org Artificial Intelligence

We study the problem of non-disclosively collecting a sample of data that is balanced with respect to sensitive groups when group membership is unavailable or prohibited from use at collection time. Specifically, our collection mechanism does not reveal significantly more about group membership of any individual sample than can be ascertained from base rates alone. To do this, we adopt a fairness pipeline perspective, in which a learner can use a small set of labeled data to train a proxy function that can later be used for this filtering task. We then associate the range of the proxy function with sampling probabilities; given a new candidate, we classify it using our proxy function, and then select it for our sample with probability proportional to the sampling probability corresponding to its proxy classification. Importantly, we require that the proxy classification itself not reveal significant information about the sensitive group membership of any individual sample (i.e., it should be sufficiently non-disclosive). We show that under modest algorithmic assumptions, we find such a proxy in a sample- and oracle-efficient manner. Finally, we experimentally evaluate our algorithm and analyze generalization properties.


Improved Differentially Private Regression via Gradient Boosting

arXiv.org Artificial Intelligence

We revisit the problem of differentially private squared error linear regression. We observe that existing state-of-the-art methods are sensitive to the choice of hyperparameters -- including the ``clipping threshold'' that cannot be set optimally in a data-independent way. We give a new algorithm for private linear regression based on gradient boosting. We show that our method consistently improves over the previous state of the art when the clipping threshold is taken to be fixed without knowledge of the data, rather than optimized in a non-private way -- and that even when we optimize the hyperparameters of competitor algorithms non-privately, our algorithm is no worse and often better. In addition to a comprehensive set of experiments, we give theoretical insights to explain this behavior.


Reconciling Individual Probability Forecasts

arXiv.org Artificial Intelligence

Probabilistic modelling in machine learning and statistics predicts "individual probabilities" as a matter of course. In weather forecasting, we speak of the probability of rain tomorrow; in life insurance underwriting we speak of the probability that Alice will die in the next 12 months; in recidivism prediction we speak of the probability that an inmate Bob will commit a violent crime within 18 months of being released on parole; in predictive medicine we speak of the probability that Carol will develop breast cancer before the age of 50 -- and so on. But these are not repeated events: we have no way of directly measuring an "individual probability" -- and indeed, even the semantics of an individual probability are unclear and have been the subject of deep interrogation within the philosophy of science and statistics [Hájek, 2007, Dawid, 2017] and theoretical computer science [Dwork et al., 2021]. Within the philosophy of science, puzzles related to individual probability have been closely identified with "the reference class problem" [Hájek, 2007]. This is a close cousin of a concern that has recently arisen in the context of fairness in machine learning called the "predictive multiplicity problem" (a focal subset of "model multiplicity problems") [Marx et al., 2020, Black et al., 2022] which Breiman [2001] earlier called the "Rashomon Effect". At the core of both of these problems is the fact that from a data sample that is much smaller than the data universe (i.e. the set of all possible observations), we will have observed at most one individual with a particular set of characteristics, and at most one outcome for the event that an"individual probability" speaks to: It will either rain tomorrow or it will not; Alice will either die within the next year or she will not; etc. We do not have the luxury of observing a large number of repetitions and taking averages. Dawid[2017] lays out two broad classes of perspectives on individual probabilities: the group to individual perspective and the individual to group perspective.


The Scope of Multicalibration: Characterizing Multicalibration via Property Elicitation

arXiv.org Artificial Intelligence

We make a connection between multicalibration and property elicitation and show that (under mild technical conditions) it is possible to produce a multicalibrated predictor for a continuous scalar distributional property $\Gamma$ if and only if $\Gamma$ is elicitable. On the negative side, we show that for non-elicitable continuous properties there exist simple data distributions on which even the true distributional predictor is not calibrated. On the positive side, for elicitable $\Gamma$, we give simple canonical algorithms for the batch and the online adversarial setting, that learn a $\Gamma$-multicalibrated predictor. This generalizes past work on multicalibrated means and quantiles, and in fact strengthens existing online quantile multicalibration results. To further counter-weigh our negative result, we show that if a property $\Gamma^1$ is not elicitable by itself, but is elicitable conditionally on another elicitable property $\Gamma^0$, then there is a canonical algorithm that jointly multicalibrates $\Gamma^1$ and $\Gamma^0$; this generalizes past work on mean-moment multicalibration. Finally, as applications of our theory, we provide novel algorithmic and impossibility results for fair (multicalibrated) risk assessment.


Confidence-Ranked Reconstruction of Census Microdata from Published Statistics

arXiv.org Artificial Intelligence

A reconstruction attack on a private dataset $D$ takes as input some publicly accessible information about the dataset and produces a list of candidate elements of $D$. We introduce a new class of data reconstruction attacks based on randomized methods for non-convex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of $D$ from aggregate query statistics $Q(D)\in \mathbb{R}^m$, but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data, providing a signature that could be used for prioritizing reconstructed rows for further actions such as identify theft or hate crime. We also design a sequence of baselines for evaluating reconstruction attacks. Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset $D$ was sampled, demonstrating that they are exploiting information in the aggregate statistics $Q(D)$, and not simply the overall structure of the distribution. In other words, the queries $Q(D)$ are permitting reconstruction of elements of this dataset, not the distribution from which $D$ was drawn. These findings are established both on 2010 U.S. decennial Census data and queries and Census-derived American Community Survey datasets. Taken together, our methods and experiments illustrate the risks in releasing numerically precise aggregate statistics of a large dataset, and provide further motivation for the careful application of provably private techniques such as differential privacy.


Multicalibration as Boosting for Regression

arXiv.org Artificial Intelligence

We revisit the problem of boosting for regression, and develop a new agnostic regression boosting algorithm via a connection to multicalibration. In doing so, we shed additional light on multicalibration, a recent learning objective that has emerged from the algorithmic fairness literature [Hébert-Johnson et al., 2018]. In particular, we characterize multicalibration in terms of a "swap-regret" like condition, and use it to answer the question "what property must a collection of functions H have so that multicalibration with respect to H implies Bayes optimality?", giving a complete answer to problem asked by Burhanpurkar et al. [2021]. Using our swap-regret characterization, we derive an especially simple algorithm for learning a multicalibrated predictor for a class of functions H by reduction to a standard squared-error regression algorithm for H. The same algorithm can also be analyzed as a boosting algorithm for squared error regression that makes calls to a weak learner for squared error regression on subsets of the original data distribution without the need to relabel examples (in contrast to Gradient Boosting as well as existing multicalibration algorithms). This lets us specify a weak learning condition that is sufficient for convergence to the Bayes optimal predictor (even if the Bayes optimal predictor does not have zero error), avoiding the kinds of realizability assumptions that are implicit in analyses of boosting algorithms that converge to zero error. We conclude that ensuring multicalibration with respect to H corresponds to boosting for squared error regression in which H forms the set of weak learners. Finally we define a weak learning condition for H relative to a constrained class of functions C (rather than with respect to the Bayes optimal predictor). We show that multicalibration with respect to H implies multicalibration with respect to C if H satisfies the weak learning condition with respect to C, which in turn implies accuracy at least that of the best function in C. Multicalibration Consider a distribution D P Z defined over a domain Z " X ˆ R of feature vectors x P X paired with real valued labels y.


Private Synthetic Data for Multitask Learning and Marginal Queries

arXiv.org Artificial Intelligence

We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). A key innovation in our algorithm is the ability to directly handle numerical features, in contrast to a number of related prior approaches which require numerical features to be first converted into {high cardinality} categorical features via {a binning strategy}. Higher binning granularity is required for better accuracy, but this negatively impacts scalability. Eliminating the need for binning allows us to produce synthetic data preserving large numbers of statistical queries such as marginals on numerical features, and class conditional linear threshold queries. Preserving the latter means that the fraction of points of each class label above a particular half-space is roughly the same in both the real and synthetic data. This is the property that is needed to train a linear classifier in a multitask setting. Our algorithm also allows us to produce high quality synthetic data for mixed marginal queries, that combine both categorical and numerical features. Our method consistently runs 2-5x faster than the best comparable techniques, and provides significant accuracy improvements in both marginal queries and linear prediction tasks for mixed-type datasets.