Crowdsourcing has gained immense popularity in machine learning applications for obtaining large amounts of labeled data. Crowdsourcing is cheap and fast, but suffers from the problem of low-quality data. To address this fundamental challenge in crowdsourcing, we propose a simple payment mechanism to incentivize workers to answer only the questions that they are sure of and skip the rest. Surprisingly, we show that under a mild and natural no-free-lunch requirement, this mechanism is the one and only incentive-compatible payment mechanism possible. We also show that among all possible incentive-compatible mechanisms (that may or may not satisfy no-free-lunch), our mechanism makes the smallest possible payment to spammers.
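As a rough illustration of how a skip-friendly payment rule can behave, the sketch below assumes a multiplicative rule evaluated on gold-standard questions: each correct answer scales the payment by 1/T, each skipped question leaves it unchanged, and a single wrong answer zeroes it. The abstract does not spell out the rule, so the function name `payment`, the confidence threshold `T`, and the budget `mu_max` are assumptions made for this example only.

```python
def payment(evaluations, T=0.5, mu_max=8.0):
    """Hypothetical multiplicative payment on gold-standard questions.

    evaluations: list of "correct", "wrong", or "skip".
    T: assumed confidence threshold in (0, 1).
    mu_max: assumed maximum payment, reached when every answer is correct.
    """
    if not 0.0 < T < 1.0:
        raise ValueError("T must lie strictly between 0 and 1")
    # Start from the amount paid when every question is skipped.
    pay = mu_max * T ** len(evaluations)
    for e in evaluations:
        if e == "wrong":
            return 0.0      # a single wrong answer forfeits everything
        if e == "correct":
            pay /= T        # each correct answer scales the payment by 1/T
        elif e != "skip":
            raise ValueError(f"unknown evaluation: {e!r}")
    return pay
```

Under this assumed rule, a worker whose confidence in an answer is p expects a multiplier of p/T from answering versus 1 from skipping, so answering is worthwhile only when p exceeds T.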
Current machine learning techniques enable robust association of biological signals with measured phenotypes, but these approaches are incapable of identifying causal relationships. Here, we develop an integrated "white-box" biochemical screening, network modeling, and machine learning approach for revealing causal mechanisms and apply this approach to understanding antibiotic efficacy. We counter-screen diverse metabolites against bactericidal antibiotics in Escherichia coli and simulate their corresponding metabolic states using a genome-scale metabolic network model. Regression of the measured screening data on model simulations reveals that purine biosynthesis participates in antibiotic lethality, which we validate experimentally. We show that antibiotic-induced adenine limitation increases ATP demand, which elevates central carbon metabolism activity and oxygen consumption, enhancing the killing effects of antibiotics.
The Gaussian mechanism is an essential building block used in a multitude of differentially private data analysis algorithms. In this paper we revisit the Gaussian mechanism and show that the original analysis has several important limitations. Our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime ($\varepsilon \to 0$) and that it cannot be extended to the low privacy regime ($\varepsilon \to \infty$). We address these limitations by developing an optimal Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative distribution function instead of a tail bound approximation. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive estimation techniques that exploit the fact that the distribution of the perturbation is known. Our experiments show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism, and that denoising dramatically improves the accuracy of the Gaussian mechanism in the high-dimensional regime.
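To make the contrast between tail-bound and CDF-based calibration concrete, here is a minimal sketch. It assumes the classical rule $\sigma = \Delta\sqrt{2\ln(1.25/\delta)}/\varepsilon$ and the exact condition $\Phi(\Delta/2\sigma - \varepsilon\sigma/\Delta) - e^{\varepsilon}\,\Phi(-\Delta/2\sigma - \varepsilon\sigma/\Delta) \le \delta$ for Gaussian noise with standard deviation $\sigma$ and sensitivity $\Delta$. Neither formula is stated in the abstract, and the function names and the bisection search are illustrative choices, not the authors' algorithm.

```python
import math

def _phi(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def classical_sigma(eps, delta, sensitivity=1.0):
    """Tail-bound calibration; the usual statement requires 0 < eps < 1."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / eps

def delta_for_sigma(sigma, eps, sensitivity=1.0):
    """Smallest delta achieved by N(0, sigma^2) noise under the assumed exact condition."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return _phi(a - b) - math.exp(eps) * _phi(-a - b)

def analytic_sigma(eps, delta, sensitivity=1.0, iters=200):
    """Bisection for the smallest sigma meeting the exact condition.

    Relies on delta_for_sigma decreasing in sigma (more noise, more privacy).
    """
    lo, hi = 0.0, sensitivity
    # Grow the bracket until the upper end satisfies the privacy condition.
    while delta_for_sigma(hi, eps, sensitivity) > delta:
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delta_for_sigma(mid, eps, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi  # hi always satisfies the condition

if __name__ == "__main__":
    eps, delta = 1.0, 1e-5
    print("classical:", classical_sigma(eps, delta))
    print("analytic: ", analytic_sigma(eps, delta))
```

Because `analytic_sigma` does not rely on the tail bound, it can also be called with `eps` greater than 1, where the classical formula is not justified.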
The family of Groves mechanisms, which includes the well-known VCG mechanism (also known as the Clarke mechanism), is a family of efficient and strategy-proof mechanisms. Unfortunately, the Groves mechanisms are generally not budget balanced. That is, under such mechanisms, payments may flow into or out of the system of the agents, resulting in deficits or reduced utilities for the agents. We consider the following problem: within the family of Groves mechanisms, we want to identify mechanisms that give the agents the highest utilities, under the constraint that these mechanisms must never incur deficits. We adopt a prior-free approach. We introduce two general measures for comparing mechanisms in prior-free settings. We say that a non-deficit Groves mechanism M individually dominates another non-deficit Groves mechanism M' if for every type profile, every agent's utility under M is no less than that under M', and this holds with strict inequality for at least one type profile and one agent. We say that a non-deficit Groves mechanism M collectively dominates another non-deficit Groves mechanism M' if for every type profile, the agents' total utility under M is no less than that under M', and this holds with strict inequality for at least one type profile. The above definitions induce two partial orders on non-deficit Groves mechanisms. We study the maximal elements corresponding to these two partial orders, which we call the individually undominated mechanisms and the collectively undominated mechanisms, respectively.
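The two dominance relations can be checked numerically on a finite grid of type profiles, as in the sketch below. It is an illustration only: it assumes a single-item auction, compares the Clarke mechanism with a Bailey-Cavallo style rebate (each agent receives 1/n of the second-highest value among the other agents), and tests only the listed profiles, so it cannot prove dominance over all types. The rebate rule and all function names are assumptions that do not come from the abstract. In this setting the rebates sum to at most the price collected from the winner, so the rebate mechanism does not run a deficit.

```python
from itertools import product

def clarke_utilities(values):
    """Single-item Clarke (VCG) auction: highest value wins, pays second-highest."""
    n = len(values)
    winner = max(range(n), key=lambda i: values[i])
    price = max(values[j] for j in range(n) if j != winner)
    return [values[i] - price if i == winner else 0.0 for i in range(n)]

def rebate_utilities(values):
    """Clarke utilities plus a rebate that ignores the agent's own report.

    Agent i receives 1/n of the second-highest value among the others
    (assumed Bailey-Cavallo style redistribution; needs n >= 3).
    """
    n = len(values)
    base = clarke_utilities(values)
    utilities = []
    for i in range(n):
        others = sorted((values[j] for j in range(n) if j != i), reverse=True)
        utilities.append(base[i] + others[1] / n)
    return utilities

def individually_dominates(m, m_prime, profiles, tol=1e-12):
    """True if no agent is ever worse off under m and someone is strictly better off."""
    strict = False
    for profile in profiles:
        for u, u_prime in zip(m(profile), m_prime(profile)):
            if u < u_prime - tol:
                return False
            if u > u_prime + tol:
                strict = True
    return strict

def collectively_dominates(m, m_prime, profiles, tol=1e-12):
    """Same comparison, applied to the agents' total utility per profile."""
    strict = False
    for profile in profiles:
        total, total_prime = sum(m(profile)), sum(m_prime(profile))
        if total < total_prime - tol:
            return False
        if total > total_prime + tol:
            strict = True
    return strict

# Three agents, integer valuations 0..3: a small illustrative grid.
profiles = list(product(range(4), repeat=3))
print(individually_dominates(rebate_utilities, clarke_utilities, profiles))
print(collectively_dominates(rebate_utilities, clarke_utilities, profiles))
```

Both functions take mechanisms as callables that map a type profile to a list of utilities, so other non-deficit Groves mechanisms can be compared in the same way.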
Wagering mechanisms are one-shot betting mechanisms that elicit agents' predictions of an event. For deterministic wagering mechanisms, an existing impossibility result has shown that some desirable theoretical properties are incompatible. In particular, Pareto optimality (no profitable side bet before allocation) cannot be achieved together with weak incentive compatibility, weak budget balance, and individual rationality. In this paper, we expand the design space of wagering mechanisms to allow randomization and ask whether there are randomized wagering mechanisms that can achieve all previously considered desirable properties, including Pareto optimality. We answer this question positively with two classes of randomized wagering mechanisms: i) a simple randomized lottery-type implementation of existing deterministic wagering mechanisms, and ii) a family of simple randomized wagering mechanisms, which we call surrogate wagering mechanisms, that are robust to noisy ground truth. This family of mechanisms builds on the idea of learning with noisy labels (Natarajan et al. 2013) as well as a recent extension of this idea to the information elicitation without verification setting (Liu and Chen 2018). We show that a broad family of randomized wagering mechanisms satisfy all desirable theoretical properties.
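The noisy-label idea that the surrogate mechanisms build on can be sketched for a single scoring rule. The example below is not the wagering mechanism itself. It assumes a binary outcome observed through a noisy channel with known flip rates `e0` and `e1`, and applies the standard debiasing construction from the learning-with-noisy-labels literature to a quadratic (Brier-type) score. The function names, the choice of score, and the flip-rate parameters are assumptions made for illustration.

```python
def brier_score(p, y):
    """Quadratic score for reporting probability p of outcome 1; higher is better."""
    return 1.0 - (y - p) ** 2

def surrogate_score(p, noisy_y, e0, e1, score=brier_score):
    """Debiased score computed from a noisy outcome (assumed construction).

    e0 = P(noisy = 1 | true = 0), e1 = P(noisy = 0 | true = 1), with e0 + e1 < 1.
    """
    if not e0 + e1 < 1.0:
        raise ValueError("flip rates must satisfy e0 + e1 < 1")
    denom = 1.0 - e0 - e1
    if noisy_y == 1:
        return ((1.0 - e0) * score(p, 1) - e1 * score(p, 0)) / denom
    return ((1.0 - e1) * score(p, 0) - e0 * score(p, 1)) / denom

def expected_surrogate(p, true_y, e0, e1):
    """Expectation of the surrogate score over the noise, given the true outcome."""
    flip = e0 if true_y == 0 else e1
    return ((1.0 - flip) * surrogate_score(p, true_y, e0, e1)
            + flip * surrogate_score(p, 1 - true_y, e0, e1))

if __name__ == "__main__":
    for y in (0, 1):
        print(y, brier_score(0.7, y), expected_surrogate(0.7, y, e0=0.2, e1=0.3))
```

In expectation over the noise, the surrogate score equals the score that would have been computed from the true outcome, which is the property that makes scoring against a noisy ground truth possible.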