Faltings, Boi

Deep Bayesian Trust : A Dominant Strategy and Fair Reward Mechanism for Crowdsourcing

arXiv.org Artificial Intelligence

A common mechanism to assess trust in crowdworkers is to have them answer gold tasks. However, assigning gold tasks to all workers reduces the efficiency of the platform. We propose a mechanism that exploits transitivity so that a worker can be certified as trusted by other trusted workers who solve common tasks. Thus, trust can be derived from a smaller number of gold tasks assignment through multiple layers of peer relationship among the workers, a model we call deep trust. We use the derived trust to incentivize workers for high quality work and show that the resulting mechanism is dominant strategy incentive compatible. We also show that the mechanism satisfies a notion of fairness in that the trust assessment (and thus the reward) of a worker in the limit is independent of the quality of other workers.

Learning in Ad-hoc Anti-coordination Scenarios

AAAI Conferences

We present a brief overview of learning dynamics for anti-coordination in ad-hoc scenarios. Specifically, we consider multi-armed bandit algorithms, reinforcement learning, and symmetric strategies for the repeated resource allocation game. In a multi-agent system with dynamic population where every agent is able to learn, the anti-coordination problem exhibits unique challenges. Thus, it is essential for the success of a joint plan that the agents can quickly and robustly learn their optimal behavior. In this work we will focus on convergence rate, efficiency, and fairness in the final outcome.

Generating Differentially Private Datasets Using GANs

arXiv.org Machine Learning

In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then use the generator component to synthesise privacy-preserving artificial dataset. Our experiments show that under a reasonably small privacy budget we are able to generate data of high quality and successfully train machine learning models on this artificial data.

Partial Truthfulness in Minimal Peer Prediction Mechanisms With Limited Knowledge

AAAI Conferences

We study minimal single-task peer prediction mechanisms that have limited knowledge about agents' beliefs. Without knowing what agents' beliefs are or eliciting additional information, it is not possible to design a truthful mechanism in a Bayesian-Nash sense. We go beyond truthfulness and explore equilibrium strategy profiles that are only partially truthful. Using the results from the multi-armed bandit literature, we give a characterization of how inefficient these equilibria are comparing to truthful reporting. We measure the inefficiency of such strategies by counting the number of dishonest reports that any minimal knowledge-bounded mechanism must have. We show that the order of this number is θ(log n), where n is the number of agents, and we provide a peer prediction mechanism that achieves this bound in expectation.

Information Gathering With Peers: Submodular Optimization With Peer-Prediction Constraints

AAAI Conferences

We study a problem of optimal information gathering from multiple data providers that need to be incentivized to provide accurate information. This problem arises in many real world applications that rely on crowdsourced data sets, but where the process of obtaining data is costly. A notable example of such a scenario is crowd sensing. To this end, we formulate the problem of optimal information gathering as maximization of a submodular function under a budget constraint, where the budget represents the total expected payment to data providers. Contrary to the existing approaches, we base our payments on incentives for accuracy and truthfulness, in particular, peer prediction methods that score each of the selected data providers against its best peer, while ensuring that the minimum expected payment is above a given threshold. We first show that the problem at hand is hard to approximate within a constant factor that is not dependent on the properties of the payment function. However, for given topological and analytical properties of the instance, we construct two greedy algorithms, respectively called PPCGreedy and PPCGreedyIter, and establish theoretical bounds on their performance w.r.t. the optimal solution. Finally, we evaluate our methods using a realistic crowd sensing testbed.

Non-Discriminatory Machine Learning Through Convex Fairness Criteria

AAAI Conferences

Cross-domain data reconstruction methods derive a shared transformation across source and target domains. These methods usually make a specific assumption on noise, which exhibits limited ability when the target data are contaminated by different kinds of complex noise in practice. To enhance the robustness of domain adaptation under severe noise conditions, this paper proposes a novel reconstruction based algorithm in an information-theoretic setting. Specifically, benefiting from the theoretical property of correntropy, the proposed algorithm is distinguished with: detecting the contaminated target samples without making any specific assumption on noise; greatly suppressing the negative influence of noise on cross-domain transformation. Moreover, a relative entropy based regularization of the transformation is incorporated to avoid trivial solutions with the reaped theoretic advantages, i.e., non-negativity and scale-invariance. For optimization, a half-quadratic technique is developed to minimize the non-convex information-theoretic objectives with explicitly guaranteed convergence. Experiments on two real-world domain adaptation tasks demonstrate the superiority of our method.

Incentives for Subjective Evaluations with Private Beliefs

AAAI Conferences

The modern web critically depends on aggregation of information from self-interested agents, for example opinion polls, product ratings, or crowdsourcing. We consider a setting where multiple objects (questions, products, tasks) are evaluated by a group of agents. We first construct a minimal peer prediction mechanism that elicits honest evaluations from a homogeneous population of agents with different private beliefs. Second, we show that it is impossible to strictly elicit honest evaluations from a heterogeneous group of agents with different private beliefs. Nevertheless, we provide a modified version of a divergence-based Bayesian Truth Serum that incentivizes agents to report consistently, making truthful reporting a weak equilibrium of the mechanism.

Acquiring Commonsense Knowledge for Sentiment Analysis through Human Computation

AAAI Conferences

Many Artificial Intelligence tasks need large amounts of commonsense knowledge. Because obtaining this knowledge through machine learning would require a huge amount of data, a better alternative is to elicit it from people through human computation. We consider the sentiment classification task, where knowledge about the contexts that impact word polarities is crucial, but hard to acquire from data. We describe a novel task design that allows us to crowdsource this knowledge through Amazon Mechanical Turk with high quality. We show that the commonsense knowledge acquired in this way dramatically improves the performance of established sentiment classification methods.

Incentives for Truthful Information Elicitation of Continuous Signals

AAAI Conferences

We consider settings where a collective intelligence is formed by aggregating information contributed from many independent agents, such as product reviews, community sensing, or opinion polls. We propose a novel mechanism that elicits both private signals and beliefs. The mechanism extends the previous versions of the Bayesian Truth Serum (the original BTS, the RBTS, and the multi-valued BTS), by allowing small populations and non-binary private signals, while not requiring additional assumptions on the belief updating process. For priors that are sufficiently smooth, such as Gaussians, the mechanism allows signals to be continuous.

A Region-Based Model for Estimating Urban Air Pollution

AAAI Conferences

Air pollution has a direct impact to human health, and data-driven air quality models are useful for evaluating population exposure to air pollutants. In this paper, we propose a novel region-based Gaussian process model for estimating urban air pollution dispersion, and applied it to a large dataset of ultrafine particle (UFP) measurements collected from a network of sensors located on several trams in the city of Zurich. We show that compared to existing grid-based models, the region-based model produces better predictions across aggregates of all time scales. The new model is appropriate for many useful user applications such as exposure assessment and anomaly detection.