Faltings, Boi

Generating Differentially Private Datasets Using GANs

arXiv.org Machine Learning

In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then use the generator component to synthesise privacy-preserving artificial dataset. Our experiments show that under a reasonably small privacy budget we are able to generate data of high quality and successfully train machine learning models on this artificial data.

Incentives for Subjective Evaluations with Private Beliefs

AAAI Conferences

The modern web critically depends on aggregation of information from self-interested agents, for example opinion polls, product ratings, or crowdsourcing. We consider a setting where multiple objects (questions, products, tasks) are evaluated by a group of agents. We first construct a minimal peer prediction mechanism that elicits honest evaluations from a homogeneous population of agents with different private beliefs. Second, we show that it is impossible to strictly elicit honest evaluations from a heterogeneous group of agents with different private beliefs. Nevertheless, we provide a modified version of a divergence-based Bayesian Truth Serum that incentivizes agents to report consistently, making truthful reporting a weak equilibrium of the mechanism.

Acquiring Commonsense Knowledge for Sentiment Analysis through Human Computation

AAAI Conferences

Many Artificial Intelligence tasks need large amounts of commonsense knowledge. Because obtaining this knowledge through machine learning would require a huge amount of data, a better alternative is to elicit it from people through human computation. We consider the sentiment classification task, where knowledge about the contexts that impact word polarities is crucial, but hard to acquire from data. We describe a novel task design that allows us to crowdsource this knowledge through Amazon Mechanical Turk with high quality. We show that the commonsense knowledge acquired in this way dramatically improves the performance of established sentiment classification methods.

A Region-Based Model for Estimating Urban Air Pollution

AAAI Conferences

Air pollution has a direct impact to human health, and data-driven air quality models are useful for evaluating population exposure to air pollutants. In this paper, we propose a novel region-based Gaussian process model for estimating urban air pollution dispersion, and applied it to a large dataset of ultrafine particle (UFP) measurements collected from a network of sensors located on several trams in the city of Zurich. We show that compared to existing grid-based models, the region-based model produces better predictions across aggregates of all time scales. The new model is appropriate for many useful user applications such as exposure assessment and anomaly detection.

Incentives for Truthful Information Elicitation of Continuous Signals

AAAI Conferences

We consider settings where a collective intelligence is formed by aggregating information contributed from many independent agents, such as product reviews, community sensing, or opinion polls. We propose a novel mechanism that elicits both private signals and beliefs. The mechanism extends the previous versions of the Bayesian Truth Serum (the original BTS, the RBTS, and the multi-valued BTS), by allowing small populations and non-binary private signals, while not requiring additional assumptions on the belief updating process. For priors that are sufficiently smooth, such as Gaussians, the mechanism allows signals to be continuous.

Protecting Privacy through Distributed Computation in Multi-agent Decision Making

arXiv.org Artificial Intelligence

As large-scale theft of data from corporate servers is becoming increasingly common, it becomes interesting to examine alternatives to the paradigm of centralizing sensitive data into large databases. Instead, one could use cryptography and distributed computation so that sensitive data can be supplied and processed in encrypted form, and only the final result is made known. In this paper, we examine how such a paradigm can be used to implement constraint satisfaction, a technique that can solve a broad class of AI problems such as resource allocation, planning, scheduling, and diagnosis. Most previous work on privacy in constraint satisfaction only attempted to protect specific types of information, in particular the feasibility of particular combinations of decisions. We formalize and extend these restricted notions of privacy by introducing four types of private information, including the feasibility of decisions and the final decisions made, but also the identities of the participants and the topology of the problem. We present distributed algorithms that allow computing solutions to constraint satisfaction problems while maintaining these four types of privacy. We formally prove the privacy properties of these algorithms, and show experiments that compare their respective performance on benchmark problems.

Recommendation Using Textual Opinions

AAAI Conferences

Many web sites collect reviews of products and services and use them provide rankings of their quality. However, such rankings are not personalized. We investigate how the information in the reviews written by a particular user can be used to personalize the ranking she is shown. We propose a new technique, topic profile collaborative filtering, where we build user profiles from users' review texts and use these profiles to filter other review texts with the eyes of this user. We verify on data from an actual review site that review texts and topic profiles indeed correlate with ratings, and show that topic profile collaborative filtering provides both a better mean average error when predicting ratings and a better approximation of user preference orders.

Sensing the Air We Breathe — The OpenSense Zurich Dataset

AAAI Conferences

Monitoring and managing urban air pollution is a significant challenge for the sustainability of our environment. We quickly survey the air pollution modeling problem,introduce a new dataset of mobile air quality measurements in Zurich, and discuss the challenges of making sense of these data.

DUCT: An Upper Confidence Bound Approach to Distributed Constraint Optimization Problems

AAAI Conferences

The Upper Confidence Bounds (UCB) algorithm is a well-known near-optimal strategy for the stochastic multi-armed bandit problem. Its extensions to trees, such as the Upper Confidence Tree (UCT) algorithm, have resulted in good solutions to the problem of Go. This paper introduces DUCT, a distributed algorithm inspired by UCT, for solving Distributed Constraint Optimization Problems (DCOP). Bounds on the solution quality are provided, and experiments show that, compared to existing DCOP approaches, DUCT is able to solve very large problems much more efficiently, or to find significantly higher quality solutions.

Symmetric Subgame Perfect Equilibria in Resource Allocation

AAAI Conferences

We analyze symmetric protocols to rationally coordinate on an asymmetric, efficient allocation in an infinitely repeated N-agent, C-resource allocation problems. (Bhaskar 2000) proposed one way to achieve this in 2-agent, 1-resource allocation games: Agents start by symmetrically randomizing their actions, and as soon as they each choose different actions, they start to follow a potentially asymmetric "convention" that prescribes their actions from then on. We extend the concept of convention to the general case of infinitely repeated resource allocation games with N agents and C resources. We show that for any convention, there exists a symmetric subgame perfect equilibrium which implements it. We present two conventions: bourgeois, where agents stick to the first allocation; and market, where agents pay for the use of resources, and observe a global coordination signal which allows them to alternate between different allocations. We define price of anonymity of a convention as the ratio between the maximum social payoff of any (asymmetric) strategy profile and the expected social payoff of the convention. We show that while the price of anonymity of the bourgeois convention is infinite, the market convention decreases this price by reducing the conflict between the agents.