Singla, Adish
Actively Learning Hemimetrics with Applications to Eliciting User Preferences
Singla, Adish, Tschiatschek, Sebastian, Krause, Andreas
Motivated by an application of eliciting users' preferences, we investigate the problem of learning hemimetrics, i.e., pairwise distances among a set of $n$ items that satisfy triangle inequalities and non-negativity constraints. In our application, the (asymmetric) distances quantify private costs a user incurs when substituting one item by another. We aim to learn these distances (costs) by asking the users whether they are willing to switch from one item to another for a given incentive offer. Without exploiting structural constraints of the hemimetric polytope, learning the distances between each pair of items requires $\Theta(n^2)$ queries. We propose an active learning algorithm that substantially reduces this sample complexity by exploiting the structural constraints on the version space of hemimetrics. Our proposed algorithm achieves provably-optimal sample complexity for various instances of the task. For example, when the items are embedded into $K$ tight clusters, the sample complexity of our algorithm reduces to $O(n K)$. Extensive experiments on a restaurant recommendation data set support the conclusions of our theoretical analysis.
Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization
Singla, Adish (ETH Zurich) | Tschiatschek, Sebastian (ETH Zurich) | Krause, Andreas (ETH Zurich)
We address the problem of maximizing an unknown submodular function that can only be accessed via noisy evaluations. Our work is motivated by the task of summarizing content, e.g., image collections, by leveraging users' feedback in form of clicks or ratings. For summarization tasks with the goal of maximizing coverage and diversity, submodular set functions are a natural choice. When the underlying submodular function is unknown, users' feedback can provide noisy evaluations of the function that we seek to maximize. We provide a generic algorithm โ ExpGreedy โ for maximizing an unknown submodular function under cardinality constraints. This algorithm makes use of a novel exploration moduleโ TopX โ that proposes good elements based on adaptively sampling noisy function evaluations. TopX is able to accommodate different kinds of observation models such as value queries and pairwise comparisons. We provide PAC-style guarantees on the quality and sampling cost of the solution obtained by ExpGreedy. We demonstrate the effectiveness of our approach in an interactive, crowdsourced image collection summarization application.
Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization
Singla, Adish, Tschiatschek, Sebastian, Krause, Andreas
We address the problem of maximizing an unknown submodular function that can only be accessed via noisy evaluations. Our work is motivated by the task of summarizing content, e.g., image collections, by leveraging users' feedback in form of clicks or ratings. For summarization tasks with the goal of maximizing coverage and diversity, submodular set functions are a natural choice. When the underlying submodular function is unknown, users' feedback can provide noisy evaluations of the function that we seek to maximize. We provide a generic algorithm -- \submM{} -- for maximizing an unknown submodular function under cardinality constraints. This algorithm makes use of a novel exploration module -- \blbox{} -- that proposes good elements based on adaptively sampling noisy function evaluations. \blbox{} is able to accommodate different kinds of observation models such as value queries and pairwise comparisons. We provide PAC-style guarantees on the quality and sampling cost of the solution obtained by \submM{}. We demonstrate the effectiveness of our approach in an interactive, crowdsourced image collection summarization application.
Crowd Access Path Optimization: Diversity Matters
Nushi, Besmira (ETH Zurich) | Singla, Adish (ETH Zurich) | Gruenheid, Anja (ETH Zurich) | Zamanian, Erfan (Brown University) | Krause, Andreas (ETH Zurich) | Kossmann, Donald (ETH Zurich)
Quality assurance is one the most important challenges in crowdsourcing. Assigning tasks to several workers to increase quality through redundant answers can be expensive if asking homogeneous sources. This limitation has been overlooked by current crowdsourcing platforms resulting therefore in costly solutions. In order to achieve desirable cost-quality tradeoffs it is essential to apply efficient crowd access optimization techniques. Our work argues that optimization needs to be aware of diversity and correlation of information within groups of individuals so that crowdsourcing redundancy can be adequately planned beforehand. Based on this intuitive idea, we introduce the Access Path Model (APM), a novel crowd model that leverages the notion of access paths as an alternative way of retrieving information. APM aggregates answers ensuring high quality and meaningful confidence. Moreover, we devise a greedy optimization algorithm for this model that finds a provably good approximate plan to access the crowd. We evaluate our approach on three crowdsourced datasets that illustrate various aspects of the problem. Our results show that the Access Path Model combined with greedy optimization is cost-efficient and practical to overcome common difficulties in large-scale crowdsourcing like data sparsity and anonymity.
Information Gathering in Networks via Active Exploration
Singla, Adish (ETH Zurich) | Horvitz, Eric (Microsoft Research) | Kohli, Pushmeet (Microsoft Research) | White, Ryen (Microsoft Research) | Krause, Andreas (ETH Zurich)
How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all nodes that can be selected is known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present a general algorithm \elgreedy for this problem, and provide theoretical bounds on its performance dependent on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from social question answering system deployed within a large enterprise.
Information Gathering in Networks via Active Exploration
Singla, Adish, Horvitz, Eric, Kohli, Pushmeet, White, Ryen, Krause, Andreas
How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all nodes that can be selected is known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present a general algorithm NetExp for this problem, and provide theoretical bounds on its performance dependent on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from social question answering system deployed within a large enterprise.
Incentivizing Users for Balancing Bike Sharing Systems
Singla, Adish (ETH Zurich) | Santoni, Marco (ElectricFeel Mobility Systems) | Bartรณk, Gรกbor (ETH Zurich) | Mukerji, Pratik (ElectricFeel Mobility Systems) | Meenen, Moritz (ElectricFeel Mobility Systems) | Krause, Andreas (ETH Zurich)
Bike sharing systems have been recently adopted by a growing number of cities as a new means of transportation offering citizens a flexible, fast and green alternative for mobility. Users can pick up or drop off the bicycles at a station of their choice without prior notice or time planning. This increased flexibility comes with the challenge of unpredictable and fluctuating demand as well as irregular flow patterns of the bikes. As a result, these systems can incur imbalance problems such as the unavailability of bikes or parking docks at stations. In this light, operators deploy fleets of vehicles which re-distribute the bikes in order to guarantee a desirable service level. Can we engage the users themselves to solve the imbalance problem in bike sharing systems? In this paper, we address this question and present a crowdsourcing mechanism that incentivizes the users in the bike repositioning process by providing them with alternate choices to pick or return bikes in exchange for monetary incentives. We design the complete architecture of the incentives system which employs optimal pricing policies using the approach of regret minimization in online learning. We investigate the incentive compatibility of our mechanism and extensively evaluate it through simulations based on data collected via a survey study. Finally, we deployed the proposed system through a smartphone app among users of a large scale bike sharing system operated by a public transport company, and we provide results from this experimental deployment. To our knowledge, this is the first dynamic incentives system for bikes re-distribution ever deployed in a real-world bike sharing system.
Stochastic Privacy
Singla, Adish (ETH Zurich) | Horvitz, Eric (Microsoft Research) | Kamar, Ece (Microsoft Research) | White, Ryen (Microsoft Research)
Online services such as web search and e-commerce applications typically rely on the collection of data about users, including details of their activities on the web. Such personal data is used to maximize revenues via targeting of advertisements and longer engagements of users, and to enhance the quality of service via personalization of content. To date, service providers have largely followed the approach of either requiring or requesting consent for collecting user data. Users may be willing to share private information in return for incentives, enhanced services, or assurances about the nature and extent of the logged data. We introduce stochastic privacy, an approach to privacy centering on the simple concept of providing people with a guarantee that the probability that their personal data will be shared does not exceed a given bound. Such a probability, which we refer to as the privacy risk, can be given by users as a preference or communicated as a policy by a service provider. Service providers can work to personalize and to optimize revenues in accordance with preferences about privacy risk. We present procedures, proofs, and an overall system for maximizing the quality of services, while respecting bounds on privacy risk. We demonstrate the methodology with a case study and evaluation of the procedures applied to web search personalization. We show how we can achieve near-optimal utility of accessing information with provable guarantees on the probability of sharing data.