Government
Complexity of Shift Bribery in Committee Elections
Bredereck, Robert (Technische Universität Berlin) | Faliszewski, Piotr (AGH University of Science and Technology, Krakow) | Niedermeier, Rolf (Technische Universität Berlin) | Talmon, Nimrod (Technische Universität Berlin)
We study the (parameterized) complexity of Shift Bribery for multiwinner voting rules. We focus on the SNTV, Bloc, k-Borda, and Chamberlin-Courant rules, as well as on approximate variants of the Chamberlin-Courant rule, since the original rule is NP-hard to compute. We show that Shift Bribery tends to be significantly harder in the multiwinner setting than in the single-winner one by showing settings where Shift Bribery is easy in the single-winner cases, but is hard (and hard to approximate) in the multiwinner ones. We show that the non-monotonicity of those rules which are based on approximation algorithms for the Chamberlin--Courant rule sometimes affects the complexity of Shift Bribery.
Text Classification with Heterogeneous Information Network Kernels
Wang, Chenguang (Peking University) | Song, Yangqiu (West Virginia University) | Li, Haoran (Peking University) | Zhang, Ming (Peking University) | Han, Jiawei (University of Illinois at Urbana-Champaign)
Text classification is an important problem with many applications. Traditional approaches represent text as a bag-of-words and build classifiers based on this representation. Rather than words, entity phrases, the relations between the entities, as well as the types of the entities and relations carry much more information to represent the texts. This paper presents a novel text as network classification framework, which introduces 1) a structured and typed heterogeneous information networks (HINs) representation of texts, and 2) a meta-path based approach to link texts. We show that with the new representation and links of texts, the structured and typed information of entities and relations can be incorporated into kernels. Particularly, we develop both simple linear kernel and indefinite kernel based on meta-paths in the HIN representation of texts, where we call them HIN-kernels. Using Freebase, a well-known world knowledge base, to construct HIN for texts, our experiments on two benchmark datasets show that the indefinite HIN kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels.
Multitask Generalized Eigenvalue Program
Wang, Boyu (McGill University) | Pineau, Joelle (McGill University) | Balle, Borja (Lancaster University)
We present a novel multitask learning framework called multitask generalized eigenvalue program (MTGEP), which jointly solves multiple related generalized eigenvalue problems (GEPs). This framework is quite general and can be applied to many eigenvalue problems in machine learning and pattern recognition, ranging from supervised learning to unsupervised learning, such as principal component analysis (PCA), Fisher discriminant analysis (FDA), common spatial pattern (CSP), and so on. The core assumption of our approach is that the leading eigenvectors of related GEPs lie in some subspace that can be approximated by a sparse linear combination of basis vectors. As a result, these GEPs can be jointly solved by a sparse coding approach. Empirical evaluation with both synthetic and benchmark real world datasets validates the efficacy and efficiency of the proposed techniques, especially for grouped multitask GEPs.
Finding One's Best Crowd: Online Learning By Exploiting Source Similarity
Liu, Yang (University of Michigan, Ann Arbor) | Liu, Mingyan (University of Michigan, Ann Arbor)
We consider an online learning problem (classification or prediction) involving disparate sources of sequentially arriving data, whereby a user over time learns the best set of data sources to use in constructing the classifier by exploiting their similarity. We first show that, when (1) the similarity information among data sources is known, and (2) data from different sources can be acquired without cost, then a judicious selection of data from different sources can effectively enlarge the training sample size compared to using a single data source, thereby improving the rate and performance of learning; this is achieved by bounding the classification error of the resulting classifier. We then relax assumption (1) and characterize the loss in learning performance when the similarity information must also be acquired through repeated sampling. We further relax both (1) and (2) and present a cost-efficient algorithm that identifies a best crowd from a potentially large set of data sources in terms of both classifier performance and data acquisition cost. This problem has various applications, including online prediction systems with time series data of various forms, such as financial markets, advertisement and network measurement.
Tracking Idea Flows between Social Groups
Zhong, Yangxin (Tsinghua University) | Liu, Shixia (Tsinghua University) | Wang, Xiting (Tsinghua University) | Xiao, Jiannan (Tsinghua University) | Song, Yangqiu (West Virginia University)
In many applications, ideas that are described by a set of words often flow between different groups. To facilitate users in analyzing the flow, we present a method to model the flow behaviors that aims at identifying the lead-lag relationships between word clusters of different user groups. In particular, an improved Bayesian conditional cointegration based on dynamic time warping is employed to learn links between words in different groups. A tensor-based technique is developed to cluster these linked words into different clusters (ideas) and track the flow of ideas. The main feature of the tensor representation is that we introduce two additional dimensions to represent both time and lead-lag relationships. Experiments on both synthetic and real datasets show that our method is more effective than methods based on traditional clustering techniques and achieves better accuracy. A case study was conducted to demonstrate the usefulness of our method in helping users understand the flow of ideas between different user groups on social media.
Personalized Alert Agent for Optimal User Performance
Shvartzon, Avraham (Bar Ilan University) | Azaria, Amos (Carnegie Mellon University) | Kraus, Sarit (Bar Ilan University) | Goldman, Claudia V. (General Motors, Herzeliya) | Meyer, Joachim (Tel Aviv University) | Tsimhoni, Omer (General Motors)
Preventive maintenance is essential for the smooth operation of any equipment. Still, people occasionally do not maintain their equipment adequately. Maintenance alert systems attempt to remind people to perform maintenance. However, most of these systems do not provide alerts at the optimal timing, and nor do they take into account the time required for maintenance or compute the optimal timing for a specific user. We model the problem of maintenance performance, assuming maintenance is time consuming. We solve the optimal policy for the user, i.e., the optimal timing for a user to perform maintenance. This optimal strategy depends on the value of user's time, and thus it may vary from user to user and may change over time. %We present a game Based on the solved optimal strategy we present a personalized maintenance agent, which, depending on the value of user's time, provides alerts to the user when she should perform maintenance. In an experiment using a spaceship computer game, we show that receiving alerts from the personalized alert agent significantly improves user performance.
Solving the Station Repacking Problem
Fréchette, Alexandre (University of British Columbia) | Newman, Neil (University of British Columbia) | Leyton-Brown, Kevin (University of British Columbia)
We investigate the problem of repacking stations in the FCC's upcoming, multi-billion-dollar "incentive auction". Early efforts to solve this problem considered mixed-integer programming formulations, which we show are unable to reliably solve realistic, national-scale problem instances. We describe the result of a multi-year investigation of alternatives: a solver, SATFC, that has been adopted by the FCC for use in the incentive auction. SATFC is based on a SAT encoding paired with a wide range of techniques: constraint graph decomposition; novel caching mechanisms that allow for reuse of partial solutions from related, solved problems; algorithm configuration; algorithm portfolios; and the marriage of local-search and complete solver strategies. We show that our approach solves virtually all of a set of problems derived from auction simulations within the short time budget required in practice.
Optimal Aggregation of Uncertain Preferences
Procaccia, Ariel D. (Carnegie Mellon University) | Shah, Nisarg (Carnegie Mellon University)
A paradigmatic problem in social choice theory deals with the aggregation of subjective preferences of individuals --- represented as rankings of alternatives --- into a social ranking. We are interested in settings where individuals are uncertain about their own preferences, and represent their uncertainty as distributions over rankings. Under the classic objective of minimizing the (expected) sum of Kendall tau distances between the input rankings and the output ranking, we establish that preference elicitation is surprisingly straightforward and near-optimal solutions can be obtained in polynomial time. We show, both in theory and using real data, that ignoring uncertainty altogether can lead to suboptimal outcomes.
Preferences Single-Peaked on Nice Trees
Peters, Dominik (University of Oxford) | Elkind, Edith (University of Oxford )
Preference profiles that are single-peaked on trees enjoy desirable properties: they admit a Condorcet winner (Demange 1982), and there are hard voting problems that become tractable on this domain (Yu et al., 2013). Trick (1989) proposed a polynomial-time algorithm that finds some tree with respect to which a given preference profile is single-peaked. However, some voting problems are only known to be easy for profiles that are single-peaked on "nice" trees, and Trick's algorithm provides no guarantees on the properties of the tree that it outputs. To overcome this issue, we build on the work of Trick and Yu et al. to develop a structural approach that enables us to compactly represent all trees with respect to which a given profile is single-peaked. We show how to use this representation to efficiently find the "best" tree for a given profile, according to a number of criteria; for other criteria, we obtain NP-hardness results. In particular, we show that it is NP-hard to decide whether an input profile is single-peaked with respect to a given tree. To demonstrate the applicability of our framework, we use it to identify a new class of profiles that admit an efficient algorithm for a popular variant of the Chamberlin-Courant rule.
Reinstating Combinatorial Protections for Manipulation and Bribery in Single-Peaked and Nearly Single-Peaked Electorates
Menon, Vijay (University of Waterloo) | Larson, Kate (University of Waterloo)
Understanding when and how computational complexity can be used to protect elections against different manipulative actions has been a highly active research area over the past two decades. A recent body of work, however, has shown that many of the NP-hardness shields, previously obtained, vanish when the electorate has single-peaked or nearly single-peaked preferences. In light of these results, we investigate whether it is possible to reimpose NP-hardness shields for such electorates by allowing the voters to specify partial preferences instead of insisting they cast complete ballots. In particular, we show that in single-peaked and nearly single-peaked electorates, if voters are allowed to submit top-truncated ballots, then the complexity of manipulation and bribery for many voting rules increases from being in P to being NP-complete.