Oceania
Complexity of and Algorithms for Borda Manipulation
Davies, Jessica (University of Toronto) | Katsirelos, George (LRI, Universite Paris Sud 11) | Narodytska, Nina (NICTA and University of New South Wales) | Walsh, Toby (NICTA and University of New South Wales)
We prove that it is NP-hard for a coalition of two manipulators to compute how to manipulate the Borda voting rule. This resolves one of the last open problems in the computational complexity of manipulating common voting rules. Because of this NP-hardness, we treat computing a manipulation as an approximation problem where we try to minimize the number of manipulators. Based on ideas from bin packing and multiprocessor scheduling, we propose two new approximation methods to compute manipulations of the Borda rule. Experiments show that these methods significantly outperform the previous best known approximation method. We are able to find optimal manipulations in almost all the randomly generated elections tested. Our results suggest that, whilst computing a manipulation of the Borda rule by a coalition is NP-hard, computational complexity may provide only a weak barrier against manipulation in practice.
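As a rough illustration of the problem setting, the Borda rule and a simple greedy manipulation heuristic can be sketched as follows. This is not one of the two methods proposed in the paper; the function names, the strict-win condition, and the alphabetical tie-breaking are assumptions made for the example.

```python
def borda_scores(votes, candidates):
    """Each vote ranks all candidates; position p (0-based) earns m - 1 - p points."""
    m = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in votes:
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return scores


def greedy_manipulators_needed(votes, candidates, preferred, max_manipulators=100):
    """Add manipulator votes one at a time until `preferred` strictly wins.

    Hypothetical heuristic: every manipulator ranks `preferred` first and
    gives the remaining points to the other candidates in order of
    increasing current score, so the strongest rival gets the fewest points.
    Returns the number of manipulators used, which is an upper bound on
    the optimum, or None if the limit is reached.
    """
    m = len(candidates)
    scores = borda_scores(votes, candidates)
    others = [c for c in candidates if c != preferred]
    for used in range(max_manipulators + 1):
        if all(scores[preferred] > scores[c] for c in others):
            return used
        # Ties between rivals are broken alphabetically (an assumption).
        ranking = [preferred] + sorted(others, key=lambda c: (scores[c], c))
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return None


votes = [["b", "a", "c"], ["b", "c", "a"], ["c", "b", "a"]]
print(borda_scores(votes, ["a", "b", "c"]))
print(greedy_manipulators_needed(votes, ["a", "b", "c"], "a"))
```

Because the heuristic only yields an upper bound, comparing it against an exact search on small elections is one way to see how often a greedy answer happens to be optimal.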
Dominating Manipulations in Voting with Partial Information
Conitzer, Vincent (Duke University) | Walsh, Toby (NICTA and University of New South Wales) | Xia, Lirong (Duke University)
We consider manipulation problems when the manipulator only has partial information about the votes of the non-manipulators. Such partial information is described by an information set, which is the set of profiles of the non-manipulators that are indistinguishable to the manipulator. Given such an information set, a dominating manipulation is a non-truthful vote that the manipulator can cast which makes the winner at least as preferable (and sometimes more preferable) as the winner when the manipulator votes truthfully. When the manipulator has full information, computing whether or not there exists a dominating manipulation is in P for many common voting rules (by known results). We show that when the manipulator has no information, there is no dominating manipulation for many common voting rules. When the manipulator's information is represented by partial orders and only a small portion of the preferences are unknown, computing a dominating manipulation is NP-hard for many common voting rules. Our results thus throw light on whether we can prevent strategic behavior by limiting information about the votes of other voters.
Nonnegative Spectral Clustering with Discriminative Regularization
Yang, Yi (The University of Queensland) | Shen, Heng Tao (The University of Queensland) | Nie, Feiping (University of Texas at Arlington) | Ji, Rongrong (Columbia University) | Zhou, Xiaofang (The University of Queensland)
Clustering is a fundamental research topic in the field of data mining. Optimizing the objective functions of clustering algorithms, e.g. normalized cut and k-means, is an NP-hard optimization problem. Existing algorithms usually relax the elements of the cluster indicator matrix from discrete values to continuous ones. Eigenvalue decomposition is then performed to obtain a relaxed continuous solution, which must be discretized. The main problem is that the signs of the relaxed continuous solution are mixed. Such results may deviate severely from the true solution, making it a nontrivial task to get the cluster labels. To address the problem, we impose an explicit nonnegative constraint for a more accurate solution during the relaxation. We also introduce a discriminative regularization into the objective to avoid overfitting. A new iterative approach is proposed to optimize the objective. We show that the algorithm is a general one which naturally leads to other extensions. Experiments demonstrate the effectiveness of our algorithm.
Efficiently Learning a Distance Metric for Large Margin Nearest Neighbor Classification
Park, Kyoungup (The Australian National University and NICTA) | Shen, Chunhua (University of Adelaide and NICTA) | Hao, Zhihui (Beijing Institute of Technology) | Kim, Junae (The Australian National University and NICTA)
We consider the problem of learning a Mahalanobis distance metric for improving nearest neighbor classification. Our work is built upon the large margin nearest neighbor (LMNN) classification framework. Due to the semidefiniteness constraint in the optimization problem of LMNN, it is not scalable in terms of the dimensionality of the input data. The original LMNN solver partially alleviates this problem by adopting alternating projection methods instead of standard interior-point methods. Still, at each iteration, the computational complexity is at least O(D^3), where D is the dimension of the input data. In this work, we propose a column generation based algorithm to solve the LMNN optimization problem much more efficiently. Our algorithm is much more scalable in that, at each iteration, it does not need a full eigen-decomposition. Instead, we only need to find the leading eigenvalue and its corresponding eigenvector, which is of O(D^2) complexity. Experiments show the efficiency and efficacy of our algorithms.
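To make the cost argument concrete, here is a minimal power iteration sketch in plain Python. Each step is dominated by matrix-vector products, which cost O(D^2) for a dense D x D matrix. The abstract does not say which eigen-solver the authors use, so this routine, its name, and its stopping rule are assumptions. Power iteration converges to the eigenvalue of largest magnitude, which matches the leading eigenvalue only when that one dominates in absolute value, and the uniform start vector must not be orthogonal to the leading eigenvector.

```python
import math


def mat_vec(A, v):
    """Dense matrix-vector product: O(D^2) operations for a D x D matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, iterations=200, tol=1e-12):
    """Estimate the dominant eigenpair of a symmetric matrix A (list of rows)."""
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = mat_vec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v  # A maps v to zero; nothing more to learn from it
        v_next = [x / norm for x in w]
        # Rayleigh quotient of the normalised iterate.
        next_eigenvalue = sum(x * y for x, y in zip(v_next, mat_vec(A, v_next)))
        converged = abs(next_eigenvalue - eigenvalue) < tol
        v, eigenvalue = v_next, next_eigenvalue
        if converged:
            break
    return eigenvalue, v


A = [[4.0, 1.0], [1.0, 3.0]]
leading_value, leading_vector = power_iteration(A)
print(leading_value, leading_vector)
```

For the 2 x 2 example the exact leading eigenvalue is (7 + sqrt(5)) / 2, so the estimate can be checked by hand.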
Mean Field Inference in Dependency Networks: An Empirical Study
Lowd, Daniel (University of Oregon) | Shamaei, Arash (University of Oregon)
Dependency networks are a compelling alternative to Bayesian networks for learning joint probability distributions from data and using them to compute probabilities. A dependency network consists of a set of conditional probability distributions, each representing the probability of a single variable given its Markov blanket. Running Gibbs sampling with these conditional distributions produces a joint distribution that can be used to answer queries, but suffers from the traditional slowness of sampling-based inference. In this paper, we observe that the mean field update equation can be applied to dependency networks, even though the conditional probability distributions may be inconsistent with each other. In experiments with learning and inference on 12 datasets, we demonstrate that mean field inference in dependency networks offers similar accuracy to Gibbs sampling but with orders of magnitude improvements in speed. Compared to Bayesian networks learned on the same data, dependency networks offer higher accuracy at greater amounts of evidence. Furthermore, mean field inference is consistently more accurate in dependency networks than in Bayesian networks learned on the same data.
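The standard mean field update sets each marginal q_i(x_i) proportional to exp(E_q[log P(x_i | Markov blanket)]). A minimal sketch for binary variables follows, assuming each conditional distribution is stored as a table over the values of the variable's Markov blanket. The data layout and function name are hypothetical, and the sketch makes no claim to match the authors' implementation.

```python
import math
from itertools import product


def mean_field(cpds, blankets, iterations=50):
    """Mean field marginals for a binary dependency network.

    cpds[i] maps a tuple of Markov-blanket values to P(X_i = 1 | blanket);
    every table entry must lie strictly between 0 and 1.
    blankets[i] lists the indices of the variables in X_i's Markov blanket.
    Returns q, where q[i] approximates P(X_i = 1).
    """
    n = len(cpds)
    q = [0.5] * n
    for _ in range(iterations):
        for i in range(n):
            # Expected log-odds of X_i = 1 under the current factorised q.
            log_odds = 0.0
            for config in product((0, 1), repeat=len(blankets[i])):
                weight = 1.0
                for j, value in zip(blankets[i], config):
                    weight *= q[j] if value == 1 else 1.0 - q[j]
                p1 = cpds[i][config]
                log_odds += weight * (math.log(p1) - math.log(1.0 - p1))
            q[i] = 1.0 / (1.0 + math.exp(-log_odds))
    return q


# Two variables that depend on each other, as a dependency network allows.
cpds = [
    {(0,): 0.2, (1,): 0.7},  # P(X_0 = 1 | X_1)
    {(0,): 0.4, (1,): 0.9},  # P(X_1 = 1 | X_0)
]
print(mean_field(cpds, blankets=[[1], [0]]))
```

The update never requires the conditional tables to come from a single joint distribution, which is why it can still be run when they are mutually inconsistent. The inner loop enumerates every blanket configuration, so this sketch is only practical for small blankets.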
Unsupervised Learning of Human Behaviours
Chua, Sook-Ling (Massey University) | Marsland, Stephen (Massey University) | Guesgen, Hans W. (Massey University)
Behaviour recognition is the process of inferring the behaviour of an individual from a series of observations acquired from sensors such as in a smart home. The majority of existing behaviour recognition systems are based on supervised learning algorithms, which means that training them requires a preprocessed, annotated dataset. Unfortunately, annotating a dataset is a rather tedious process and one that is prone to error. In this paper we suggest a way to identify structure in the data based on text compression and the edit distance between words, without any prior labelling. We demonstrate that by using this method we can automatically identify patterns and segment the data into patterns that correspond to human behaviours. To evaluate the effectiveness of our proposed method, we use a dataset from a smart home and compare the labels produced by our approach with the labels assigned by a human to the activities in the dataset. We find that the results are promising and show significant improvement in the recognition accuracy over Self-Organising Maps (SOMs).
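For readers unfamiliar with the second ingredient, edit distance between two words can be computed with the textbook Levenshtein dynamic programme sketched below. The abstract does not specify which variant or cost scheme the authors use, so unit costs for insertion, deletion, and substitution are an assumption, as is the idea of encoding each sensor event as one character.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (0 if s_char == t_char else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Hypothetical encoding: each character stands for one sensor event.
print(edit_distance("kitten", "sitting"))  # 3
```

A small distance between two event sequences suggests they may be variations of the same underlying behaviour.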
Bounded Forgetting
Zhou, Yi (University of Western Sydney) | Zhang, Yan (University of Western Sydney)
The result of forgetting some predicates in a first-order sentence may not exist, in the sense that it might not be captured by any first-order sentence. This severely restricts the usage of forgetting in applications. To address this issue, we propose a notion called $k$-forgetting, also called bounded forgetting in general, for any fixed number $k$. We present several equivalent characterizations of bounded forgetting and show that the result of bounded forgetting, on the one hand, can always be captured by a single first-order sentence and, on the other hand, preserves the information that we are concerned with.
Language Splitting and Relevance-Based Belief Change in Horn Logic
Wu, Maonia (Guizhou University) | Zhang, Dongmo (University of Western Sydney) | Zhang, Mingyi (Guizhou Academy of Sciences)
This paper presents a framework for relevance-based belief change in propositional Horn logic. We first establish a parallel interpolation theorem for Horn logic and show that Parikh's Finest Splitting Theorem holds with Horn formulae. By reformulating Parikh's relevance criterion in the setting of Horn belief change, we construct a relevance-based partial meet Horn contraction operator and provide a representation theorem for the operator. Interestingly, we find that this contraction operator can be fully characterised by Delgrande and Wassermann's postulates for partial meet Horn contraction together with Parikh's relevance postulate, without requiring any change to the postulates, which is qualitatively different from the case in classical propositional logic.
Tracking User-Preference Varying Speed in Collaborative Filtering
Li, Ruijiang (Fudan University) | Li, Bin (University of Technology, Sydney) | Jin, Cheng (Fudan University) | Xue, Xiangyang (Fudan University) | Zhu, Xingquan (University of Technology, Sydney)
In real-world recommender systems, some users are easily influenced by new products, whereas others are unwilling to change their minds, so the speed at which preferences vary differs from user to user. Based on this observation, we propose a dynamic nonlinear matrix factorization model for collaborative filtering, which aims to improve rating prediction performance and to track the preference varying speeds of different users. We assume that user preferences change smoothly over time and that the preference varying speeds differ across users. These two assumptions are incorporated into the proposed model as prior knowledge on user feature vectors, which can be learned efficiently by MAP estimation. The experimental results show that our method not only achieves state-of-the-art performance in the rating prediction task, but also provides an effective way to track user-preference varying speed.
A Comparison of Lex Bounds for Multiset Variables in Constraint Programming
Law, Yat Chiu (The Chinese University of Hong Kong) | Lee, Jimmy Ho Man (The Chinese University of Hong Kong) | Woo, May Hiu Chun (The Chinese University of Hong Kong) | Walsh, Toby (NICTA and the University of New South Wales)
Set and multiset variables in constraint programming have typically been represented using subset bounds. However, this is a weak representation that neglects potentially useful information about a set such as its cardinality. For set variables, the length-lex (LL) representation successfully provides information about the length (cardinality) and position in the lexicographic ordering. For multiset variables, where elements can be repeated, we consider richer representations that take into account additional information. We study eight different representations in which we maintain bounds according to one of the eight different orderings: length-(co)lex (LL/LC), variety-(co)lex (VL/VC), length-variety-(co)lex (LVL/LVC), and variety-length-(co)lex (VLL/VLC) orderings. These representations integrate together information about the cardinality, variety (number of distinct elements in the multiset), and position in some total ordering. Theoretical and empirical comparisons of expressiveness and compactness of the eight representations suggest that length-variety-(co)lex (LVL/LVC) and variety-length-(co)lex (VLL/VLC) usually give tighter bounds after constraint propagation. We implement the eight representations and evaluate them against the subset bounds representation with cardinality and variety reasoning. Results demonstrate that they offer significantly better pruning and runtime.
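The orderings named above can be illustrated with sort keys that compare cardinality, variety, and lexicographic position in different priority orders. The sketch covers only the lex variants and assumes that lexicographic position means comparing the ascending sorted element sequences. The abstract does not state the exact convention, and the function names are hypothetical.

```python
def length_lex_key(multiset):
    """LL: compare by cardinality, then lexicographically."""
    items = sorted(multiset)
    return (len(items), items)


def variety_lex_key(multiset):
    """VL: compare by number of distinct elements, then lexicographically."""
    items = sorted(multiset)
    return (len(set(items)), items)


def length_variety_lex_key(multiset):
    """LVL: cardinality, then variety, then lexicographic position."""
    items = sorted(multiset)
    return (len(items), len(set(items)), items)


def variety_length_lex_key(multiset):
    """VLL: variety, then cardinality, then lexicographic position."""
    items = sorted(multiset)
    return (len(set(items)), len(items), items)


multisets = [[1, 2], [3, 3, 3], [1, 1, 2], [1, 2, 3]]
print(sorted(multisets, key=length_variety_lex_key))
print(sorted(multisets, key=variety_length_lex_key))
```

Under LVL the multiset [1, 2] precedes [3, 3, 3] because it is smaller, whereas under VLL the order flips because [3, 3, 3] has only one distinct element. A pair of bounds in one of these orderings therefore encodes cardinality and variety information that plain subset bounds leave out.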