Overview
Algorithm Runtime Prediction: Methods and Evaluation (Extended Abstract)
Hutter, Frank (University of Freiburg) | Xu, Lin (University of British Columbia) | Hoos, Holger (University of British Columbia) | Leyton-Brown, Kevin (University of British Columbia)
Perhaps surprisingly, it is possible to predict how long an algorithm will take to run on a previously unseen input, using machine learning techniques to build a model of the algorithm's runtime as a function of problem-specific instance features. Such models have many important applications and over the past decade, a wide variety of techniques have been studied for building such models. In this extended abstract of our 2014 AI Journal article of the same title, we summarize existing models and describe new model families and various extensions. In a comprehensive empirical analyis using 11 algorithms and 35 instance distributions spanning a wide range of hard combinatorial problems, we demonstrate that our new models yield substantially better runtime predictions than previous approaches in terms of their generalization to new problem instances, to new algorithms from a parameterized space, and to both simultaneously.
Mobile Query Recommendation via Tensor Function Learning
Zhao, Zhou (Zhejiang University) | Song, Ruihua (Microsoft Research, Beijing) | Xie, Xing (Microsoft Research, Beijing) | He, Xiaofei (Zhejiang University) | Zhuang, Yueting (Zhejiang University)
With the prevalence of mobile search nowadays, the benefits of mobile query recommendation are well recognized, which provide formulated queries sticking to usersโ search intent. In this paper, we introduce the problem of query recommendation on mobile devices and model the user-location-query relations with a tensor representation. Unlike previous studies based on tensor decomposition, we study this problem via tensor function learning. That is, we learn the tensor function from the side information of users, locations and queries, and then predict usersโ search intent. We develop an efficient alternating direction method of multipliers (ADMM) scheme to solve the introduced problem. We empirically evaluate our approach based on the mobile query dataset from Bing search engine in the city of Beijing, China, and show that our method can outperform several state-of-the-art approaches.
Bayesian Active Learning for Posterior Estimation
Kandasamy, Kirthevasan (Carnegie Mellon University) | Schneider, Jeff (Carnegie Mellon University) | Poczos, Barnabas (Carnegie Mellon University)
This paper studies active posterior estimation in a Bayesian setting when the likelihood is expensive to evaluate. Existing techniques for posterior estimation are based on generating samples representative of the posterior. Such methods do not consider efficiency in terms of likelihood evaluations. In order to be query efficient we treat posterior estimation in an active regression framework. ย We propose two myopic query strategies to choose where to evaluate the likelihood and implement them using Gaussian processes. Via experiments on a series of synthetic and real examples we demonstrate that our approach is significantly more query efficient than existing techniques and other heuristics for posterior estimation.
Pre-release Prediction of Crowd Opinion on Movies by Label Distribution Learning
Geng, Xin (Southeast University) | Hou, Peng (Southeast University)
This paper studies an interesting problem: is it possible to predict the crowd opinion about a movie before the movie is actually released? The crowd opinion is here expressed by the distribution of ratings given by a sufficient amount of people. Consequently, the pre-release crowd opinion prediction can be regarded as a Label Distribution Learning (LDL) problem. In order to solve this problem, a Label Distribution Support Vector Regressor (LDSVR) is proposed in this paper. The basic idea of LDSVR is to fit a sigmoid function to each component of the label distribution simultaneously by a multi-output support vector machine. Experimental results show that LDSVR can accurately predict peoplesโs rating distribution about a movie just based on the pre-release metadata of the movie.
Probabilistic Belief Contraction Using Argumentation
Chhogyal, Kinzang (Griffith University and Macquarie Unversity) | Nayak, Abhaya (Macquarie Univeristy) | Zhuang, Zhiqiang (Griffith University) | Sattar, Abdul (Griffith Unversity)
When a belief state is represented as a probability function P, the resulting belief state of the contraction of a sentence (belief) from the original belief state P can be given by the probabilistic version of the Harper Identity. Specifically, the result of contracting P by a sentence h is taken to be the mixture of two states: the original state P, and the resultant state P* ~h of revising P by the negation of h. What proportion of P and P* ~h should be used in this mixture remains an open issue and is largely ignored in literature. In this paper, we first classify different belief states by their stability, and then exploit the quantitative nature of probabilities and combine it with the basic ideas of argumentation theory to determine ย the mixture proportions. We, therefore, propose a novel approach to probabilistic belief contraction using argumentation.
A Crowdfunding Model for Green Energy Investment
Zheng, Ronghuo (Carnegie Mellon University) | Xu, Ying (Carnegie Mellon University) | Chakraborty, Nilanjan (Stony Brook University) | Sycara, Katia (Carnegie Mellon University)
This paper studies a new renewable energy investment model through crowdfunding, which is motivated by emerging community solar farms. In this paper we develop a sequential game theory model to capture the interactions among crowdfunders, the solar farm owner, and an electricity company who purchases renewable energy generated by the solar farm in a multi-period framework. By characterizing a unique subgame-perfect equilibrium, andcomparing it with a benchmark model without crowdfunding, we find that under crowdfunding although the farm owner reduces its investment level, the overall green energy investment level is increased due to the contribution of crowdfunders. We also find that crowdfunding can increase the penetration of green energy in consumption and thus reduce the energy procurement cost of the electricity company. Finally, the numerical results based on real data indicates crowdfunding is a simple but effective way to boost green generation.
ฮฑ-min: A Compact Approximate Solver For Finite-Horizon POMDPs
Dujardin, Yann (Commonwealth Scientific and Industrial Research Organisation (CSIRO)) | Dietterich, Tom (Oregon State University) | Chades, Iadine (Commonwealth Scientific and Industrial Research Organisation (CSIRO))
In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers.ย One measure of simplicity for POMDP value functions is the number of alpha-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of alpha-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of alpha-vectors.ย Building on previous point-based POMDP solvers, this paper introduces a new algorithm (alpha-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of alpha-vectors. At each time-step, alpha-min calculates alpha-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached allowing a good approximation of the problem with few alpha-vectors . Experimental results show that alpha-min provides good approximate solutions given a fixed number of alpha-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger.
Catch the Black Sheep: Unified Framework for Shilling Attack Detection Based on Fraudulent Action Propagation
Zhang, Yongfeng (Tsinghua University) | Tan, Yunzhi (Tsinghua University) | Zhang, Min (Tsinghua University) | Liu, Yiqun (Tsinghua University) | Chua, Tat-Seng (National University of Singapore) | Ma, Shaoping (Tsinghua University)
Many e-commerce systems allow users to express their opinions towards products through user reviews systems. The user generated reviews not only help other users to gain a more insightful view of the products, but also help online businesses to make targeted improvements on the products or services. Besides, they compose the key component of various personalized recommender systems. However, the existence of spam user accounts in the review systems introduce unfavourable disturbances into personalized recommendation by promoting or degrading targeted items intentionally through fraudulent reviews. Previous shilling attack detection algorithms usually deal with a specific kind of attacking strategy, and are exhausted to handle with the continuously emerging new cheating methods. In this work, we propose to conduct shilling attack detection for more informed recommendation by fraudulent action propagation on the reviews themselves, without caring about the specific underlying cheating strategy, which allows us a unified and flexible framework to detect the spam users.
Exploiting k-Degree Locality to Improve Overlapping Community Detection
Zhang, Hongyi (The Chinese University of Hong Kong) | Lyu, Michael R. (The Chinese University of Hong Kong) | King, Irwin (The Chinese University of Hong Kong)
Community detection is of crucial importance in understanding structures of complex networks. In many real-world networks, communities naturally overlap since a node usually has multiple community memberships. One popular technique to cope with overlapping community detection is Matrix Factorization (MF). However, existing MF-based models have ignored the fact that besides neighbors, "local non-neighbors" (e.g., my friend's friend but not my direct friend) are helpful when discovering communities. In this paper, we propose a Locality-based Non-negative Matrix Factorization (LNMF) model to refine a preference-based model by incorporating locality into learning objective. We define a subgraph called "k-degree local network" to set a boundary between local non-neighbors and other non-neighbors. By discriminately treating these two class of non-neighbors, our model is able to capture the process of community formation. We propose a fast sampling strategy within the stochastic gradient descent based learning algorithm. We compare our LNMF model with several baseline methods on various real-world networks, including large ones with ground-truth communities. Results show that our model outperforms state-of-the-art approaches.
Prime Compilation of Non-Clausal Formulae
Previti, Alessandro (University College Dublin) | Ignatiev, Alexey (INESC-ID, IST) | Morgado, Antonio (INESC-ID, IST) | Marques-Silva, Joao (INESC-ID, IST and University College Dublin)
Formula compilation by generation of prime implicates or implicants finds a wide range of applications in AI. Recent work on formula compilation by prime implicate/implicant generation often assumes a Conjunctive/Disjunctive Normal Form (CNF/DNF) representation. However, in many settings propositional formulae are naturally expressed in non-clausal form. Despite a large body of work on compilation of non-clausal formulae, in practice existing approaches can only be applied to fairly small formulae, containing at most a few hundred variables. This paper describes two novel approaches for the compilation of non-clausal formulae either with prime implicants or implicates, that is based on propositional Satisfiability (SAT) solving. These novel algorithms also find application when computing all prime implicates of a CNF formula. The proposed approach is shown to allow the compilation of non-clausal formulae of size significantly larger than existing approaches.