Overview
What's Hot in Case-Based Reasoning
Goel, Ashok (Georgia Institute of Technology) | Diaz-Agudo, Belen (Complutense University )
Case-based reasoning addresses new problems by remembering and adapting solutions previously used to solve similar problems. Pulled by the increasing number of applications and pushed by a growing interest in memory intensive techniques, research on case-based reasoning appears to be gaining momentum. In this article, we briefly summarize recent developments in research on case-based reasoning based partly on the recent Twenty Fourth International Conference on Case-Based Reasoning.
SAT Competition 2016: Recent Developments
Balyo, Tomas ( Karlsruhe Institute of Technology Karlsruhe, Germany ) | Heule, Marijn J. H. (The University of Texas at Austin) | Jarvisalo, Matti (HIIT, Department of Computer Science University of Helsinki, Finland)
We give an overview of SAT Competition 2016, the 2016 edition of thefamous competition for Boolean satisfiability (SAT) solvers with over 20 years of history. A key aim is to point out ``what's hot'' in SAT competitions in 2016, i.e., new developments in thecompetition series, including new competition tracks and new solver techniquesimplemented in some of the award-winning solvers.
Matrix Factorisation for Scalable Energy Breakdown
Batra, Nipun (IIIT Delhi) | Wang, Hongning (University of Virginia) | Singh, Amarjeet (IIIT Delhi) | Whitehouse, Kamin (University of Virginia)
Homes constitute more than one-thirds of the total energy consumption. Producing an energy breakdown for a home has been shown to reduce household energy consumption by up to 15%, among other benefits. However, existing approaches to produce an energy breakdown require hardware to be installed in each home and are thus prohibitively expensive. In this paper, we propose a novel application of feature-based matrix factorisation that does not require any additional hard- ware installation. The basic premise of our approach is that common design and construction patterns for homes create a repeating structure in their energy data. Thus, a sparse basis can be used to represent energy data from a broad range of homes. We evaluate our approach on 516 homes from a publicly available data set and find it to be more effective than five baseline approaches that either require sensing in each home, or a very rigorous survey across a large number of homes coupled with complex modelling. We also present a deployment of our system as a live web application that can potentially provide energy breakdown to millions of homes.
Distant Domain Transfer Learning
Tan, Ben (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Nanyang Technological University) | Yang, Qiang (Hong Kong University of Science and Technology)
In this paper, we study a novel transfer learning problem termed Distant Domain Transfer Learning (DDTL). Different from existing transfer learning problems which assume that there is a close relation between the source domain and the target domain, in the DDTL problem, the target domain can be totally different from the source domain. For example, the source domain classifies face images but the target domain distinguishes plane images. Inspired by the cognitive processof human where two seemingly unrelated concepts can be connected by learning intermediate concepts gradually, we propose a Selective Learning Algorithm (SLA) to solve the DDTL problem with supervised autoencoder or supervised convolutional autoencoder as a base model for handling different types of inputs. Intuitively, the SLA algorithm selects usefully unlabeled data gradually from intermediate domains as a bridge to break the large distribution gap for transferring knowledge between two distant domains. Empirical studies on image classification problems demonstrate the effectiveness of the proposed algorithm, and on some tasks the improvement in terms of the classification accuracy is up to 17% over โnon-transferโ methods.
Top-k Hierarchical Classification
This paper studies a top-k hierarchical classification problem. In top-k classification, one is allowed to make k predictions and no penalty is incurred if at least one of k predictions is correct. In hierarchical classification, classes form a structured hierarchy, and misclassification costs depend on the relation between the correct class and the incorrect class in the hierarchy. Despite that the fact that both top-k classification and hierarchical classification have gained increasing interests, the two problems have always been studied separately. In this paper, we define a top-k hierarchical loss function using a real world application. We provide the Bayes-optimal solution that minimizes the expected top-k hierarchical misclassification cost. Via numerical experiments, we show that our solution outperforms two baseline methods that address only one of the two issues.
Multidimensional Scaling on Multiple Input Distance Matrices
Bai, Song (Huazhong University of Science and Technology) | Bai, Xiang ( Huazhong University of Science and Technology ) | Latecki, Longin Jan ( Temple University ) | Tian, Qi ( University of Texas at San Antonio )
Multidimensional Scaling (MDS) is a classic technique that seeks vectorial representations for data points, given the pairwise distances between them. In recent years, data are usually collected from diverse sources or have multiple heterogeneous representations. However, how to do multidimensional scaling on multiple input distance matrices is still unsolved to our best knowledge. In this paper, we first define this new task formally. Then, we propose a new algorithm called Multi-View Multidimensional Scaling (MVMDS) by considering each input distance matrix as one view. The proposed algorithm can learn the weights of views (i.e., distance matrices) automatically by exploring the consensus information and complementary nature of views. Experimental results on synthetic as well as real datasets demonstrate the effectiveness of MVMDS. We hope that our work encourages a wider consideration in many domains where MDS is needed.
LPMLN, Weak Constraints, and P-log
Lee, Joohyung (Arizona State University) | Yang, Zhun (Arizona State University)
LP MLN is a recently introduced formalism that extends answer set programs by adopting the log-linear weight scheme of Markov Logic. This paper investigates the relationships between LPMLN and two other extensions of answer set programs: weak constraints to express a quantitative preference among answer sets, and P-log to incorporate probabilistic uncertainty. We present a translation of LP MLN into programs with weak constraints and a translation of P-log into LPMLN, which complement the existing translations in the opposite directions. The first translation allows us to compute the most probable stable models (i.e., MAP estimates) of LP MLN programs using standard ASP solvers. This result can be extended to other formalisms, such as Markov Logic, ProbLog, and Pearl's Causal Models, that are shown to be translatable into LP MLN . The second translation tells us how probabilistic nonmonotonicity (the ability of the reasoner to change his probabilistic model as a result of new information) of P-log can be represented in LP MLN , which yields a way to compute P-log using standard ASP solvers and MLN solvers.
Faster and Simpler Algorithm for Optimal Strategies of Blotto Game
Behnezhad, Soheil (University of Maryland) | Dehghani, Sina (University of Maryland) | Derakhshan, Mahsa (University of Maryland) | Hajiaghayi, MohammadTaghi (University of Maryland) | Seddighin, Saeed (University of Maryland)
In the Colonel Blotto game, which was initially introduced by Borel in 1921, two colonels simultaneously distribute their troops across different battle๏ฌelds.The winner of each battle๏ฌeld is determined independently by a winner-take-all rule. The ultimate payoff of each colonel is the number of battle๏ฌelds he wins. This game is commonly used for analyzing a wide range of applications such as the U.S presidential election, innovative technology competitions, advertisements, etc. There have been persistent efforts for ๏ฌnding the optimal strategies for the Colonel Blotto game. After almost a century Ahmadinejad, Dehghani, Hajiaghayi, Lucier, Mahini, and Seddighin provided a poly-time algorithm for ๏ฌnding the optimal strategies. They ๏ฌrst model the problem by a Linear Program (LP) with exponential number of constraints and use Ellipsoid method to solve it. However, despite the theoretical importance of their algorithm, it ishighly impractical. In general, even Simplex method (despite its exponential running-time) performs better than Ellipsoid method in practice. In this paper, we provide the ๏ฌrst polynomial-size LP formulation of the optimal strategies for the Colonel Blotto game. We use linear extension techniques. Roughly speaking, we project the strategy space polytope to a higher dimensional space, which results in a lower number of facets for the polytope.We use this polynomial-size LP to provide a novel, simpler and signi๏ฌcantly faster algorithm for ๏ฌnding the optimal strategies for the Colonel Blotto game. We further show this representation is asymptotically tight in terms of the number of constraints. We also extend our approach to multi-dimensional Colonel Blotto games, and implement our algorithm to observe interesting properties of Colonel Blotto; for example, we observe the behavior of players in the discrete model is very similar to the previously studied continuous model.
Expectile Matrix Factorization for Skewed Data Analysis
Zhu, Rui (University of Alberta) | Niu, Di (University of Alberta) | Kong, Linglong (University of Alberta ) | Li, Zongpeng (University of Calgary)
Matrix factorization is a popular approach to solving matrix estimation problems based on partial observations. Existing matrix factorization is based on least squares and aims to yield a low-rank matrix to interpret the conditional sample means given the observations. However, in many real applications with skewed and extreme data, least squares cannot explain their central tendency or tail distributions, yielding undesired estimates. In this paper, we propose expectile matrix factorization by introducing asymmetric least squares, a key concept in expectile regression analysis, into the matrix factorization framework. We propose an efficient algorithm to solve the new problem based on alternating minimization and quadratic programming. We prove that our algorithm converges to a global optimum and exactly recovers the true underlying low-rank matrices when noise is zero. For synthetic data with skewed noise and a real-world dataset containing web service response times, the proposed scheme achieves lower recovery errors than the existing matrix factorization method based on least squares in a wide range of settings.
Web-Based Semantic Fragment Discovery for On-Line Lingual-Visual Similarity
Sun, Xiaoshuai (The University of Queensland and Harbin Institute of Technology) | Cao, Jiewei (The University of Queensland) | Li, Chao (The University of Queensland) | Zhu, Lei (The University of Queensland) | Shen, Heng Tao (The University of Queensland and University of Electronic Science and Technology of China)
In this paper, we present an automatic approach for on-line discovery of visual-lingual semantic fragments from weakly labeled Internet images. Instead of learning region-entity correspondences from well-labeled image-sentence pairs, our approach directly collects and enhances the weakly labeled visual contents from the Web and constructs an adaptive visual representation which automatically links generic lingual phrases to their related visual contents. To ensure reliable and efficient semantic discovery, we adopt non-parametric density estimation to re-rank the related visual instances and proposed a fast self-similarity-based quality assessment method to identify the high-quality semantic fragments. The discovered semantic fragments provide an adaptive joint representation for texts and images, based on which lingual-visual similarity can be defined for further co-analysis of heterogeneous multimedia data. Experimental results on semantic fragment quality assessment, sentence-based image retrieval, automatic multimedia insertion and ordering demonstrated the effectiveness of the proposed framework.The experiments show that the proposed methods can make effective use of the Web knowledge, and are able to generate competitive results compared to state-of-the-art approaches in various tasks.