Slogans Are Not Forever: Adapting Linguistic Expressions to the News
Gatti, Lorenzo (FBK-IRST) | Özbal, Gözde (FBK-IRST) | Guerini, Marco (Trento RISE) | Stock, Oliviero (FBK-IRST) | Strapparava, Carlo (FBK-IRST)
Artistic creation is often based on the concept of blending. Linguistic creativity is no exception, as demonstrated, for instance, by the importance of metaphor in poetry. Blending can also be used to evoke a secondary concept while playing with an already given piece of language, either to make the secondary concept clearly perceivable to the reader or, instead, to subtly evoke something additional. Current language technology can contribute substantially here, and automated linguistic creativity is useful when the input or the target changes continuously, making human production infeasible. In this work we present a system that takes existing well-known expressions and innovates them by bringing in a novel concept drawn from evolving news. The technology consists of several steps concerned with selecting suitable concepts and producing novel expressions, largely relying on state-of-the-art corpus-based methods. Proposed applications include: i) producing catchy news headlines by "parasitically" exploiting well-known successful expressions and adapting them to the news at hand; ii) generating adaptive slogans that allude to the news of the day and give life to the concept evoked by the slogan; iii) providing artists with an application for boosting their creativity.
Logic Program Termination Analysis Using Atom Sizes
Calautti, Marco (University of Calabria) | Greco, Sergio (University of Calabria) | Molinaro, Cristian (University of Calabria) | Trubitsyna, Irina (University of Calabria)
Recent years have witnessed a great deal of interest in extending answer set programming with function symbols. Since the evaluation of a program with function symbols might not terminate and checking termination is undecidable, several classes of logic programs have been proposed where the use of function symbols is limited but the program evaluation is guaranteed to terminate. In this paper, we propose a novel class of logic programs whose evaluation always terminates. The proposed technique identifies terminating programs that are not captured by any of the current approaches. Our technique is based on the idea of measuring the size of terms and atoms to check whether the rule head size is bounded by the body, and performs a more fine-grained analysis than previous work. Rather than adopting an all-or-nothing approach (either we can say that the program is terminating or we cannot say anything), our technique can identify arguments that are "limited" (i.e., where there is no infinite propagation of terms) even when the program is not entirely recognized as terminating. Identifying arguments that are limited can support the user in the problem formulation and help other techniques that use limited arguments as a starting point. Another useful feature of our approach is that it is able to leverage external information about limited arguments. We also provide results on the correctness, the complexity, and the expressivity of our technique.
Robust Dictionary Learning with Capped l1-Norm
Jiang, Wenhao (University of Texas at Arlington) | Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington)
Expressing data vectors as sparse linear combinations of basis elements (a dictionary) is widely used in machine learning, signal processing, and statistics. Dictionaries learned from data have been found to be more effective than off-the-shelf ones, and dictionary learning has become an important tool in computer vision. Traditional dictionary learning methods use a quadratic loss function, which is known to be sensitive to outliers; hence they cannot learn good dictionaries when outliers exist. In this paper, aiming at learning dictionaries resistant to outliers, we propose capped l1-norm based dictionary learning and an efficient iteratively re-weighted algorithm to solve the problem. We provide theoretical analysis and carry out extensive experiments on real-world and synthetic datasets to show the effectiveness of our method.
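The abstract does not spell out the update rules, so the following is only a minimal sketch of the iteratively re-weighted idea: samples whose reconstruction residual exceeds a cap stop influencing the dictionary update. The cap eps, the Lasso-based sparse coding step, and the hard 0/1 weights are simplifying assumptions for illustration, not the authors' algorithm.

```python
# Hedged sketch of iteratively re-weighted dictionary learning with a capped loss.
# NOT the paper's exact method: `eps`, the 0/1 weights, and the Lasso coder are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

def capped_dictionary_learning(X, n_atoms=20, eps=1.0, alpha=0.1, n_iter=10, seed=0):
    """X: (n_samples, n_features). Returns dictionary D and sparse codes A."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    D = rng.standard_normal((n_atoms, d))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    A = np.zeros((n, n_atoms))
    for _ in range(n_iter):
        # 1) Sparse coding of every sample against the current dictionary.
        coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=2000)
        for i in range(n):
            coder.fit(D.T, X[i])
            A[i] = coder.coef_
        # 2) Re-weighting: samples whose residual exceeds the cap `eps`
        #    get weight 0, so outliers stop driving the dictionary update.
        residuals = np.linalg.norm(X - A @ D, axis=1)
        w = (residuals <= eps).astype(float)
        # 3) Weighted least-squares dictionary update (ridge-regularised),
        #    followed by renormalisation of the atoms.
        W = np.diag(w)
        D = np.linalg.solve(A.T @ W @ A + 1e-6 * np.eye(n_atoms), A.T @ W @ X)
        D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-12)
    return D, A
```

The hard thresholding of weights mimics the effect of a capped loss (errors beyond the cap contribute a constant); the paper's re-weighting scheme is smoother and comes with the theoretical analysis mentioned above.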
Multi-Task Multi-View Clustering for Non-Negative Data
Zhang, Xianchao (Dalian University of Technology) | Zhang, Xiaotong (Dalian University of Technology) | Liu, Han (Dalian University of Technology)
Multi-task clustering and multi-view clustering have each found wide application and received much attention in recent years. Nevertheless, many clustering problems involve both: the tasks are closely related, and each task can be analyzed from multiple views. In this paper, for non-negative data (e.g., documents), we introduce a multi-task multi-view clustering (MTMVC) framework which integrates within-view-task clustering, multi-view relationship learning and multi-task relationship learning. We then propose a specific algorithm to optimize the MTMVC framework. Experimental results show the superiority of the proposed algorithm over both multi-task clustering algorithms and multi-view clustering algorithms for multi-task clustering of multi-view data.
Modeling Multi-Attribute Demand for Sustainable Cloud Computing with Copulae
Ghasemi, Maryam (Boston University) | Lubin, Benjamin (Boston University)
As cloud computing gains in popularity, understanding the patterns and structure of its loads is increasingly important in order to drive effective resource allocation, scheduling and pricing decisions; the resulting efficiency gains in turn reduce the data center's environmental footprint. Existing models have treated only a single resource type, such as CPU or memory, at a time. We offer a machine learning approach that captures the joint distribution: we model the relationship among multiple resources by carefully fitting both the marginal distribution of each resource type and the non-linear structure of their correlation via a copula distribution. We investigate several choices for both models by studying a public data set of Google data-center usage. We show the Burr XII distribution to be a particularly effective choice for modeling the marginals and the Frank copula to be the best choice for stitching these together into a joint distribution. Our approach offers a significant fidelity improvement and generalizes directly to higher dimensions. In use, this improvement will translate directly into reductions in energy consumption.
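A minimal sketch of the two-stage idea, under stated assumptions: fit Burr XII marginals to two resource columns, map the data to pseudo-observations with the fitted CDFs, and fit the Frank copula parameter by maximum likelihood. The synthetic demo data, the restriction to two resources, and the positive-dependence bound on the copula parameter are assumptions for illustration; the paper's estimation pipeline is more elaborate.

```python
# Hedged sketch: Burr XII marginals + Frank copula fitted to two resource-usage
# series (e.g. CPU and memory). Demo data is synthetic; column names are placeholders.
import numpy as np
from scipy import stats, optimize

def frank_neg_loglik(theta, u, v):
    """Negative log-likelihood of the Frank copula density at pseudo-observations."""
    e = np.exp(-theta)
    num = theta * (1 - e) * np.exp(-theta * (u + v))
    den = ((1 - e) - (1 - np.exp(-theta * u)) * (1 - np.exp(-theta * v))) ** 2
    return -np.sum(np.log(num / den))

def fit_joint_model(cpu, mem):
    # 1) Fit Burr XII marginals to each resource (location fixed at 0).
    cpu_params = stats.burr12.fit(cpu, floc=0)
    mem_params = stats.burr12.fit(mem, floc=0)
    # 2) Map observations to the unit square with the fitted marginal CDFs.
    u = np.clip(stats.burr12.cdf(cpu, *cpu_params), 1e-6, 1 - 1e-6)
    v = np.clip(stats.burr12.cdf(mem, *mem_params), 1e-6, 1 - 1e-6)
    # 3) Fit the Frank copula parameter by maximum likelihood
    #    (restricted here to positive dependence for simplicity).
    res = optimize.minimize_scalar(frank_neg_loglik, bounds=(0.01, 50),
                                   args=(u, v), method="bounded")
    return cpu_params, mem_params, res.x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cpu = stats.burr12.rvs(c=2.0, d=1.5, size=2000, random_state=rng)
    mem = 0.5 * cpu + stats.burr12.rvs(c=2.5, d=1.0, size=2000, random_state=rng)
    print(fit_joint_model(cpu, mem))
```

Separating the marginal fits from the dependence structure is the core appeal of the copula approach: each resource can keep its own heavy-tailed model while the copula captures how they move together.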
Determining Expert Research Areas with Multi-Instance Learning of Hierarchical Multi-Label Classification Model
Wu, Tao (Purdue University) | Wang, Qifan (Purdue University) | Zhang, Zhiwei (Purdue University) | Si, Luo (Purdue University)
Automatically identifying the research areas of academic and industry researchers is an important task for building expertise organizations or search systems. In general, this task can be viewed as text classification that generates a set of research areas given the expertise of a researcher, such as the documents of their publications. The task is challenging, however, because the evidence for a research area may exist in only a few documents rather than in all of them. Moreover, research areas are often organized in a hierarchy, which limits the effectiveness of existing text categorization methods. This paper proposes a novel approach, the Multi-instance Learning of Hierarchical Multi-label Classification Model (MIHML), which effectively identifies multiple research areas in a hierarchy from individual documents within a researcher's profile. An Expectation-Maximization (EM) optimization algorithm is designed to learn the model parameters. Extensive experiments demonstrate the superior performance of the proposed approach on a real-world application.
Dissecting German Grammar and Swiss Passports: Open-Domain Decomposition of Compositional Entries in Large-Scale Knowledge Repositories
Pasca, Marius (Google Inc.) | Buisman, Hylke (Google Inc.)
This paper presents a weakly supervised method that decomposes potentially compositional topics (Swiss passport) into zero or more constituent topics (Switzerland, Passport), where all topics are entries in a knowledge repository. The method increases the connectivity of the knowledge repository and, more importantly, identifies the constituent topics whose meaning can be later aggregated into the meaning of the compositional topics. By exploiting evidence within Wikipedia articles, the method acquires constituent topics of Freebase topics at precision and recall above 0.60, over multiple human-annotated evaluation sets.
Instance-Wise Weighted Nonnegative Matrix Factorization for Aggregating Partitions with Locally Reliable Clusters
Zheng, Xiaodong (Fudan University) | Zhu, Shanfeng (Fudan University) | Gao, Junning (Fudan University) | Mamitsuka, Hiroshi (Kyoto University)
We address an ensemble clustering problem in which reliable clusters are locally embedded in the given multiple partitions. We propose a new nonnegative matrix factorization (NMF)-based method, in which locally reliable clusters are explicitly considered by using instance-wise weights over clusters. Our method factorizes the input cluster assignment matrix into two matrices, H and W, which are optimized by alternately 1) updating H and W while keeping the weight matrix constant and 2) updating the weight matrix while keeping H and W constant. The weights in the second step are updated by solving a convex problem, which makes our algorithm significantly faster than existing NMF-based ensemble clustering methods. Experiments on a variety of datasets show that our method outperforms many state-of-the-art ensemble clustering methods.
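The alternating structure above can be sketched with the standard multiplicative updates for weighted NMF in step 1; the weight update in step 2 is only a placeholder heuristic here, since the abstract does not give the paper's convex subproblem. Matrix names H and W follow the abstract, while S denotes the assumed instance-wise weight matrix.

```python
# Hedged sketch in the spirit of weighted-NMF ensemble clustering.
# Step 1 uses the standard multiplicative updates for min ||S * (X - W H)||_F^2;
# step 2's weight update is a simple placeholder, not the paper's convex solver.
import numpy as np

def weighted_nmf_consensus(X, k, n_iter=200, seed=0):
    """X: nonnegative (n_instances, n_base_clusters) cluster-assignment matrix."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + 1e-3   # consensus cluster indicators
    H = rng.random((k, m)) + 1e-3   # mapping to base-partition clusters
    S = np.ones_like(X)             # instance-wise weights over clusters
    eps = 1e-12
    for _ in range(n_iter):
        # Step 1: update W and H with the weights S held fixed.
        WH = W @ H
        W *= ((S * X) @ H.T) / (((S * WH) @ H.T) + eps)
        WH = W @ H
        H *= (W.T @ (S * X)) / ((W.T @ (S * WH)) + eps)
        # Step 2: update the weights with W and H held fixed. Here: down-weight
        # entries with large reconstruction error (placeholder heuristic).
        R = (X - W @ H) ** 2
        S = 1.0 / (1.0 + R)
    labels = W.argmax(axis=1)
    return W, H, labels
```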
Combining Existential Rules and Transitivity: Next Steps
Baget, Jean-François (Inria, CNRS, and University of Montpellier) | Bienvenu, Meghyn (CNRS and Université Paris-Sud) | Mugnier, Marie-Laure (University of Montpellier, Inria, and CNRS) | Rocher, Swan (University of Montpellier, Inria, and CNRS)
We consider existential rules (aka Datalog +/-) as a formalism for specifying ontologies. In recent years, many classes of existential rules have been exhibited for which conjunctive query (CQ) entailment is decidable. However, most of these classes cannot express transitivity of binary relations, a frequently used modelling construct. In this paper, we address the issue of whether transitivity can be safely combined with decidable classes of existential rules. First, we prove that transitivity is incompatible with one of the simplest decidable classes, namely aGRD (acyclic graph of rule dependencies), which clarifies the landscape of ‘finite expansion sets’ of rules. Second, we show that transitivity can be safely added to linear rules (a subclass of guarded rules, which generalizes the description logic DL-LiteR) in the case of atomic CQs, and also for general CQs if we place a minor syntactic restriction on the rule set. This is shown by means of a novel query rewriting algorithm that is specially tailored to handle transitivity rules. Third, for the identified decidable cases, we pinpoint the combined and data complexities of query entailment.
Feature Selection from Microarray Data via an Ordered Search with Projected Margin
Villela, Saulo Moraes (Federal University of Juiz de Fora) | Leite, Saul de Castro (Federal University of Juiz de Fora) | Neto, Raul Fonseca (Federal University of Juiz de Fora)
Microarray experiments can measure the expression levels of thousands of genes simultaneously, and dealing with this enormous amount of information requires substantial computation. Support Vector Machines (SVMs) have been widely and efficiently used to solve high-dimensional classification problems, so it is natural to develop new feature selection strategies for microarray data that are associated with this type of classifier. We therefore propose a new feature selection method based on an ordered search process that explores the space of possible feature subsets. The algorithm, called Admissible Ordered Search (AOS), uses as its evaluation function the margin value estimated for each hypothesis by an SVM classifier. An important theoretical contribution of this paper is the projected margin concept: the margin vector is projected onto a lower-dimensional subspace, and the resulting value is used as an upper bound on the value of the current hypothesis in the search process. This greatly reduces runtime and makes the search as a whole more efficient. The algorithm was tested on five different microarray datasets, yielding superior results compared to three representative feature selection methods.
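One plausible reading of this procedure is a best-first search over feature subsets, where nodes waiting in the queue are ordered by an optimistic bound obtained by projecting the current SVM weight vector onto the reduced subspace. The sketch below follows that reading with scikit-learn's LinearSVC; the pruning details, the derivation of the bound, and the stopping criterion of the actual AOS algorithm are not reproduced here.

```python
# Hedged sketch of a best-first feature-subset search ranked by SVM margin,
# with a "projected margin" used as the optimistic ordering key for children.
# A simplified reading of the abstract, not the exact AOS algorithm.
import heapq
import numpy as np
from sklearn.svm import LinearSVC

def svm_margin(X, y, feats, C=1.0):
    """Geometric margin 1/||w|| of a linear SVM restricted to the features `feats`."""
    clf = LinearSVC(C=C, max_iter=10000).fit(X[:, feats], y)
    w = clf.coef_.ravel()
    return 1.0 / np.linalg.norm(w), w

def aos_like_search(X, y, target_dim, C=1.0):
    d = X.shape[1]
    root = tuple(range(d))
    m0, _ = svm_margin(X, y, root, C)
    heap = [(-m0, root)]           # max-heap on the bound (heapq is a min-heap)
    seen = {root}
    while heap:
        _, feats = heapq.heappop(heap)
        m, w = svm_margin(X, y, feats, C)   # true margin of this subset
        if len(feats) == target_dim:
            return feats, m                 # first goal popped under the ordering
        for j in range(len(feats)):
            child = feats[:j] + feats[j + 1:]
            if child in seen:
                continue
            seen.add(child)
            # Projected margin: drop coordinate j from w; its margin serves as
            # an optimistic bound on the child's achievable margin.
            w_proj = np.delete(w, j)
            bound = 1.0 / max(np.linalg.norm(w_proj), 1e-12)
            heapq.heappush(heap, (-bound, child))
    return None
```

The point of the projected-margin bound is that it is cheap (no retraining) yet optimistic, so the queue can defer retraining SVMs on unpromising subsets; this is where the runtime savings described in the abstract would come from.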