Genre
The Kernel Pitman-Yor Process
Chatzis, Sotirios P., Korkinof, Dimitrios, Demiris, Yiannis
Nonparametric Bayesian modeling techniques, especially Dirichlet process mixture (DPM) models, have become very popular in statistics over the last few years, for performing nonparametric density estimation [1], [2], [3]. This theory is based on the observation that an infinite number of component distributions in an ordinary finite mixture model (clustering model) tends on the limit to a Dirichlet process (DP) prior [2], [4]. Eventually, the nonparametric Bayesian inference scheme induced by a DPM model yields a posterior distribution on the proper number of model component densities (inferred clusters) [5], rather than selecting a fixed number of mixture components. Hence, the obtained nonparametric Bayesian formulation eliminates the need of doing inference (or making arbitrary choices) on the number of mixture components (clusters) necessary to represent the modeled data. An interesting alternative to the Dirichlet process prior for nonparametric Bayesian modeling is the Pitman-Yor process (PYP) prior [6]. Pitman-Yor processes produce power-law distributions that allow for better modeling populations comprising a high number of clusters with low popularity and a low number of clusters with high popularity [7]. Indeed, the Pitman-Yor process prior can be viewed as a generalization of the Dirichlet process prior, and reduces to it for a specific selection of its parameter values. In [8], a Gaussian process-based coupled PYP method for joint segmentation of multiple images is proposed.
Relational Theories with Null Values and Non-Herbrand Stable Models
Lifschitz, Vladimir, Pichotta, Karl, Yang, Fangkai
Generalized relational theories with null values in the sense of Reiter are first-order theories that provide a semantics for relational databases with incomplete information. In this paper we show that any such theory can be turned into an equivalent logic program, so that models of the theory can be generated using computational methods of answer set programming. As a step towards this goal, we develop a general method for calculating stable models under the domain closure assumption but without the unique name assumption.
Local Optima Networks, Landscape Autocorrelation and Heuristic Search Performance
Chicano, Francisco, Daolio, Fabio, Ochoa, Gabriela, Verel, Sรฉbastien, Tomassini, Marco, Alba, Enrique
Recent developments in fitness landscape analysis include the study of Local Optima Networks (LON) and applications of the Elementary Landscapes theory. This paper represents a first step at combining these two tools to explore their ability to forecast the performance of search algorithms. We base our analysis on the Quadratic Assignment Problem (QAP) and conduct a large statistical study over 600 generated instances of different types. Our results reveal interesting links between the network measures, the autocorrelation measures and the performance of heuristic search algorithms.
Local optima networks and the performance of iterated local search
Daolio, Fabio, Verel, Sรฉbastien, Ochoa, Gabriela, Tomassini, Marco
Local Optima Networks (LONs) have been recently proposed as an alternative model of combinatorial fitness landscapes. The model compresses the information given by the whole search space into a smaller mathematical object that is the graph having as vertices the local optima and as edges the possible weighted transitions between them. A new set of metrics can be derived from this model that capture the distribution and connectivity of the local optima in the underlying configuration space. This paper departs from the descriptive analysis of local optima networks, and actively studies the correlation between network features and the performance of a local search heuristic. The NK family of landscapes and the Iterated Local Search metaheuristic are considered. With a statistically-sound approach based on multiple linear regression, it is shown that some LONs' features strongly influence and can even partly predict the performance of a heuristic search algorithm. This study validates the expressive power of LONs as a model of combinatorial fitness landscapes.
Introduction to the 28th International Conference on Logic Programming Special Issue
Dovier, Agostino, Costa, Vรญtor Santos
We are proud to introduce this special issue of the Journal of Theory and Practice of Logic Programming (TPLP), dedicated to the full papers accepted for the 28th International Conference on Logic Programming (ICLP). The ICLP meetings started in Marseille in 1982 and since then constitute the main venue for presenting and discussing work in the area of logic programming. We contributed to ICLP for the first time in 1991. The first guest-editor had a paper on logic programming with sets, and the second had two papers on the parallel implementation of the Andorra model. Since then, we continued pursuing research in this exciting area and ICLP has always been the major venue for our work.
Opinion Mining for Relating Subjective Expressions and Annual Earnings in US Financial Statements
Chen, Chien-Liang, Liu, Chao-Lin, Chang, Yuan-Chen, Tsai, Hsiang-Ping
Financial statements contain quantitative information and manager's subjective evaluation of firm's financial status. Using information released in U.S. 10-K filings. Both qualitative and quantitative appraisals are crucial for quality financial decisions. To extract such opinioned statements from the reports, we built tagging models based on the conditional random field (CRF) techniques, considering a variety of combinations of linguistic factors including morphology, orthography, predicate-argument structure, syntax, and simple semantics. Our results show that the CRF models are reasonably effective to find opinion holders in experiments when we adopted the popular MPQA corpus for training and testing. The contribution of our paper is to identify opinion patterns in multiword expressions (MWEs) forms rather than in single word forms. We find that the managers of corporations attempt to use more optimistic words to obfuscate negative financial performance and to accentuate the positive financial performance. Our results also show that decreasing earnings were often accompanied by ambiguous and mild statements in the reporting year and that increasing earnings were stated in assertive and positive way.
Quick Summary
Quick Summary is an innovate implementation of an automatic document summarizer that inputs a document in the English language and evaluates each sentence. The scanner or evaluator determines criteria based on its grammatical structure and place in the paragraph. The program then asks the user to specify the number of sentences the person wishes to highlight. For example should the user ask to have three of the most important sentences, it would highlight the first and most important sentence in green. Commonly this is the sentence containing the conclusion. Then Quick Summary finds the second most important sentence usually called a satellite and highlights it in yellow. This is usually the topic sentence. Then the program finds the third most important sentence and highlights it in red. The implementations of this technology are useful in a society of information overload when a person typically receives 42 emails a day (Microsoft). The paper also is a candid look at difficulty that machine learning has in textural translating. However, it speaks on how to overcome the obstacles that historically prevented progress. This paper proposes mathematical meta-data criteria that justify the place of importance of a sentence. Just as tools for the study of relational symmetry in bio-informatics, this tool seeks to classify words with greater clarity. "Survey Finds Workers Average Only Three Productive Days per Week." Microsoft News Center. Microsoft. Web. 31 Mar. 2012.
Inferring the Underlying Structure of Information Cascades
Zong, Bo, Wu, Yinghui, Singh, Ambuj K., Yan, Xifeng
In social networks, information and influence diffuse among users as cascades. While the importance of studying cascades has been recognized in various applications, it is difficult to observe the complete structure of cascades in practice. Moreover, much less is known on how to infer cascades based on partial observations. In this paper we study the cascade inference problem following the independent cascade model, and provide a full treatment from complexity to algorithms: (a) We propose the idea of consistent trees as the inferred structures for cascades; these trees connect source nodes and observed nodes with paths satisfying the constraints from the observed temporal information. (b) We introduce metrics to measure the likelihood of consistent trees as inferred cascades, as well as several optimization problems for finding them. (c) We show that the decision problems for consistent trees are in general NP-complete, and that the optimization problems are hard to approximate. (d) We provide approximation algorithms with performance guarantees on the quality of the inferred cascades, as well as heuristics. We experimentally verify the efficiency and effectiveness of our inference algorithms, using real and synthetic data.
Distributed Problem Solving
Yeoh, William (Singapore Management University) | Yokoo, Makoto (Kyushu University)
Distributed problem solving is a subfield within multiagent systems, where agents are assumed to be part of a team and collaborate with each other to reach a common goal. In this article, we illustrate the motivations for distributed problem solving and provide an overview of two distributed problem solving models, namely distributed constraint satisfaction problems (DCSPs) and distributed constraint optimization problems (DCOPs), and some of their algorithms.