Learning a Common Substructure of Multiple Graphical Gaussian Models
Hara, Satoshi, Washio, Takashi
Properties of data are frequently seen to vary depending on the sampled situations, which usually change over time or owing to environmental effects. One way to analyze such data is to find invariances, or representative features kept constant over changes. The aim of this paper is to identify one such feature, namely interactions or dependencies among variables that are common across multiple datasets collected under different conditions. To that end, we propose a common substructure learning (CSSL) framework based on a graphical Gaussian model. We further present a simple learning algorithm based on the Dual Augmented Lagrangian and the Alternating Direction Method of Multipliers. We confirm the performance of CSSL over other existing techniques in finding unchanging dependency structures in multiple datasets through numerical simulations on synthetic data and through a real world application to anomaly detection in automobile sensors.
Condition for neighborhoods induced by a covering to be equal to the covering itself
Under what condition the neighborhoods induced by a covering are equal to the covering itself is a meaningful question. A necessary and sufficient condition for this issue has been provided by some scholars. In this paper, we first point out, through a counterexample, that this necessary and sufficient condition is false. Second, we present a necessary and sufficient condition for this issue. Third, we concentrate on the inverse issue of computing neighborhoods from a covering: given an arbitrary covering, does there exist another covering whose induced neighborhoods are exactly the given covering? We present a necessary and sufficient condition for this issue as well. In short, through the study of these two fundamental issues, we gain a deeper understanding of the relationship between neighborhoods and the covering that induces them.
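To make the question concrete, the following minimal Python sketch assumes the usual definition from covering-based rough sets, in which the neighborhood of x is the intersection of all blocks of the covering that contain x. The abstract does not state this definition, and the names `neighborhoods` and `induces_itself` are illustrative only; the sketch checks the equality by brute force and does not implement the paper's condition.

```python
def neighborhoods(universe, covering):
    """Map each x to the intersection of all covering blocks containing x.

    Assumes the standard definition N(x) = intersection of {K in C : x in K}.
    """
    result = {}
    for x in universe:
        blocks = [frozenset(k) for k in covering if x in k]
        if not blocks:
            raise ValueError(f"{x!r} is not covered by any block")
        result[x] = frozenset.intersection(*blocks)
    return result


def induces_itself(universe, covering):
    """Return True if the family of neighborhoods equals the covering (as sets of sets)."""
    family = set(neighborhoods(universe, covering).values())
    return family == {frozenset(k) for k in covering}


U = {1, 2, 3}
C1 = [{1, 2}, {2, 3}]   # N(2) = {2} is not a block, so the families differ
C2 = [{1}, {2, 3}]      # a partition: every neighborhood is a block

print(induces_itself(U, C1))
print(induces_itself(U, C2))
```

In this toy example the first covering fails because the neighborhood of 2 is the singleton {2}, which is not a block of the covering, while the second covering reproduces itself.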
Rough sets and matroidal contraction
Rough sets are efficient for data pre-processing in data mining. As a generalization of the linear independence in vector spaces, matroids provide well-established platforms for greedy algorithms. In this paper, we apply rough sets to matroids and study the contraction of the dual of the corresponding matroid. First, for an equivalence relation on a universe, a matroidal structure of the rough set is established through the lower approximation operator. Second, the dual of the matroid and its properties such as independent sets, bases and rank function are investigated. Finally, the relationships between the contraction of the dual matroid to the complement of a single point set and the contraction of the dual matroid to the complement of the equivalence class of this point are studied.
Condition for neighborhoods in covering based rough sets to form a partition
The neighborhood is an important concept in covering based rough sets. Under what condition neighborhoods form a partition is a meaningful issue raised by this concept. Many scholars have paid attention to this issue and presented some necessary and sufficient conditions. However, these conditions share one common trait: they presuppose that all neighborhoods have already been obtained. In this paper, we provide a necessary and sufficient condition based directly on the covering itself. First, we investigate how the presence of reducible elements in the covering influences neighborhoods. Second, we propose the definition of a uniform block and obtain a sufficient condition from it. Third, we propose the definitions of repeat degree and excluded number. By means of these two concepts, we obtain a necessary and sufficient condition for neighborhoods to form a partition. In short, we gain a deeper and more direct understanding of what it means for neighborhoods to form a partition.
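For contrast with the covering-based condition, the sketch below illustrates the kind of check that first computes all neighborhoods and then tests whether they form a partition. It assumes the usual definition of the neighborhood of x as the intersection of all blocks containing x, and that the covering consists of subsets of the universe; the function names are hypothetical and nothing here implements the repeat degree or excluded number.

```python
def neighborhood(x, covering):
    """Intersection of all blocks of the covering that contain x (assumed definition)."""
    blocks = [frozenset(k) for k in covering if x in k]
    if not blocks:
        raise ValueError(f"{x!r} is not covered by any block")
    return frozenset.intersection(*blocks)


def neighborhoods_form_partition(universe, covering):
    """Check whether the distinct neighborhoods partition the universe.

    Every x lies in its own neighborhood, so the neighborhoods always cover
    the universe; they are pairwise disjoint exactly when their sizes add up
    to the size of the universe.
    """
    distinct = {neighborhood(x, covering) for x in universe}
    return sum(len(n) for n in distinct) == len(universe)


U = {1, 2, 3, 4}
chain = [{1, 2}, {2, 3}, {3, 4}]          # neighborhoods {1,2}, {2}, {3}, {3,4} overlap
nested = [{1, 2}, {1, 2, 3, 4}, {3, 4}]   # neighborhoods are {1,2} and {3,4}

print(neighborhoods_form_partition(U, chain))
print(neighborhoods_form_partition(U, nested))
```

The second covering is not itself a partition, yet its neighborhoods form one, which is why a condition stated directly on the covering is of interest.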
Some characteristics of matroids through rough sets
At present, the practical application and the theoretical discussion of rough sets are two active topics in computer science. The core concepts of rough set theory are the upper and lower approximation operators based on equivalence relations. A matroid is a mathematical structure that generalizes linear independence in vector spaces. Further, matroid theory borrows extensively from the terminology of linear algebra and graph theory. We can combine rough set theory with matroid theory by using rough sets to study some characteristics of matroids. In this paper, we apply rough sets to matroids by defining a family of sets constructed from the upper approximation operator with respect to an equivalence relation. First, we prove that the family of sets satisfies the support set axioms of matroids, and thus we obtain a matroid. We call it the matroid induced by the equivalence relation; in this way a type of matroid, namely the support matroid, is induced. Second, through rough sets, some characteristics of matroids such as independent sets, support sets, bases, hyperplanes and closed sets are investigated.
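As a small illustration of the approximation operators mentioned above, the following Python sketch computes the lower and upper approximations of a set with respect to an equivalence relation. It uses the standard textbook definitions (the lower approximation is the union of the equivalence classes contained in the set, the upper approximation is the union of the classes that meet it); the helper names and the `key` function describing the relation are assumptions made for the example, and the sketch does not construct the support matroid of the paper.

```python
def equivalence_classes(universe, key):
    """Group elements of the universe by key(x); equal keys mean equivalent elements."""
    classes = {}
    for x in universe:
        classes.setdefault(key(x), set()).add(x)
    return [frozenset(c) for c in classes.values()]


def lower_approximation(classes, target):
    """Union of the equivalence classes entirely contained in target."""
    return frozenset().union(*(c for c in classes if c <= target))


def upper_approximation(classes, target):
    """Union of the equivalence classes that intersect target."""
    return frozenset().union(*(c for c in classes if c & target))


U = range(1, 7)
# Hypothetical relation: 1~2, 3~4, 5~6
classes = equivalence_classes(U, key=lambda x: (x - 1) // 2)
X = {1, 2, 3}

print(sorted(lower_approximation(classes, X)))  # [1, 2]
print(sorted(upper_approximation(classes, X)))  # [1, 2, 3, 4]
```

Here the class {3, 4} meets X without being contained in it, so it contributes to the upper approximation only.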
Mining Social Data to Extract Intellectual Knowledge
Social data mining brings together different sources of social data to extract information. This information can be used in relationship prediction, decision making, pattern recognition, social mapping, responsibility distribution and many other applications. This paper presents a systematic data mining architecture to mine intellectual knowledge from social data. In this research, we use the social networking site Facebook as the primary data source. We collect different attributes such as about me, comments, wall posts and age from Facebook as raw data and use advanced data mining approaches to extract intellectual knowledge. We also analyze and compare the mined knowledge with respect to possible uses such as human behavior prediction, pattern recognition, job responsibility distribution, decision making and product promotion.
On Move Pattern Trends in a Large Go Games Corpus
Baudiš, Petr, Moudřík, Josef
We process a large corpus of game records of the board game of Go and propose a way of extracting summary information on played moves. We then apply several basic data-mining methods to the summary information to identify its most differentiating features, and discuss their correspondence with traditional Go knowledge. We show statistically significant mappings of the features to player attributes such as playing strength or informally perceived "playing style" (e.g. territoriality or aggressivity), describe accurate classifiers for these attributes, and propose applications including seeding real-world ranks of internet players, aiding in Go study and tuning of Go-playing programs, or contribution to Go-theoretical discussion on the scope of "playing style".
A Bayesian Nonparametric Approach to Image Super-resolution
Polatkan, Gungor, Zhou, Mingyuan, Carin, Lawrence, Blei, David, Daubechies, Ingrid
Super-resolution methods form high-resolution images from low-resolution images. In this paper, we develop a new Bayesian nonparametric model for super-resolution. Our method uses a beta-Bernoulli process to learn a set of recurring visual patterns, called dictionary elements, from the data. Because it is nonparametric, the number of elements found is also determined from the data. We test the results on both benchmark and natural images, comparing with several other models from the research literature. We perform large-scale human evaluation experiments to assess the visual quality of the results. In a first implementation, we use Gibbs sampling to approximate the posterior. However, this algorithm is not feasible for large-scale data. To circumvent this, we then develop an online variational Bayes (VB) algorithm. This algorithm finds high quality dictionaries in a fraction of the time needed by the Gibbs sampler.
Nominal Association Vector and Matrix
Huang, Wenxue, Shi, Yong, Wang, Xiaogang
Nominal data are quite common in scientific and engineering research related to biomedical research, consumer behavior analysis, network analysis and search engine marketing optimization. When the population is cross-classified and there is no natural ordering for observed outcomes, association analysis as described in Han and Kamber (2006) can be described by nominal association measures. Even if the categorical variables collected in these studies are ordinal, they are often treated as nominal if the ordering is not of interest or a natural and meaningful metric is difficult to establish. When the response variable is multinomial, classical probabilistic measures such as the odds ratio or relative risk are difficult to use due to the multiple levels in the response variable. Instead, the principle of optimal (conditional mode based) or proportional (conditional Monte-Carlo based) prediction can be used to construct nonparametric nominal association measures. For example, Goodman-Kruskal (1954) and others proposed some local-to-global association measures towards optimal predictions. The proportional associations between variables are probabilistically and statistically intrinsic. They reflect the probabilistic averaging effects of input on output distributions. There are quite a few proportional association measures proposed in the literature (cf.
Parametric matroid of rough set
Rough set theory is mainly concerned with the approximation of objects through an equivalence relation on a universe. A matroid is a combinatorial generalization of linear independence in vector spaces. In this paper, we define a parametric set family, with any subset of a universe as its parameter, to connect rough sets and matroids. On the one hand, for a universe and an equivalence relation on the universe, a parametric set family is defined through the lower approximation operator. This parametric set family is proved to satisfy the independent set axioms of matroids, and therefore it generates a matroid, called a parametric matroid of the rough set. Three equivalent representations of the parametric set family are obtained. Moreover, the parametric matroid of the rough set is proved to be the direct sum of a partition-circuit matroid and a free matroid. On the other hand, since partition-circuit matroids have been well studied through the lower approximation number, we use this number to investigate the parametric matroid of the rough set. Several characteristics of the parametric matroid of the rough set, such as independent sets, bases, circuits, the rank function and the closure operator, are expressed by the lower approximation number.