Asia
LB2CO: A Semantic Ontology Framework for B2C eCommerce Transaction on the Internet
Business ontology can enhance the successful development of complex enterprise system; this is being achieved through knowledge sharing and the ease of communication between every entity in the domain. Through human semantic interaction with the web resources, machines to interpret the data published in a machine interpretable form under web. However, the theoretical practice of business ontology in eCommerce domain is quite a few especially in the section of electronic transaction, and the various techniques used to obtain efficient communication across spheres are error prone and are not always guaranteed to be efficient in obtaining desired result due to poor semantic integration between entities. To overcome the poor semantic integration this research focuses on proposed ontology called LB2CO, which combines the framework of IDEF5 & SNAP as an analysis tool, for automated recommendation of product and services and create effective ontological framework for B2C transaction & communication across different business domains that facilitates the interoperability & integration of B2C transactions over the web.
Geometric lattice structure of covering and its application to attribute reduction through matroids
The reduction of covering decision systems is an important problem in data mining, and covering-based rough sets serve as an efficient technique to process the problem. Geometric lattices have been widely used in many fields, especially greedy algorithm design which plays an important role in the reduction problems. Therefore, it is meaningful to combine coverings with geometric lattices to solve the optimization problems. In this paper, we obtain geometric lattices from coverings through matroids and then apply them to the issue of attribute reduction. First, a geometric lattice structure of a covering is constructed through transversal matroids. Then its atoms are studied and used to describe the lattice. Second, considering that all the closed sets of a finite matroid form a geometric lattice, we propose a dependence space through matroids and study the attribute reduction issues of the space, which realizes the application of geometric lattices to attribute reduction. Furthermore, a special type of information system is taken as an example to illustrate the application. In a word, this work points out an interesting view, namely, geometric lattice to study the attribute reduction issues of information systems.
Generalization Bounds for Representative Domain Adaptation
Zhang, Chao, Zhang, Lei, Fan, Wei, Ye, Jieping
In this paper, we propose a novel framework to analyze the theoretical properties of the learning process for a representative type of domain adaptation, which combines data from multiple sources and one target (or briefly called representative domain adaptation). In particular, we use the integral probability metric to measure the difference between the distributions of two domains and meanwhile compare it with the H-divergence and the discrepancy distance. We develop the Hoeffding-type, the Bennett-type and the McDiarmid-type deviation inequalities for multiple domains respectively, and then present the symmetrization inequality for representative domain adaptation. Next, we use the derived inequalities to obtain the Hoeffding-type and the Bennett-type generalization bounds respectively, both of which are based on the uniform entropy number. Moreover, we present the generalization bounds based on the Rademacher complexity. Finally, we analyze the asymptotic convergence and the rate of convergence of the learning process for representative domain adaptation. We discuss the factors that affect the asymptotic behavior of the learning process and the numerical experiments support our theoretical findings as well. Meanwhile, we give a comparison with the existing results of domain adaptation and the classical results under the same-distribution assumption.
Direct Learning of Sparse Changes in Markov Networks by Density Ratio Estimation
Liu, Song, Quinn, John A., Gutmann, Michael U., Suzuki, Taiji, Sugiyama, Masashi
We propose a new method for detecting changes in Markov network structure between two sets of samples. Instead of naively fitting two Markov network models separately to the two data sets and figuring out their difference, we \emph{directly} learn the network structure change by estimating the ratio of Markov network models. This density-ratio formulation naturally allows us to introduce sparsity in the network structure change, which highly contributes to enhancing interpretability. Furthermore, computation of the normalization term, which is a critical bottleneck of the naive approach, can be remarkably mitigated. We also give the dual formulation of the optimization problem, which further reduces the computation cost for large-scale Markov networks. Through experiments, we demonstrate the usefulness of our method.
A Graphical Transformation for Belief Propagation: Maximum Weight Matchings and Odd-Sized Cycles
Shin, Jinwoo, Gelfand, Andrew E., Chertkov, Misha
Max-product ‘belief propagation’ (BP) is a popular distributed heuristic for finding the Maximum A Posteriori (MAP) assignment in a joint probability distribution represented by a Graphical Model (GM). It was recently shown that BP converges to the correct MAP assignment for a class of loopy GMs with the following common feature: the Linear Programming (LP) relaxation to the MAP problem is tight (has no integrality gap). Unfortunately, tightness of the LP relaxation does not, in general, guarantee convergence and correctness of the BP algorithm. The failure of BP in such cases motivates reverse engineering a solution – namely, given a tight LP, can we design a ‘good’ BP algorithm. In this paper, we design a BP algorithm for the Maximum Weight Matching (MWM) problem over general graphs. We prove that the algorithm converges to the correct optimum if the respective LP relaxation, which may include inequalities associated with non-intersecting odd-sized cycles, is tight. The most significant part of our approach is the introduction of a novel graph transformation designed to force convergence of BP. Our theoretical result suggests an efficient BP-based heuristic for the MWM problem, which consists of making sequential, “cutting plane”, modifications to the underlying GM. Our experiments show that this heuristic performs as well as traditional cutting-plane algorithms using LP solvers on MWM problems.
Nonparametric Multi-group Membership Model for Dynamic Networks
Kim, Myunghwan, Leskovec, Jure
Statistical analysis of social networks and other relational data is becoming an increasingly important problem as the scope and availability of network data increases. Network data--such as the friendships in a social network--is often dynamic in a sense that relations between entities rise and decay over time. A fundamental problem in the analysis of such dynamic network data is to extract a summary of the common structure and the dynamics of the underlying relations between entities. Accurate models of structure and dynamics of network data have many applications. They allow us to predict missing relationships [20, 21, 23], recommend potential new relations [2], identify clusters and groups of nodes [1, 29], forecast future links [4, 9, 11, 24], and even predict group growth and longevity [15]. Here we present a new approach to modeling network dynamics by considering time-evolving interactions between groups of nodes as well as the arrival and departure dynamics of individual nodes to these groups. We develop a dynamic network model, Dynamic Multi-group Membership Graph Model, that identifies the birth and death of individual groups as well as the dynamics of node joining and leaving groups in order to explain changes in the underlying network linking structure. Our nonparametric model considers an infinite number of latent groups, where each node can belong to multiple groups simultaneously. We capture the evolution of individual node group memberships via a Factorial Hidden Markov model.
Low-rank matrix reconstruction and clustering via approximate message passing
Matsushita, Ryosuke, Tanaka, Toshiyuki
We study the problem of reconstructing low-rank matrices from their noisy observations. We formulate the problem in the Bayesian framework, which allows us to exploit structural properties of matrices in addition to low-rankedness, such as sparsity. We propose an efficient approximate message passing algorithm, derived from the belief propagation algorithm, to perform the Bayesian inference for matrix reconstruction. We have also successfully applied the proposed algorithm to a clustering problem, by formulating the problem of clustering as a low-rank matrix reconstruction problem with an additional structural property. Numerical experiments show that the proposed algorithm outperforms Lloyd's K-means algorithm.
Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture
Campbell, Trevor, Liu, Miao, Kulis, Brian, How, Jonathan P., Carin, Lawrence
This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a low-variance asymptotic analysis of the Gibbs sampling algorithm for the DDPMM, and provides a hard clustering with convergence guarantees similar to those of the k-means algorithm. Empirical results from a synthetic test with moving Gaussian clusters and a test with real ADS-B aircraft trajectory data demonstrate that the algorithm requires orders of magnitude less computational time than contemporary probabilistic and hard clustering algorithms, while providing higher accuracy on the examined datasets.
k-Prototype Learning for 3D Rigid Structures
Ding, Hu, Berezney, Ronald, Xu, Jinhui
In this paper, we study the following new variant of prototype learning, called {\em $k$-prototype learning problem for 3D rigid structures}: Given a set of 3D rigid structures, find a set of $k$ rigid structures so that each of them is a prototype for a cluster of the given rigid structures and the total cost (or dissimilarity) is minimized. Prototype learning is a core problem in machine learning and has a wide range of applications in many areas. Existing results on this problem have mainly focused on the graph domain. In this paper, we present the first algorithm for learning multiple prototypes from 3D rigid structures. Our result is based on a number of new insights to rigid structures alignment, clustering, and prototype reconstruction, and is practically efficient with quality guarantee. We validate our approach using two type of data sets, random data and biological data of chromosome territories. Experiments suggest that our approach can effectively learn prototypes in both types of data.
Phase Retrieval using Alternating Minimization
Netrapalli, Praneeth, Jain, Prateek, Sanghavi, Sujay
Phase retrieval problems involve solving linear equations, but with missing sign (or phase, for complex numbers) information. Over the last two decades, a popular generic empirical approach to the many variants of this problem has been one of alternating minimization; i.e. alternating between estimating the missing phase information, and the candidate solution. In this paper, we show that a simple alternating minimization algorithm geometrically converges to the solution of one such problem -- finding a vector $x$ from $y,A$, where $y = |A'x|$ and $|z|$ denotes a vector of element-wise magnitudes of $z$ -- under the assumption that $A$ is Gaussian. Empirically, our algorithm performs similar to recently proposed convex techniques for this variant (which are based on lifting" to a convex matrix problem) in sample complexity and robustness to noise. However, our algorithm is much more efficient and can scale to large problems. Analytically, we show geometric convergence to the solution, and sample complexity that is off by log factors from obvious lower bounds. We also establish close to optimal scaling for the case when the unknown vector is sparse. Our work represents the only known proof of alternating minimization for any variant of phase retrieval problems in the non-convex setting."