Country
Semantic Vector Machines
We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into an $d$-dimensional space, such that n-grams that are the translation of each other are close with respect to some metric. Good n-grams to n-grams translation results were achieved, but full sentences translation is still problematic. We realized that learning semantics of sentences and documents was the key for solving a lot of natural language processing problems, and thus moved to the second part of our work: sentence compression. We introduce a flexible neural network architecture for learning embeddings of words and sentences that extract their semantics, propose an efficient implementation in the Torch framework and present embedding results comparable to the ones obtained with classical neural language models, while being more powerful.
SAPFOCS: a metaheuristic based approach to part family formation problems in group technology
Ghosh, Tamal, Modak, Mousumi, Dan, Pranab K
This article deals with Part family formation problem which is believed to be moderately complicated to be solved in polynomial time in the vicinity of Group Technology (GT). In the past literature researchers investigated that the part family formation techniques are principally based on production flow analysis (PFA) which usually considers operational requirements, sequences and time. Part Coding Analysis (PCA) is merely considered in GT which is believed to be the proficient method to identify the part families. PCA classifies parts by allotting them to different families based on their resemblances in: (1) design characteristics such as shape and size, and/or (2) manufacturing characteristics (machining requirements). A novel approach based on simulated annealing namely SAPFOCS is adopted in this study to develop effective part families exploiting the PCA technique. Thereafter Taguchi's orthogonal design method is employed to solve the critical issues on the subject of parameters selection for the proposed metaheuristic algorithm. The adopted technique is therefore tested on 5 different datasets of size 5 {\times} 9 to 27 {\times} 9 and the obtained results are compared with C-Linkage clustering technique. The experimental results reported that the proposed metaheuristic algorithm is extremely effective in terms of the quality of the solution obtained and has outperformed C-Linkage algorithm in most instances.
Feedback Message Passing for Inference in Gaussian Graphical Models
Liu, Ying, Chandrasekaran, Venkat, Anandkumar, Animashree, Willsky, Alan S.
While loopy belief propagation (LBP) performs reasonably well for inference in some Gaussian graphical models with cycles, its performance is unsatisfactory for many others. In particular for some models LBP does not converge, and in general when it does converge, the computed variances are incorrect (except for cycle-free graphs for which belief propagation (BP) is non-iterative and exact). In this paper we propose {\em feedback message passing} (FMP), a message-passing algorithm that makes use of a special set of vertices (called a {\em feedback vertex set} or {\em FVS}) whose removal results in a cycle-free graph. In FMP, standard BP is employed several times on the cycle-free subgraph excluding the FVS while a special message-passing scheme is used for the nodes in the FVS. The computational complexity of exact inference is $O(k^2n)$, where $k$ is the number of feedback nodes, and $n$ is the total number of nodes. When the size of the FVS is very large, FMP is intractable. Hence we propose {\em approximate FMP}, where a pseudo-FVS is used instead of an FVS, and where inference in the non-cycle-free graph obtained by removing the pseudo-FVS is carried out approximately using LBP. We show that, when approximate FMP converges, it yields exact means and variances on the pseudo-FVS and exact means throughout the remainder of the graph. We also provide theoretical results on the convergence and accuracy of approximate FMP. In particular, we prove error bounds on variance computation. Based on these theoretical results, we design efficient algorithms to select a pseudo-FVS of bounded size. The choice of the pseudo-FVS allows us to explicitly trade off between efficiency and accuracy. Experimental results show that using a pseudo-FVS of size no larger than $\log(n)$, this procedure converges much more often, more quickly, and provides more accurate results than LBP on the entire graph.
On approximation of smoothing probabilities for hidden Markov models
We consider the smoothing probabilities of hidden Markov model (HMM). We show that under fairly general conditions for HMM, the exponential forgetting still holds, and the smoothing probabilities can be well approximated with the ones of double sided HMM. This makes it possible to use ergodic theorems. As an applications we consider the pointwise maximum a posteriori segmentation, and show that the corresponding risks converge.
Order-preserving factor analysis (OPFA)
Puig, Arnau Tibau, Hero, Alfred O. III
We present a novel factor analysis method that can be applied to the discovery of common factors shared among trajectories in multivariate time series data. These factors satisfy a precedence-ordering property: certain factors are recruited only after some other factors are activated. Precedence-ordering arise in applications where variables are activated in a specific order, which is unknown. The proposed method is based on a linear model that accounts for each factor's inherent delays and relative order. We present an algorithm to fit the model in an unsupervised manner using techniques from convex and non-convex optimization that enforce sparsity of the factor scores and consistent precedence-order of the factor loadings. We illustrate the Order-Preserving Factor Analysis (OPFA) method for the problem of extracting precedence-ordered factors from a longitudinal (time course) study of gene expression data.
Principal Graphs and Manifolds
Gorban, A. N., Zinovyev, A. Y.
In many physical, statistical, biological and other investigations it is desirable to approximate a system of points by objects of lower dimension and/or complexity. For this purpose, Karl Pearson invented principal component analysis in 1901 and found 'lines and planes of closest fit to system of points'. The famous k-means algorithm solves the approximation problem too, but by finite sets instead of lines and planes. This chapter gives a brief practical introduction into the methods of construction of general principal objects, i.e. objects embedded in the 'middle' of the multidimensional data set. As a basis, the unifying framework of mean squared distance approximation of finite datasets is selected. Principal graphs and manifolds are constructed as generalisations of principal components and k-means principal points. For this purpose, the family of expectation/maximisation algorithms with nearest generalisations is presented. Construction of principal graphs with controlled complexity is based on the graph grammar approach.
Evaluating the diagnostic powers of variables and their linear combinations when the gold standard is continuous
Wang, Zhanfeng, Chang, Yuan-chin Ivan
The receiver operating characteristic (ROC) curve is a very useful tool for analyzing the diagnostic/classification power of instruments/classification schemes as long as a binary-scale gold standard is available. When the gold standard is continuous and there is no confirmative threshold, ROC curve becomes less useful. Hence, there are several extensions proposed for evaluating the diagnostic potential of variables of interest. However, due to the computational difficulties of these nonparametric based extensions, they are not easy to be used for finding the optimal combination of variables to improve the individual diagnostic power. Therefore, we propose a new measure, which extends the AUC index for identifying variables with good potential to be used in a diagnostic scheme. In addition, we propose a threshold gradient descent based algorithm for finding the best linear combination of variables that maximizes this new measure, which is applicable even when the number of variables is huge. The estimate of the proposed index and its asymptotic property are studied. The performance of the proposed method is illustrated using both synthesized and real data sets.
SpicyMKL
We propose a new optimization algorithm for Multiple Kernel Learning (MKL) called SpicyMKL, which is applicable to general convex loss functions and general types of regularization. The proposed SpicyMKL iteratively solves smooth minimization problems. Thus, there is no need of solving SVM, LP, or QP internally. SpicyMKL can be viewed as a proximal minimization method and converges super-linearly. The cost of inner minimization is roughly proportional to the number of active kernels. Therefore, when we aim for a sparse kernel combination, our algorithm scales well against increasing number of kernels. Moreover, we give a general block-norm formulation of MKL that includes non-sparse regularizations, such as elastic-net and \ellp -norm regularizations. Extending SpicyMKL, we propose an efficient optimization method for the general regularization framework. Experimental results show that our algorithm is faster than existing methods especially when the number of kernels is large (> 1000).
Solving Rubik's Cube Using SAT Solvers
Rubik's Cube is an easily-understood puzzle, which is originally called the "magic cube". It is a well-known planning problem, which has been studied for a long time. Yet many simple properties remain unknown. This paper studies whether modern SAT solvers are applicable to this puzzle. To our best knowledge, we are the first to translate Rubik's Cube to a SAT problem. To reduce the number of variables and clauses needed for the encoding, we replace a naive approach of 6 Boolean variables to represent each color on each facelet with a new approach of 3 or 2 Boolean variables. In order to be able to solve quickly Rubik's Cube, we replace the direct encoding of 18 turns with the layer encoding of 18-subtype turns based on 6-type turns. To speed up the solving further, we encode some properties of two-phase algorithm as an additional constraint, and restrict some move sequences by adding some constraint clauses. Using only efficient encoding cannot solve this puzzle. For this reason, we improve the existing SAT solvers, and develop a new SAT solver based on PrecoSAT, though it is suited only for Rubik's Cube. The new SAT solver replaces the lookahead solving strategy with an ALO (\emph{at-least-one}) solving strategy, and decomposes the original problem into sub-problems. Each sub-problem is solved by PrecoSAT. The empirical results demonstrate both our SAT translation and new solving technique are efficient. Without the efficient SAT encoding and the new solving technique, Rubik's Cube will not be able to be solved still by any SAT solver. Using the improved SAT solver, we can find always a solution of length 20 in a reasonable time. Although our solver is slower than Kociemba's algorithm using lookup tables, but does not require a huge lookup table.
Metamodel-based importance sampling for structural reliability analysis
Dubourg, V., Deheeger, F., Sudret, B.
Structural reliability methods aim at computing the probability of failure of systems with respect to some prescribed performance functions. In modern engineering such functions usually resort to running an expensive-to-evaluate computational model (e.g. a finite element model). In this respect simulation methods, which may require $10^{3-6}$ runs cannot be used directly. Surrogate models such as quadratic response surfaces, polynomial chaos expansions or kriging (which are built from a limited number of runs of the original model) are then introduced as a substitute of the original model to cope with the computational cost. In practice it is almost impossible to quantify the error made by this substitution though. In this paper we propose to use a kriging surrogate of the performance function as a means to build a quasi-optimal importance sampling density. The probability of failure is eventually obtained as the product of an augmented probability computed by substituting the meta-model for the original performance function and a correction term which ensures that there is no bias in the estimation even if the meta-model is not fully accurate. The approach is applied to analytical and finite element reliability problems and proves efficient up to 100 random variables.