Goto

Collaborating Authors

 Country


Principal Graphs and Manifolds

arXiv.org Machine Learning

In many physical, statistical, biological and other investigations it is desirable to approximate a system of points by objects of lower dimension and/or complexity. For this purpose, Karl Pearson invented principal component analysis in 1901 and found 'lines and planes of closest fit to system of points'. The famous k-means algorithm solves the approximation problem too, but by finite sets instead of lines and planes. This chapter gives a brief practical introduction into the methods of construction of general principal objects, i.e. objects embedded in the 'middle' of the multidimensional data set. As a basis, the unifying framework of mean squared distance approximation of finite datasets is selected. Principal graphs and manifolds are constructed as generalisations of principal components and k-means principal points. For this purpose, the family of expectation/maximisation algorithms with nearest generalisations is presented. Construction of principal graphs with controlled complexity is based on the graph grammar approach.


SpicyMKL

arXiv.org Machine Learning

We propose a new optimization algorithm for Multiple Kernel Learning (MKL) called SpicyMKL, which is applicable to general convex loss functions and general types of regularization. The proposed SpicyMKL iteratively solves smooth minimization problems. Thus, there is no need of solving SVM, LP, or QP internally. SpicyMKL can be viewed as a proximal minimization method and converges super-linearly. The cost of inner minimization is roughly proportional to the number of active kernels. Therefore, when we aim for a sparse kernel combination, our algorithm scales well against increasing number of kernels. Moreover, we give a general block-norm formulation of MKL that includes non-sparse regularizations, such as elastic-net and \ellp -norm regularizations. Extending SpicyMKL, we propose an efficient optimization method for the general regularization framework. Experimental results show that our algorithm is faster than existing methods especially when the number of kernels is large (> 1000).


Evaluating the diagnostic powers of variables and their linear combinations when the gold standard is continuous

arXiv.org Machine Learning

The receiver operating characteristic (ROC) curve is a very useful tool for analyzing the diagnostic/classification power of instruments/classification schemes as long as a binary-scale gold standard is available. When the gold standard is continuous and there is no confirmative threshold, ROC curve becomes less useful. Hence, there are several extensions proposed for evaluating the diagnostic potential of variables of interest. However, due to the computational difficulties of these nonparametric based extensions, they are not easy to be used for finding the optimal combination of variables to improve the individual diagnostic power. Therefore, we propose a new measure, which extends the AUC index for identifying variables with good potential to be used in a diagnostic scheme. In addition, we propose a threshold gradient descent based algorithm for finding the best linear combination of variables that maximizes this new measure, which is applicable even when the number of variables is huge. The estimate of the proposed index and its asymptotic property are studied. The performance of the proposed method is illustrated using both synthesized and real data sets.


Metamodel-based importance sampling for structural reliability analysis

arXiv.org Machine Learning

Structural reliability methods aim at computing the probability of failure of systems with respect to some prescribed performance functions. In modern engineering such functions usually resort to running an expensive-to-evaluate computational model (e.g. a finite element model). In this respect simulation methods, which may require $10^{3-6}$ runs cannot be used directly. Surrogate models such as quadratic response surfaces, polynomial chaos expansions or kriging (which are built from a limited number of runs of the original model) are then introduced as a substitute of the original model to cope with the computational cost. In practice it is almost impossible to quantify the error made by this substitution though. In this paper we propose to use a kriging surrogate of the performance function as a means to build a quasi-optimal importance sampling density. The probability of failure is eventually obtained as the product of an augmented probability computed by substituting the meta-model for the original performance function and a correction term which ensures that there is no bias in the estimation even if the meta-model is not fully accurate. The approach is applied to analytical and finite element reliability problems and proves efficient up to 100 random variables.


Self-organized adaptation of a simple neural circuit enables complex robot behaviour

arXiv.org Artificial Intelligence

Controlling sensori-motor systems in higher animals or complex robots is a challenging combinatorial problem, because many sensory signals need to be simultaneously coordinated into a broad behavioural spectrum. To rapidly interact with the environment, this control needs to be fast and adaptive. Current robotic solutions operate with limited autonomy and are mostly restricted to few behavioural patterns. Here we introduce chaos control as a new strategy to generate complex behaviour of an autonomous robot. In the presented system, 18 sensors drive 18 motors via a simple neural control circuit, thereby generating 11 basic behavioural patterns (e.g., orienting, taxis, self-protection, various gaits) and their combinations. The control signal quickly and reversibly adapts to new situations and additionally enables learning and synaptic long-term storage of behaviourally useful motor responses. Thus, such neural control provides a powerful yet simple way to self-organize versatile behaviours in autonomous agents with many degrees of freedom.


Machine-Part cell formation through visual decipherable clustering of Self Organizing Map

arXiv.org Artificial Intelligence

Machine-part cell formation is used in cellular manufacturing in order to process a large variety, quality, lower work in process levels, reducing manufacturing lead-time and customer response time while retaining flexibility for new products. This paper presents a new and novel approach for obtaining machine cells and part families. In the cellular manufacturing the fundamental problem is the formation of part families and machine cells. The present paper deals with the Self Organising Map (SOM) method an unsupervised learning algorithm in Artificial Intelligence, and has been used as a visually decipherable clustering tool of machine-part cell formation. The objective of the paper is to cluster the binary machine-part matrix through visually decipherable cluster of SOM color-coding and labelling via the SOM map nodes in such a way that the part families are processed in that machine cells. The Umatrix, component plane, principal component projection, scatter plot and histogram of SOM have been reported in the present work for the successful visualization of the machine-part cell formation. Computational result with the proposed algorithm on a set of group technology problems available in the literature is also presented. The proposed SOM approach produced solutions with a grouping efficacy that is at least as good as any results earlier reported in the literature and improved the grouping efficacy for 70% of the problems and found immensely useful to both industry practitioners and researchers.


Multi-task GLOH feature selection for human age estimation

arXiv.org Artificial Intelligence

In this paper, we propose a novel age estimation method based on GLOH feature descriptor and multi-task learning (MTL). The GLOH feature descriptor, one of the state-of-the-art feature descriptor, is used to capture the age-related local and spatial information of face image. As the exacted GLOH features are often redundant, MTL is designed to select the most informative feature bins for age estimation problem, while the corresponding weights are determined by ridge regression. This approach largely reduces the dimensions of feature, which can not only improve performance but also decrease the computational burden. Experiments on the public available FG-NET database show that the proposed method can achieve comparable performance over previous approaches while using much fewer features.


Contextual hypotheses and semantics of logic programs

arXiv.org Artificial Intelligence

Logic programming has developed as a rich field, built over a logical substratum whose main constituent is a nonclassical form of negation, sometimes coexisting with classical negation. The field has seen the advent of a number of alternative semantics, with Kripke-Kleene semantics, the well-founded semantics, the stable model semantics, and the answer-set semantics standing out as the most successful. We show that all aforementioned semantics are particular cases of a generic semantics, in a framework where classical negation is the unique form of negation and where the literals in the bodies of the rules can be `marked' to indicate that they can be the targets of hypotheses. A particular semantics then amounts to choosing a particular marking scheme and choosing a particular set of hypotheses. When a literal belongs to the chosen set of hypotheses, all marked occurrences of that literal in the body of a rule are assumed to be true, whereas the occurrences of that literal that have not been marked in the body of the rule are to be derived in order to contribute to the firing of the rule. Hence the notion of hypothetical reasoning that is presented in this framework is not based on making global assumptions, but more subtly on making local, contextual assumptions, taking effect as indicated by the chosen marking scheme on the basis of the chosen set of hypotheses. Our approach offers a unified view on the various semantics proposed in logic programming, classical in that only classical negation is used, and links the semantics of logic programs to mechanisms that endow rule-based systems with the power to harness hypothetical reasoning.


Bisimulations for fuzzy automata

arXiv.org Artificial Intelligence

Bisimulations have been widely used in many areas of computer science to model equivalence between various systems, and to reduce the number of states of these systems, whereas uniform fuzzy relations have recently been introduced as a means to model the fuzzy equivalence between elements of two possible different sets. Here we use the conjunction of these two concepts as a powerful tool in the study of equivalence between fuzzy automata. We prove that a uniform fuzzy relation between fuzzy automata $\cal A$ and $\cal B$ is a forward bisimulation if and only if its kernel and co-kernel are forward bisimulation fuzzy equivalences on $\cal A$ and $\cal B$ and there is a special isomorphism between factor fuzzy automata with respect to these fuzzy equivalences. As a consequence we get that fuzzy automata $\cal A$ and $\cal B$ are UFB-equivalent, i.e., there is a uniform forward bisimulation between them, if and only if there is a special isomorphism between the factor fuzzy automata of $\cal A$ and $\cal B$ with respect to their greatest forward bisimulation fuzzy equivalences. This result reduces the problem of testing UFB-equivalence to the problem of testing isomorphism of fuzzy automata, which is closely related to the well-known graph isomorphism problem. We prove some similar results for backward-forward bisimulations, and we point to fundamental differences. Because of the duality with the studied concepts, backward and forward-backward bisimulations are not considered separately. Finally, we give a comprehensive overview of various concepts on deterministic, nondeterministic, fuzzy, and weighted automata, which are related to bisimulations.


Interpreting Graph Cuts as a Max-Product Algorithm

arXiv.org Machine Learning

The maximum a posteriori (MAP) configuration of binary variable models with submodular graph-structured energy functions can be found efficiently and exactly by graph cuts. Max-product belief propagation (MP) has been shown to be suboptimal on this class of energy functions by a canonical counterexample where MP converges to a suboptimal fixed point (Kulesza & Pereira, 2008). In this work, we show that under a particular scheduling and damping scheme, MP is equivalent to graph cuts, and thus optimal. We explain the apparent contradiction by showing that with proper scheduling and damping, MP always converges to an optimal fixed point. Thus, the canonical counterexample only shows the suboptimality of MP with a particular suboptimal choice of schedule and damping. With proper choices, MP is optimal.