 Pattern Recognition


Unsupervised Learning through Prediction in a Model of Cortex

arXiv.org Machine Learning

Human infants can do some amazing things, and so can computers, but there seems to be almost no intersection or direct connection between these two spheres of accomplishment. In Computer Science we model computation through algorithms and running times, but such modeling quickly leads to intractability, even when applied to tasks that are very easy for humans. The algorithms we invent are clever, complex and sophisticated, and yet they work in fashions that seem completely incompatible with our understanding of the ways in which the brain must actually work -- and this includes learning algorithms. Accelerating advances in neuroscience have expanded tremendously our understanding of the brain, its neurons and their synapses, mechanisms, and connections, and yet no overarching theory appears to be emerging of brain function and the genesis of the mind. As far as we know, and the spectacular successes of neural networks [5, 8] notwithstanding, no algorithm has been proposed which solves some nontrivial computational problem in a computational fashion and style that can be credibly claimed to reflect what is happening in the brain when the same problem is solved.


Modeling and Recognition of Smart Grid Faults by a Combined Approach of Dissimilarity Learning and One-Class Classification

arXiv.org Artificial Intelligence

Detecting faults in electrical power grids is of paramount importance from both the electricity operator and consumer viewpoints. Modern electric power grids (smart grids) are equipped with smart sensors that gather real-time information on the physical status of all the component elements of the infrastructure (e.g., cables and related insulation, transformers, breakers and so on). In real-world smart grid systems, additional information related to the operational status of the grid itself, such as meteorological information, is usually collected as well. Designing a suitable recognition (discrimination) model of faults in a real-world smart grid system is hence a challenging task. The first reason is the heterogeneity of the information that actually determines a typical fault condition. The second is that, for synthesizing a recognition model, in practice only the conditions of observed faults are usually meaningful. Therefore, a suitable recognition model should be synthesized by making use of the observed fault conditions only. In this paper, we deal with the problem of modeling and recognizing faults in a real-world smart grid system, which supplies the entire city of Rome, Italy. Recognition of faults is addressed by following a combined approach of multiple dissimilarity measures customization and one-class classification techniques. We provide here an in-depth study related to the available data and to the models synthesized by the proposed one-class classifier. We also offer a comprehensive analysis of the fault recognition results by exploiting a fuzzy set based reliability decision rule.
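
The combination of dissimilarity measures and one-class classification described in this abstract can be illustrated with a minimal sketch. It is not the paper's classifier: the feature layout (normalized load, normalized temperature, cable type), the weights, the threshold, and the nearest-example decision rule are all assumptions chosen for illustration. The model is built from fault examples only, and a new pattern is accepted as a fault when it is close enough to one of them.

```python
def dissimilarity(a, b, weights):
    """Weighted average of per-feature dissimilarities.

    Numeric features use the absolute difference; categorical (string)
    features use a 0/1 mismatch. Both choices are illustrative assumptions.
    """
    total = 0.0
    for x, y, w in zip(a, b, weights):
        if isinstance(x, str):
            total += w * (0.0 if x == y else 1.0)
        else:
            total += w * abs(x - y)
    return total / sum(weights)


def is_fault(pattern, fault_examples, weights, threshold):
    """One-class decision: accept if the nearest known fault is close enough."""
    nearest = min(dissimilarity(pattern, f, weights) for f in fault_examples)
    return nearest <= threshold


# Hypothetical fault records: (normalized load, normalized temperature, cable type)
faults = [(0.9, 0.8, "paper"), (0.85, 0.9, "paper"), (0.7, 0.95, "oil")]
weights = (1.0, 1.0, 0.5)

suspicious = is_fault((0.88, 0.82, "paper"), faults, weights, threshold=0.1)
ordinary = is_fault((0.2, 0.3, "oil"), faults, weights, threshold=0.1)
```

With these made-up values the first pattern lies close to a recorded fault and is accepted, while the second is rejected. Tuning the weights per feature is one simple way to read "dissimilarity measures customization".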


Separating the Real from the Synthetic: Minutiae Histograms as Fingerprints of Fingerprints

arXiv.org Artificial Intelligence

In this study we show that synthetic fingerprints generated by current state-of-the-art methods can easily be discriminated from real fingerprints. We propose a method based on second order extended minutiae histograms (MHs) which can distinguish between real and synthetic prints with very high accuracy. MHs provide a fixed-length feature vector for a fingerprint which is invariant under rotation and translation. This 'test of realness' can be applied to synthetic fingerprints produced by any method. In this work, tests are conducted on the 12 publicly available databases of FVC2000, FVC2002 and FVC2004, which are well established benchmarks for evaluating the performance of fingerprint recognition algorithms; 3 of these 12 databases consist of artificial fingerprints generated by the SFinGe software. Additionally, we evaluate the discriminative performance on a database of synthetic fingerprints generated by the software of Bicz versus real fingerprint images. We conclude with suggestions for the improvement of synthetic fingerprint generation.
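
To see why a histogram built from pairs of minutiae can be a fixed-length, rotation- and translation-invariant feature, consider the simplified sketch below. It bins only the pairwise distances between minutiae locations, whereas the MHs in the abstract are second order extended histograms whose exact construction is not given here. The bin width, bin count, and coordinates are assumptions.

```python
import math


def pair_distance_histogram(points, bin_width=10.0, n_bins=8):
    """Fixed-length histogram of all pairwise distances between points."""
    hist = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            # Distances beyond the last bin edge are clamped into the last bin.
            hist[min(int(d / bin_width), n_bins - 1)] += 1
    return hist


def rigid_transform(points, angle, dx, dy):
    """Rotate by `angle` radians about the origin, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


# Hypothetical minutiae locations (pixel coordinates).
minutiae = [(0.0, 0.0), (12.0, 5.0), (33.0, 41.0), (7.0, 22.0)]
moved = rigid_transform(minutiae, angle=0.7, dx=15.0, dy=-4.0)

original_hist = pair_distance_histogram(minutiae)
moved_hist = pair_distance_histogram(moved)
```

Because rotation and translation preserve distances, both histograms agree here. In practice a distance that falls exactly on a bin edge could flip bins under floating-point round-off, so a real feature would need a more careful binning scheme.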


Sequential Pattern Mining in StarCraft: Brood War for Short and Long-Term Goals

AAAI Conferences

A wide variety of strategies have been used to create agents in the growing field of real-time strategy AI. However, a frequent problem is the necessity of hand-crafting competencies, which becomes prohibitively difficult in a large space with many corner cases. A preferable approach would be to learn these competencies from the wealth of expert play available. We present a system that uses the Generalized Sequential Pattern (GSP) algorithm from data mining to find common patterns in StarCraft: Brood War replays at both the micro- and macro-level, and verify that these correspond to human understandings of expert play. In the future, we hope to use these patterns to learn tasks and goals in an unsupervised manner for an HTN planner.
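
The idea of mining frequent ordered patterns from replays can be sketched as a level-wise search in the spirit of GSP. This is a simplification: full GSP joins candidates that share a prefix and suffix and supports itemset elements and time constraints, while the version below just extends each frequent pattern by one frequent item. The replay contents and function names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def frequent_patterns(sequences, min_support, max_len=3):
    """Level-wise search: grow patterns one item at a time, pruning by support."""
    items = sorted({x for s in sequences for x in s})
    frequent_items = [x for x in items if support((x,), sequences) >= min_support]
    level = [(x,) for x in frequent_items]
    result = {}
    length = 1
    while level and length <= max_len:
        for p in level:
            result[p] = support(p, sequences)
        candidates = [p + (x,) for p in level for x in frequent_items]
        level = [c for c in candidates if support(c, sequences) >= min_support]
        length += 1
    return result


# Hypothetical build orders extracted from three replays.
replays = [
    ["pylon", "gateway", "zealot", "cybernetics"],
    ["pylon", "gateway", "cybernetics", "zealot"],
    ["pylon", "forge", "gateway", "zealot"],
]
patterns = frequent_patterns(replays, min_support=3)
```

Here the pattern `("pylon", "gateway", "zealot")` appears in all three replays even though other actions are interleaved, which is the kind of ordered regularity the abstract refers to.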


PaTux: An Authoring Tool for Level Design through Pattern Customisation Using Non-Negative Matrix Factorization

AAAI Conferences

We present a demonstration of PaTux, an authoring tool for designing levels in the SuperTux game by combining patterns. PaTux allows game designers to specify the design of their levels using patterns extracted from training level samples. The Non-negative Matrix Factorisation (NMF) method is utilised to approximate pattern and weight matrices from the training data. The patterns are visualised for designers to choose from, and the changes made to the level structure are visualised in real time. The designer can also specify the weight of each pattern, permitting exploration of a wider variety of designs. The data used to train the model can also be specified by the designer, resulting in a new set of patterns being learned. The system also suggests variations for a given design. When the designer is satisfied with the design, the system allows the resultant level to be loaded in the game and played.
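
As a rough illustration of how NMF yields a weight matrix and a pattern matrix from training data, the sketch below factorises a small non-negative matrix V into W (weights) and H (patterns) using the standard multiplicative update rules for squared error. Whether PaTux uses these updates, and how it encodes levels as matrices, is not stated in the abstract; the toy matrix, the rank, and all names are assumptions.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V (n x m) ~ W (n x rank) @ H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy training data: each row is a level segment, each column a tile feature.
levels = [[1, 0, 1, 0], [0, 1, 1, 1], [1, 1, 2, 1], [2, 1, 3, 1]]
weights, patterns = nmf(levels, rank=2)
error = frobenius_error(levels, weights, patterns)
```

Each row of `patterns` can be read as one learned pattern and each row of `weights` as how strongly a training segment uses each pattern. Changing a row of `weights` and multiplying back out corresponds loosely to the designer adjusting pattern weights.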


Pattern Discovery in Protein Networks Reveals High-Confidence Predictions of Novel Interactions

AAAI Conferences

Pattern discovery in protein interaction networks can reveal crucial biological knowledge on the inner workings of cellular machinery. Although proteomic networks are far from complete, extracting meaningful patterns from them is a non-trivial task due to their size and complexity. This paper proposes a computational framework to efficiently discover topologically-similar patterns from large proteomic networks using Particle Swarm Optimization (PSO). PSO is a robust and low-cost optimization technique that is demonstrated to work effectively on the complex, mostly sparse proteomic networks. The resulting topologically-similar patterns of close proximity are utilized to systematically predict new high-confidence protein-protein interactions (PPIs). The proposed PSO-based PPI prediction method (3PI) managed to predict high-confidence PPIs, validated by more than one computational/experimental source, through a proposed PPI knowledge transfer process between topologically-similar interaction patterns of close proximity. In three case studies, over 50% of the predicted interactions for EFGR, ERBB2, ERBB3, GRB2 and UBC overlap with publicly available interaction databases, ~80% of the predictions are found among the top 1% of results of another PPI prediction method, and their genes are significantly co-expressed across different tissues. Moreover, the single prediction example that did not overlap with any of our validation sources was recently experimentally supported by two PubMed publications.


Sub-Selective Quantization for Large-Scale Image Search

AAAI Conferences

Recently, with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to two popular quantization techniques: PCA Quantization (PCAQ) and Iterative Quantization (ITQ). Crucially, we can justify the resulting sub-selective quantization by proving its theoretical properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
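
The storage and similarity gains from binary codes can be illustrated with a small sketch. It thresholds the sign of a few linear projections and compares codes by Hamming distance. It does not implement PCAQ, ITQ, or the sub-selection algorithm: the projection directions below are fixed by hand, whereas those methods learn them from data, and all names and values are assumptions.

```python
def encode(vector, projections):
    """Pack the sign bits of each projection of `vector` into one integer."""
    code = 0
    for w in projections:
        dot = sum(a * b for a, b in zip(w, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits: one XOR plus a bit count."""
    return bin(a ^ b).count("1")


# Hand-picked projection directions (a learned method would estimate these).
projections = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, -1, 0)]

query = encode((0.9, 0.2, -0.4), projections)
near = encode((0.8, 0.3, -0.5), projections)
far = encode((-0.7, 0.6, 0.5), projections)

near_distance = hamming(query, near)
far_distance = hamming(query, far)
```

Each descriptor is stored as a single small integer, and comparing two images costs one XOR and a bit count instead of a floating-point distance over the full descriptor.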


An Open Source Pattern Recognition Toolbox for MATLAB

arXiv.org Machine Learning

Pattern recognition and machine learning are becoming integral parts of algorithms in a wide range of applications. Different algorithms and approaches for machine learning involve different tradeoffs between performance and computation, so during algorithm development it is often necessary to explore a variety of different approaches to a given task. A toolbox with a unified framework across multiple pattern recognition techniques gives algorithm developers the ability to rapidly evaluate different choices prior to deployment. MATLAB is a widely used environment for algorithm development and prototyping, and although several MATLAB toolboxes for pattern recognition are currently available, these are either incomplete, expensive, or restrictively licensed. In this work we describe a MATLAB toolbox for pattern recognition and machine learning known as the PRT (Pattern Recognition Toolbox), licensed under the permissive MIT license. The PRT includes many popular techniques for data preprocessing, supervised learning, clustering, regression and feature selection, as well as a methodology for combining these components using a simple, uniform syntax. The resulting algorithms can be evaluated using cross-validation and a variety of scoring metrics to ensure robust performance when the algorithm is deployed. This paper presents an overview of the PRT as well as an example of usage on Fisher's Iris dataset.


Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

arXiv.org Artificial Intelligence

IBM Research, T.J. Watson Research Center, Yorktown Heights, NY

Abstract: The computational performance of the biological brain has long attracted significant interest and has led to inspirations in operating principles, algorithms, and architectures for computing and signal processing. In this work, we focus on hardware implementation of brain-like learning in a brain-inspired architecture. We demonstrate, in hardware, that 2-D crossbar arrays of phase change synaptic devices can achieve associative learning and perform pattern recognition. Device and array-level studies using an experimental 10 × 10 array of phase change synaptic devices have shown that pattern recognition is robust against synaptic resistance variations and that large variations can be tolerated by increasing the number of training iterations. Our measurements show that an increase in initial variation from 9% to 60% causes the required training iterations to increase from 1 to 11.

I. Introduction: Synaptic electronics is an emerging field of research aiming to build electronic systems that mimic the computational energy efficiency and fault tolerance of the biological brain in a compact space [1].

Figure 1: Left figure is a DSI (diffusion spectrum imaging) scan showing a fabric-like 3-D grid structure of connections in the monkey brain (Credit: Van Wedeen, M.D., Martinos Center and Dept. of Radiology, Massachusetts General Hospital and Harvard University Medical School) [6].


Modelling Data Dispersion Degree in Automatic Robust Estimation for Multivariate Gaussian Mixture Models with an Application to Noisy Speech Processing

arXiv.org Machine Learning

The trimming scheme with a prefixed cutoff portion is known as a method of improving the robustness of statistical models such as multivariate Gaussian mixture models (MGMMs) in small scale tests by alleviating the impact of outliers. However, when this method is applied to real-world data, such as noisy speech processing, it is hard to know the optimal cutoff portion for removing the outliers, and the method sometimes removes useful data samples as well. In this paper, we propose a new method based on measuring the dispersion degree (DD) of the training data to avoid this problem, so as to realise automatic robust estimation for MGMMs. The DD model is studied by using two different measures. For each one, we theoretically prove that the DD of the data samples in the context of MGMMs approximately obeys a specific (chi or chi-square) distribution. The proposed method is evaluated on a real-world application with a moderately-sized speaker recognition task. Experiments show that the proposed method can significantly improve the robustness of the conventional training method of GMMs for speaker recognition.
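
The baseline trimming scheme with a fixed cutoff portion can be sketched as follows. This is a deliberately reduced illustration: it fits a single one-dimensional Gaussian rather than a multivariate mixture, and it does not implement the dispersion-degree method proposed in the paper. The data, the cutoff of 0.1, and the function name are assumptions.

```python
import statistics


def trimmed_gaussian_fit(samples, cutoff=0.1, iterations=5):
    """Fit a 1-D Gaussian while discarding a fixed fraction of samples.

    Each round refits the mean, then keeps the samples closest to it. For a
    single Gaussian, the samples farthest from the mean are exactly the ones
    with the lowest likelihood, so this mirrors likelihood-based trimming.
    """
    n_keep = len(samples) - int(cutoff * len(samples))
    kept = list(samples)
    for _ in range(iterations):
        mu = statistics.mean(kept)
        kept = sorted(samples, key=lambda x: abs(x - mu))[:n_keep]
    return statistics.mean(kept), statistics.pstdev(kept)


# Nine well-behaved measurements and one gross outlier (made-up values).
data = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3, 5.0, 50.0]

raw_mean = statistics.mean(data)
robust_mean, robust_std = trimmed_gaussian_fit(data, cutoff=0.1)
```

The untrimmed mean is pulled far from the bulk of the data by the single outlier, while the trimmed estimate stays near 5. The weakness the abstract points out is also visible: the cutoff must be chosen in advance, and on clean data the same 10% cut would discard valid samples.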