 Pattern Recognition


ERMO-DG: Evolving Region Moving Object Dataset Generator

AAAI Conferences

It is often essential to create datasets with foreseeable characteristics. The design and testing of advanced spatiotemporal pattern mining algorithms requires large, adaptable datasets. In this paper, we present a synthetic dataset generator, ERMO-DG, that is intended for creating spatiotemporal patterns. Generated datasets consist of spatiotemporal object instances of different feature types, where these instances are represented by spatial regions evolving over time. The generator allows researchers to systematically create spatiotemporal datasets with predictable characteristics such as the number of patterns, the cardinality of patterns, and the velocity, acceleration, lifetime and spatial areas of instances.


Generative Modelling for Unsupervised Score Calibration

arXiv.org Machine Learning

Score calibration enables automatic speaker recognizers to make cost-effective accept/reject decisions. Traditional calibration requires supervised data, which is an expensive resource. We propose a 2-component GMM for unsupervised calibration and demonstrate good performance relative to a supervised baseline on NIST SRE'10 and SRE'12. A Bayesian analysis demonstrates that the uncertainty associated with the unsupervised calibration parameter estimates is surprisingly small. Index terms: calibration, unsupervised learning, Laplace approximation, automatic speaker recognition. 1. Introduction: Automatic speaker recognizers map trials to scores.


A Survey on Metric Learning for Feature Vectors and Structured Data

arXiv.org Machine Learning

The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.
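As a rough illustration of the Mahalanobis framework mentioned above, the sketch below computes d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M. The matrix `L` and the two points are hypothetical values chosen for illustration; in metric learning, M would be learned from data rather than set by hand.

```python
import numpy as np

def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ M @ diff))

# Hypothetical linear map L (an assumption for this sketch).
# M = L^T L is positive semidefinite by construction.
L = np.array([[2.0, 0.0],
              [0.0, 0.5]])
M = L.T @ L

x = np.array([1.0, 0.0])
y = np.array([0.0, 0.0])

d_identity = mahalanobis_distance(x, y, np.eye(2))  # M = I gives the Euclidean distance
d_learned = mahalanobis_distance(x, y, M)           # first axis is stretched by L
d_projected = float(np.linalg.norm(L @ x - L @ y))  # Euclidean distance after mapping by L
```

Writing M as L^T L shows why this family is convenient: the distance under M is the ordinary Euclidean distance after a linear transformation of the data.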


A Statistical Learning Theory Framework for Supervised Pattern Discovery

arXiv.org Machine Learning

This paper formalizes a latent variable inference problem we call supervised pattern discovery, the goal of which is to find sets of observations that belong to a single "pattern." We discuss two versions of the problem and prove uniform risk bounds for both. In the first version, collections of patterns can be generated in an arbitrary manner and the data consist of multiple labeled collections. In the second version, the patterns are assumed to be generated independently by identically distributed processes. These processes are allowed to take an arbitrary form, so observations within a pattern are not in general independent of each other. The bounds for the second version of the problem are stated in terms of a new complexity measure, the quasi-Rademacher complexity.


Seeded Graph Matching Via Joint Optimization of Fidelity and Commensurability

arXiv.org Machine Learning

Given two graphs, the graph matching problem (GMP) seeks to find a correspondence (i.e., "matching") between the vertex sets that best preserves similar substructures across the graphs. The graph matching problem has applications across many diverse disciplines including document processing, mathematical biology, network analysis and pattern recognition, to name a few. Unfortunately, no graph matching algorithm is known to be efficient. Indeed, even the easier problem of matching isomorphic simple graphs is of famously unknown complexity (see [7]). Because of its practical applicability, there exist numerous approximate graph matching algorithms in the literature; for an excellent survey of the existing literature, see [4].


Analogical Dissimilarity: Definition, Algorithms and Two Experiments in Machine Learning

arXiv.org Artificial Intelligence

This paper defines the notion of analogical dissimilarity between four objects, with a special focus on objects structured as sequences. Firstly, it studies the case where the four objects have a null analogical dissimilarity, i.e., they are in analogical proportion. Secondly, when one of these objects is unknown, it gives algorithms to compute it. Thirdly, it tackles the problem of defining analogical dissimilarity, which is a measure of how far four objects are from being in analogical proportion. In particular, when objects are sequences, it gives a definition and an algorithm based on an optimal alignment of the four sequences. It also gives learning algorithms, i.e., methods to find the triple of objects in a learning sample which has the least analogical dissimilarity with a given object. Two practical experiments are described: the first is a classification problem on benchmarks of binary and nominal data; the second shows how the generation of sequences by solving analogical equations enables a handwritten character recognition system to be rapidly adapted to a new writer.


Gaussian Process Kernels for Pattern Discovery and Extrapolation

arXiv.org Artificial Intelligence

Gaussian processes are rich distributions over functions, which provide a Bayesian nonparametric approach to smoothing and interpolation. We introduce simple closed form kernels that can be used with Gaussian processes to discover patterns and enable extrapolation. These kernels are derived by modelling a spectral density -- the Fourier transform of a kernel -- with a Gaussian mixture. The proposed kernels support a broad class of stationary covariances, but Gaussian process inference remains simple and analytic. We demonstrate the proposed kernels by discovering patterns and performing long range extrapolation on synthetic examples, as well as atmospheric CO2 trends and airline passenger data. We also show that we can reconstruct standard covariances within our framework.
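A minimal sketch of such a kernel for one-dimensional inputs follows, assuming the commonly stated closed form k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q), which is what results when the spectral density is a symmetrized mixture of Gaussians with means mu_q and variances v_q. The function name and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, variances):
    """Evaluate k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)."""
    tau = np.asarray(tau, dtype=float)[..., None]   # trailing axis for mixture components
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)
    envelope = np.exp(-2.0 * np.pi**2 * tau**2 * v)  # Gaussian decay per component
    oscillation = np.cos(2.0 * np.pi * tau * mu)     # periodic part per component
    return np.sum(w * envelope * oscillation, axis=-1)

# Hypothetical inputs and hyperparameters (assumptions for this sketch).
x = np.linspace(0.0, 5.0, 50)
tau = x[:, None] - x[None, :]                        # pairwise differences
K = spectral_mixture_kernel(tau,
                            weights=[1.0, 0.5],
                            means=[0.0, 1.0],
                            variances=[0.05, 0.01])
```

Because the kernel depends only on the difference tau, it is stationary; the resulting matrix `K` can be used as the covariance in standard Gaussian process regression.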


Generalizing Analytic Shrinkage for Arbitrary Covariance Structures

Neural Information Processing Systems

Analytic shrinkage is a statistical technique that offers a fast alternative to cross-validation for the regularization of covariance matrices and has appealing consistency properties. We show that the proof of consistency implies bounds on the growth rates of eigenvalues and their dispersion, which are often violated in data. We prove consistency under assumptions which do not restrict the covariance structure and therefore better match real-world data. In addition, we propose an extension of analytic shrinkage, orthogonal complement shrinkage, which adapts to the covariance structure. Finally, we demonstrate the superior performance of our novel approach on data from the domains of finance, spoken letter and optical character recognition, and neuroscience.
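For orientation, the sketch below shows plain linear shrinkage of a sample covariance matrix toward a scaled identity, a common shrinkage target. It is not the method proposed in the paper: the shrinkage intensity `lam` is fixed by hand here (an assumption), whereas analytic shrinkage estimates it from the data in closed form. The function name and the simulated data are hypothetical.

```python
import numpy as np

def shrink_covariance(X, lam):
    """Return (1 - lam) * S + lam * nu * I, where S is the sample covariance of X
    (rows are observations) and nu = trace(S) / p is its average eigenvalue."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    S = np.cov(X, rowvar=False)
    nu = np.trace(S) / p
    return (1.0 - lam) * S + lam * nu * np.eye(p)

# Simulated data with fewer observations than dimensions (assumption for this sketch),
# so the sample covariance S is singular.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
S = np.cov(X, rowvar=False)

# lam = 0.3 is an arbitrary hand-picked intensity, not an analytic estimate.
C = shrink_covariance(X, lam=0.3)
```

Shrinking toward the scaled identity preserves the trace of the sample covariance while lifting its smallest eigenvalues, which makes the estimate invertible even when the sample covariance is not.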


The SP theory of intelligence: benefits and applications

arXiv.org Artificial Intelligence

Information 2013, xx. Received: 26 May 2013; in revised form: 13 December 2013 / Accepted: 13 December 2013 / Published: xx

Abstract: This article describes existing and expected benefits of the SP theory of intelligence, and some potential applications. The theory aims to simplify and integrate ideas across artificial intelligence, mainstream computing, and human perception and cognition, with information compression as a unifying theme. It combines conceptual simplicity with descriptive and explanatory power across several areas of computing and cognition. In the SP machine, an expression of the SP theory which is currently realized in the form of a computer model, there is potential for an overall simplification of computing systems, including software. The SP theory promises deeper insights and better solutions in several areas of application including, most notably, unsupervised learning, natural language processing, autonomous robots, computer vision, intelligent databases, software engineering, information compression, medical diagnosis and big data. There is also potential in areas such as the semantic web, bioinformatics, structuring of documents, the detection of computer viruses, data fusion, new kinds of computer, and the development of scientific theories. The theory promises seamless integration of structures and functions within and between different areas of application. The potential value, worldwide, of these benefits and applications is at least $190 billion each year. Further development would be facilitated by the creation of a high-parallel, open-source version of the SP machine, available to researchers everywhere.

Keywords: artificial intelligence; information compression; unsupervised learning; natural language processing; pattern recognition

1. Introduction

The SP theory of intelligence aims to simplify and integrate concepts across artificial intelligence, mainstream computing and human perception and cognition, with information compression as a unifying theme. This article describes existing and expected benefits of the SP theory and some of its potential applications. The theory is described most fully in [1] and more briefly in an extended overview [2]. This article should be read in conjunction with either or both of those accounts. In brief, the existing and expected benefits of the theory are:

- Conceptual simplicity combined with descriptive and explanatory power.


Pattern recognition issues on anisotropic smoothed particle hydrodynamics

arXiv.org Artificial Intelligence

This is a preliminary theoretical discussion of the computational requirements of state-of-the-art smoothed particle hydrodynamics (SPH) from the perspective of pattern recognition and artificial intelligence. The paper points out that, when anisotropy detection is included to improve resolution in shock layers, SPH is a very peculiar case of unsupervised machine learning. In addition, the free-particle nature of SPH opens an opportunity for artificial intelligence to study particles as agents acting in a collaborative framework in which the timed outcomes of a fluid simulation form a large knowledge base. This might be very attractive for phenomenological problems in computational astrophysics, such as self-propagating star formation.