Pattern Recognition
Building and displaying name relations using automatic unsupervised analysis of newspaper articles
Pouliquen, Bruno, Steinberger, Ralf, Ignat, Camelia, Oellinger, Tamara
We present a tool that, from automatically recognised names, tries to infer inter-person relations in order to present associated people on maps. Based on an in-house Named Entity Recognition tool, applied to clusters of an average of 15,000 news articles per day in 15 different languages, we build a knowledge base that allows extracting statistical co-occurrences of persons and visualising them on a per-person page or in various graphs.
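The co-occurrence statistics mentioned in this abstract can be illustrated with a minimal sketch. It assumes each news cluster has already been reduced to a list of recognised person names. The function names and the sample clusters are hypothetical and are not taken from the tool itself.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(clusters):
    """Count how many clusters mention each unordered pair of person names."""
    pair_counts = Counter()
    for names in clusters:
        # Deduplicate and sort so each pair is counted once per cluster
        # and is always stored in the same order.
        for pair in combinations(sorted(set(names)), 2):
            pair_counts[pair] += 1
    return pair_counts


def associated_people(pair_counts, person, top_n=5):
    """Return the names most often seen in the same cluster as `person`."""
    related = Counter()
    for (a, b), n in pair_counts.items():
        if a == person:
            related[b] += n
        elif b == person:
            related[a] += n
    return related.most_common(top_n)


# Hypothetical clusters: each list holds the names recognised in one cluster.
clusters = [
    ["Person A", "Person B", "Person C"],
    ["Person A", "Person B"],
    ["Person B", "Person C", "Person B"],
]
counts = count_cooccurrences(clusters)
print(associated_people(counts, "Person B"))
```

Counting per cluster rather than per article is a design choice of this sketch; it keeps a widely reprinted story from inflating a pair's count.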
Geocoding multilingual texts: Recognition, disambiguation and visualisation
Pouliquen, Bruno, Kimler, Marco, Steinberger, Ralf, Ignat, Camelia, Oellinger, Tamara, Blackler, Ken, Fuart, Flavio, Zaghouani, Wajdi, Widiger, Anna, Forslund, Ann-Charlotte, Best, Clive
We are presenting a method to recognise geographical references in free text. Our tool must work on various languages with a minimum of language-dependent resources, except a gazetteer. The main difficulty is to disambiguate these place names by distinguishing places from persons and by selecting the most likely place out of a list of homographic place names world-wide. The system uses a number of language-independent clues and heuristics to disambiguate place name homographs. The final aim is to index texts with the countries and cities they mention and to automatically visualise this information on geographical maps using various tools.
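The abstract does not list its clues and heuristics, so the following is only a sketch of how homographic place names might be disambiguated. It assumes two hypothetical clues: an importance weight per gazetteer entry, and a bonus for candidates whose country is already supported by an unambiguous place in the same text. The gazetteer, the weights, and all names in the code are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    country: str
    importance: int  # illustrative weight, e.g. capital > small town


# Toy gazetteer; the entries and weights are illustrative only.
GAZETTEER = {
    "Paris": [Place("Paris", "FR", 3), Place("Paris", "US", 1)],
    "Dallas": [Place("Dallas", "US", 2)],
    "Lyon": [Place("Lyon", "FR", 2)],
}


def disambiguate(mentions, context_bonus=3):
    """Pick one gazetteer candidate for each mentioned place name."""
    # Countries of unambiguous mentions serve as context for the others.
    context = {
        GAZETTEER[m][0].country
        for m in mentions
        if m in GAZETTEER and len(GAZETTEER[m]) == 1
    }

    def score(place):
        bonus = context_bonus if place.country in context else 0
        return place.importance + bonus

    resolved = {}
    for m in mentions:
        candidates = GAZETTEER.get(m, [])
        if candidates:  # names missing from the gazetteer are skipped
            resolved[m] = max(candidates, key=score)
    return resolved


print(disambiguate(["Paris", "Dallas"])["Paris"].country)
print(disambiguate(["Paris", "Lyon"])["Paris"].country)
```

With these toy weights, "Paris" resolves to the US entry when "Dallas" appears in the same text and to the French entry otherwise. Both clues depend only on the gazetteer, not on the language of the text.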
Ontological Representations of Software Patterns
Rosengard, Jean-Marc, Ursu, Marian
This paper is based on and advocates the trend in software engineering of extending the use of software patterns as a means of structuring solutions to software development problems (be they motivated by best practice or by company interests and policies). The paper argues, on the one hand, that this development requires tools for automatic organisation, retrieval and explanation of software patterns. On the other hand, it argues that the existence of such tools will itself facilitate the further development and employment of patterns in the software development process. The paper analyses existing pattern representations and concludes that they are inadequate for the kind of automation intended here. Adopting a standpoint similar to that taken in the semantic web, the paper proposes that feasible solutions can be built on the basis of ontological representations.
Modular Adaptive System Based on a Multi-Stage Neural Structure for Recognition of 2D Objects of Discontinuous Production
This is a presentation of a new system for invariant recognition of 2D objects with overlapping classes that cannot be effectively recognized with traditional methods. The translation-, scale- and partial-rotation-invariant contour description of the object is transformed into a DCT spectrum space. The resulting frequency spectra are decomposed into frequency bands in order to feed different BPG neural nets (NNs). The NNs are structured in three stages: filtering and full rotation invariance; partial recognition; general classification. The designed multi-stage BPG Neural Structure shows very good accuracy and flexibility when tested with 2D objects used in discontinuous production. The speed achieved and the opportunity for easy restructuring and reprogramming of the system make it suitable for real-time work in a range of applied systems.
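The first two processing steps described above, a contour description mapped to a DCT spectrum that is then split into bands, can be sketched as follows. The abstract does not specify the contour descriptor, so a centroid-distance signature normalised by its mean is assumed here. All function names are hypothetical, and the neural stages are omitted.

```python
import math


def contour_signature(points):
    """Centroid-distance signature of a sampled contour, divided by its mean."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Distances to the centroid do not change under translation.
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    mean = sum(dists) / len(dists)
    # Dividing by the mean removes the effect of uniform scaling.
    return [d / mean for d in dists]


def dct2(signal):
    """Unnormalised DCT-II computed directly from its definition (O(n^2))."""
    n = len(signal)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(signal))
        for k in range(n)
    ]


def split_bands(spectrum, n_bands):
    """Split a spectrum into at most `n_bands` contiguous frequency bands."""
    size = math.ceil(len(spectrum) / n_bands)
    return [spectrum[i:i + size] for i in range(0, len(spectrum), size)]


# A square sampled at 8 contour points, and a shifted and scaled copy of it.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = [(3 * x + 5, 3 * y - 2) for x, y in square]

bands = split_bands(dct2(contour_signature(square)), n_bands=4)
print(len(bands), [len(b) for b in bands])
```

In a pipeline of the kind the abstract describes, each band would feed a separate network. Here the bands are simply returned as lists.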
Pattern Recognition Theory of Mind
Abstract: I propose that pattern recognition, memorization and processing are key concepts that can serve as a set of principles for the theoretical modeling of mind function. Most questions about how the mind functions can be answered by descriptive modeling and definitions derived from these principles. An understandable definition of consciousness can be drawn from the assumption that a pattern recognition system can recognize its own patterns of activity. The principles, descriptive modeling and definitions can be a basis for theoretical and applied research in the cognitive sciences, particularly in artificial intelligence studies. Introduction: The study of the mind needs a generally accepted scientific basis grounded in natural, physical foundations.
Multiple Hypothesis Testing in Pattern Discovery
Hanhijärvi, Sami, Puolamäki, Kai, Garriga, Gemma C.
The problem of multiple hypothesis testing arises when more than one hypothesis must be tested simultaneously for statistical significance. This is a very common situation in many data mining applications. For instance, assessing simultaneously the significance of all frequent itemsets of a single dataset entails a host of hypotheses, one for each itemset. A multiple hypothesis testing method is needed to control the number of false positives (Type I error). Our contribution in this paper is to extend the multiple hypothesis framework to be used with a generic data mining algorithm. We provide a method that provably controls the family-wise error rate (FWER, the probability of at least one false positive) in the strong sense. We evaluate the performance of our solution on both real and generated data. The results show that our method controls the FWER while maintaining the power of the test.
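To make FWER control concrete, the sketch below implements the classical Holm step-down procedure, which controls the FWER in the strong sense for any fixed set of hypotheses. It is a standard textbook method shown for illustration, not the method proposed in the paper. The function name and the sample p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans, True where Holm's procedure rejects."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: stop at the first hypothesis that is kept
    return reject


# Hypothetical p-values, e.g. one per candidate itemset.
print(holm_reject([0.01, 0.02, 0.03]))
print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

For the first list, a plain Bonferroni correction (threshold 0.05 / 3) would reject only the first hypothesis, while Holm rejects all three at the same FWER level.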
A Computational Model for the Alignment of Hierarchical Scene Representations in Human-Robot Interaction
Swadzba, Agnes (Bielefeld University) | Vorwerg, Constanze (Bielefeld University) | Wachsmuth, Sven (Bielefeld University) | Rickheit, Gert (Bielefeld University)
The ultimate goal of human-robot interaction is to enable the robot to seamlessly communicate with a human in a natural human-like fashion. Most work in this field concentrates on the speech interpretation and gesture recognition side, assuming that a propositional scene representation is available. Less work has been dedicated to the extraction of the relevant scene structures that underlie these propositions. As a consequence, most approaches are restricted to place recognition or simple table-top settings and do not generalize to more complex room setups. In this paper, we propose a hierarchical spatial model that is empirically motivated by psycholinguistic studies. Using this model, the robot is able to extract scene structures from a time-of-flight depth sensor and adjust its spatial scene representation by taking verbal statements about partial scene aspects into account. Without assuming any pre-known model of the specific room, we show that the system aligns its sensor-based room representation to a semantically meaningful representation typically used by the human descriptor.
Boosting Constrained Mutual Subspace Method for Robust Image-set Based Object Recognition
Li, Xi (Xi'an Jiaotong University) | Fukui, Kazuhiro (Tsukuba University) | Zheng, Nanning (Xi'an Jiaotong University)
Object recognition using an image set or video sequence as input tends to be more robust, since an image set or video sequence provides much more information than a single snapshot about the variability in the appearance of the target subject. The Constrained Mutual Subspace Method (CMSM) is one of the state-of-the-art algorithms for image-set based object recognition: it first projects the image-set patterns onto the so-called generalized difference subspace and then classifies them using the mutual subspace distance based on principal angles. By treating the subspace bases of each image-set pattern as basic elements on the Grassmann manifold, this paper presents a framework for robust image-set based recognition that uses CMSM-based ensemble learning in a boosting fashion. The proposed Boosting Constrained Mutual Subspace Method (BCMSM) improves the original CMSM in the following ways: a) BCMSM is insensitive to the dimension of the generalized difference subspace, whereas the performance of the original CMSM depends strongly on this dimension, and selecting its optimal value is largely empirical and case-dependent; b) by taking advantage of both boosting and CMSM techniques, generalization ability is improved and much higher classification performance can be achieved. Extensive experiments on real-life data sets (two face recognition tasks and one 3D object category classification task) show that the proposed method greatly outperforms previous state-of-the-art algorithms in terms of classification accuracy.
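The principal-angle similarity that mutual subspace methods build on can be sketched with NumPy. This is a sketch of the plain mutual subspace comparison only: the projection onto the generalized difference subspace (the "constrained" part) and the boosting stage are not shown. The function names are hypothetical, and taking the squared cosine of the smallest principal angle is one common choice of similarity, assumed here.

```python
import numpy as np


def orthonormal_basis(samples, dim):
    """Orthonormal basis of the `dim`-dimensional subspace fitted to the rows."""
    # The leading right singular vectors span the best-fitting linear
    # subspace. No mean-centring is applied (an assumption of this sketch).
    _, _, vt = np.linalg.svd(samples, full_matrices=False)
    return vt[:dim].T  # shape: (n_features, dim)


def principal_angle_cosines(basis_a, basis_b):
    """Cosines of the principal angles between two subspaces, largest first."""
    singular_values = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    # Clip to guard against values slightly above 1 from rounding.
    return np.clip(singular_values, 0.0, 1.0)


def msm_similarity(set_a, set_b, dim=2):
    """Squared cosine of the smallest principal angle between two sets."""
    cosines = principal_angle_cosines(
        orthonormal_basis(set_a, dim), orthonormal_basis(set_b, dim)
    )
    return float(cosines[0] ** 2)


# Toy "image sets": rows are feature vectors in a 3-dimensional space.
set_xy = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
set_xz = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
set_z = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]])

print(msm_similarity(set_xy, set_xz))       # the planes share the x axis
print(msm_similarity(set_xy, set_z, dim=1))  # orthogonal directions
```

The dependence on `dim` in this toy version reflects the sensitivity to subspace dimension that the abstract attributes to the original CMSM.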
Local Query Mining in a Probabilistic Prolog
Kimmig, Angelika (Katholieke Universiteit Leuven) | Raedt, Luc De (Katholieke Universiteit Leuven)
Local pattern mining is concerned with finding the set of patterns that satisfy a constraint in a database. We study local pattern mining in the context of ProbLog, a probabilistic Prolog system, and introduce an approach for finding correlated patterns in the form of queries in such a Prolog system. The approach combines principles of inductive logic programming, data mining and statistical relational learning. Experiments on a challenging biological network mining task provide evidence for the interestingness of the approach.
Mining Compressed Repetitive Gapped Sequential Patterns Efficiently
Tong, Yongxin, Zhao, Li, Yu, Dan, Ma, Shilong, Xu, Ke
Mining frequent sequential patterns from sequence databases has been a central research topic in data mining, and various efficient algorithms for mining sequential patterns have been proposed and studied. Recently, in many problem domains (e.g., program execution traces), a novel line of sequential pattern mining research, called mining repetitive gapped sequential patterns, has attracted the attention of many researchers. It considers not only the repetition of a sequential pattern across different sequences but also its repetition within a sequence, which is more meaningful than general sequential pattern mining, which captures only occurrences in different sequences. However, the number of repetitive gapped sequential patterns generated by even these closed mining algorithms may be too large for users to understand, especially when the support threshold is low. In this paper, we propose and study the problem of compressing repetitive gapped sequential patterns. Inspired by the ideas behind RPglobal for summarizing frequent itemsets, we develop an algorithm, CRGSgrow (Compressing Repetitive Gapped Sequential pattern grow), which includes an efficient pruning strategy, SyncScan, and an efficient representative pattern checking scheme, -dominate sequential pattern checking. CRGSgrow is a two-step approach: in the first step, we obtain all closed repetitive sequential patterns as the candidate set of representative repetitive sequential patterns and, at the same time, obtain most of the representative repetitive sequential patterns; in the second step, we spend only a little time finding the remaining representative patterns from the candidate set. An empirical study with both real and synthetic data sets clearly shows that CRGSgrow has good performance.