Pattern Recognition
Abstract Representations and Frequent Pattern Discovery
We discuss the frequent pattern mining problem in a general setting. From an analysis of abstract representations, summarization and frequent pattern mining, we arrive at a generalization of the problem. Then, we show how the problem can be cast into the powerful language of algorithmic information theory. This allows us to formulate a simple algorithm to mine for all frequent patterns.
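For orientation, the classical special case of the problem, frequent itemset mining, can be sketched as follows. This is a brute-force illustration and not the algorithm the abstract refers to; the function name `frequent_itemsets`, the toy transactions and the support threshold are assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset contained in at least
    min_support transactions (brute-force enumeration, illustration only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Support = number of transactions that contain the whole candidate.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[frozenset(candidate)] = support
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return result

# Hypothetical toy data: four transactions over the items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
patterns = frequent_itemsets(transactions, min_support=2)
```

With this toy data every single item and every pair is frequent, while the full set {a, b, c} occurs in only one transaction and is therefore dropped.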
A Convergence Analysis of Log-Linear Training
Log-linear models are widely used probability models for statistical pattern recognition. Typically, log-linear models are trained according to a convex criterion. In recent years, the interest in log-linear models has greatly increased. The optimization of log-linear model parameters is costly and therefore an important topic, in particular for large-scale applications. Different optimization algorithms have been evaluated empirically in many papers. In this work, we analyze the optimization problem analytically and show that the training of log-linear models can be highly ill-conditioned. We verify our findings on two handwriting tasks. By making use of our convergence analysis, we obtain good results on a large-scale continuous handwriting recognition task with a simple and generic approach.
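To make the notion of ill-conditioning concrete, the sketch below uses a textbook fact rather than the paper's own analysis: for a binary log-linear (logistic) model evaluated at zero weights, every posterior is 0.5, so the Hessian of the negative log-likelihood is 0.25 * X^T X. The two-feature data set and the helper names `hessian_at_zero` and `condition_number` are assumptions for illustration.

```python
import math

def hessian_at_zero(X):
    """Entries (a, b, d) of the 2x2 Hessian [[a, b], [b, d]] = 0.25 * X^T X,
    i.e. the Hessian of the logistic negative log-likelihood at w = 0."""
    a = 0.25 * sum(x[0] * x[0] for x in X)
    b = 0.25 * sum(x[0] * x[1] for x in X)
    d = 0.25 * sum(x[1] * x[1] for x in X)
    return a, b, d

def condition_number(a, b, d):
    """Ratio of the largest to the smallest eigenvalue of [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return (mean + radius) / (mean - radius)

# Hypothetical features on very different scales.
raw = [(1.0, 100.0), (2.0, -150.0), (-1.0, 50.0), (-2.0, -100.0)]
# Same data with the second feature rescaled to a comparable range.
scaled = [(x0, x1 / 100.0) for x0, x1 in raw]

cond_raw = condition_number(*hessian_at_zero(raw))
cond_scaled = condition_number(*hessian_at_zero(scaled))
```

In this toy setting the unscaled features give a condition number in the thousands, while the rescaled features give a value close to 1, which is the kind of gap that slows down gradient-based training.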
Pattern-Based Classification: A Unifying Perspective
Bringmann, Björn, Nijssen, Siegfried, Zimmermann, Albrecht
The use of patterns in predictive models is a topic that has received a lot of attention in recent years. Pattern mining can help to obtain models for structured domains, such as graphs and sequences, and has been proposed as a means to obtain more accurate and more interpretable models. Despite the large number of publications devoted to this topic, we believe that an overview of what has been accomplished in this area is missing. This paper presents our perspective on this evolving area. We identify the principles of pattern mining that are important when mining patterns for models and provide an overview of pattern-based classification methods. We categorize these methods along the following dimensions: (1) whether they post-process a pre-computed set of patterns or iteratively execute pattern mining algorithms; (2) whether they select patterns model-independently or whether the pattern selection is guided by a model. We summarize the results that have been obtained for each of these methods.
Revisiting Numerical Pattern Mining with Formal Concept Analysis
Kaytoue, Mehdi, Kuznetsov, Sergei O., Napoli, Amedeo
In this paper, we investigate the problem of mining numerical data in the framework of Formal Concept Analysis. The usual way is to use a scaling procedure (transforming numerical attributes into binary ones), leading either to a loss of information or of efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to work directly on numerical data in a more precise and efficient way, and we prove it. For that, the notions of closed patterns, generators and equivalence classes are revisited in the numerical context. Moreover, two original algorithms are proposed and used in an evaluation involving real-world data, showing the predominance of the present approach.
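One way to picture working on numerical data without binary scaling is to describe a set of objects by a vector of intervals and to close it by collecting every object that falls inside those intervals. The sketch below assumes this interval-based representation, which the abstract itself does not spell out; the helper names and the toy table are hypothetical.

```python
def describe(rows):
    """Smallest interval vector (one [min, max] per attribute) covering the rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def extent(pattern, data):
    """Indices of all objects whose every value lies inside the pattern's intervals."""
    return [i for i, row in enumerate(data)
            if all(lo <= v <= hi for v, (lo, hi) in zip(row, pattern))]

def closure(indices, data):
    """Interval pattern shared by the chosen objects, plus every object it covers."""
    pattern = describe([data[i] for i in indices])
    return pattern, extent(pattern, data)

# Hypothetical table: five objects, two numerical attributes.
data = [(5, 7), (6, 8), (4, 8), (5, 9), (5, 8)]
pattern, objects = closure([0, 3], data)
```

Starting from objects 0 and 3, the shared description is the interval vector [(5, 5), (7, 9)], and the closure also picks up object 4, whose values fall inside both intervals.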
8-Valent Fuzzy Logic for Iris Recognition and Biometry
Popescu-Bodorin, N., Balas, V. E., Motoc, I. M.
This paper shows that maintaining logical consistency of an iris recognition system is a matter of finding a suitable partitioning of the input space into enrollable and unenrollable pairs by negotiating the user comfort and the safety of the biometric system. In other words, consistent enrollment is mandatory in order to preserve system consistency. A fuzzy 3-valued disambiguated model of iris recognition is proposed and analyzed in terms of completeness, consistency, user comfort and biometric safety. It is also shown here that the fuzzy 3-valued model of iris recognition is hosted by an 8-valued Boolean algebra of modulo 8 integers that represents the computational formalization in which a biometric system (a software agent) can achieve the artificial understanding of iris recognition in a logically consistent manner.
Defining the Complexity of an Activity
Sahaf, Yasamin (Washington State University) | Krishnan, Narayanan Chatapuram (Washington State University) | Cook, Diane J. (Washington State University)
Activity recognition is a widely researched area with applications in health care, security and other domains. With each recognition system considering its own set of activities and sensors, it is difficult to compare the performance of these different systems and, more importantly, it makes the task of selecting an appropriate set of technologies and tools for recognizing an activity challenging. In this work-in-progress paper we attempt to characterize activities in terms of a complexity measure. We define activity complexity along three dimensions (sensing, computation and performance) and illustrate the parameters that characterize these dimensions. We look at grammars for representing activities and use grammar complexity as a measurement for activity complexity. Then we describe how these measurements can help evaluate the complexity of activities of daily living that are commonly considered by various researchers.
Deriving a Web-Scale Common Sense Fact Database
Tandon, Niket (Max Planck Institute for Informatics) | Melo, Gerard de (Max Planck Institute for Informatics) | Weikum, Gerhard (Max Planck Institute for Informatics)
The fact that birds have feathers and ice is cold seems trivially true. Yet, most machine-readable sources of knowledge either lack such common sense facts entirely or have only limited coverage. Prior work on automated knowledge base construction has largely focused on relations between named entities and on taxonomic knowledge, while disregarding common sense properties. In this paper, we show how to gather large amounts of common sense facts from Web n-gram data, using seeds from the ConceptNet collection. Our novel contributions include scalable methods for tapping into Web-scale data and a new scoring model to determine which patterns and facts are most reliable. The experimental results show that this approach extends ConceptNet by many orders of magnitude at comparable levels of precision.
Learning Driving Behavior by Timed Syntactic Pattern Recognition
Verwer, Sicco (Katholieke Universiteit Leuven) | Weerdt, Mathijs de (Delft University of Technology) | Witteveen, Cees (Delft University of Technology)
We advocate the use of an explicit time representation in syntactic pattern recognition because it can result in more succinct models and easier learning problems. We apply this approach to the real-world problem of learning models for the driving behavior of truck drivers. We discretize the values of onboard sensors into simple events. The data at our disposal consists of onboard sensor measurements that have been collected from truck round-trips. By applying a simple discretization method, we obtain sequences of timed events. The behavior that is displayed in these sequences is unknown. From this data, we want to learn a model that we can use to monitor the driving behavior in new data, i.e., to use it as a classifier. Our approach is to first learn a timed model from the unlabeled sequences using the ...
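The discretization of sensor values into timed events can be pictured with a simple threshold rule. The sketch below is an assumption about what such a step might look like, not the authors' method; the function `to_timed_events`, the threshold of 80 and the speed samples are hypothetical.

```python
def to_timed_events(samples, threshold):
    """Turn (timestamp, value) samples into (symbol, delay) timed events.

    An event is emitted whenever the value crosses the threshold; the delay is
    the time elapsed since the previous event (or since the first sample)."""
    events = []
    if not samples:
        return events
    last_time, first_value = samples[0]
    state = first_value >= threshold
    for time, value in samples[1:]:
        new_state = value >= threshold
        if new_state != state:
            symbol = "up" if new_state else "down"
            events.append((symbol, time - last_time))
            last_time, state = time, new_state
    return events

# Hypothetical speed readings as (seconds, km/h) pairs.
speed = [(0, 10), (5, 40), (9, 85), (20, 90), (26, 60), (31, 82)]
events = to_timed_events(speed, threshold=80)
```

Each resulting pair carries a symbol and the time since the previous event, which is the kind of input a timed model learner would consume.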
Revisiting Numerical Pattern Mining with Formal Concept Analysis
Kaytoue, Mehdi (INRIA Nancy Grand Est - LORIA) | Kuznetsov, Sergei O. (Higher School of Economics - State University) | Napoli, Amedeo (CNRS)
We investigate the problem of mining numerical data with Formal Concept Analysis. The usual way is to use a scaling procedure (transforming numerical attributes into binary ones), leading either to a loss of information or of efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to directly work on numerical data in a more precise and efficient way. For that, the notions of closed patterns, generators and equivalent classes are revisited in the numerical context. Moreover, two original algorithms are proposed and tested in an evaluation involving real-world data, showing the quality of the present approach.
Context-Sensitive Diagnosis of Discrete-Event Systems
Lamperti, Gianfranco (University of Brescia) | Zanella, Marina (University of Brescia)
Since the seminal work of Sampath et al. in 1996, despite the subsequent flourishing of techniques on diagnosis of discrete-event systems (DESs), the basic notions of fault and diagnosis have remained conceptually unchanged. Faults are defined at component level and diagnoses incorporate the occurrences of component faults within system evolutions: diagnosis is context-free. As this approach may be unsatisfactory for a complex DES, whose topology is organized in a hierarchy of abstractions, we propose to define different diagnosis rules for different subsystems in the hierarchy. Relevant fault patterns are specified as regular expressions on patterns of lower-level subsystems. Separation of concerns is achieved and the expressive power of diagnosis is enhanced: each subsystem has its proper set of diagnosis rules, which may or may not depend on the rules of other subsystems. Diagnosis is no longer anchored to components: it becomes context-sensitive. The approach yields seemingly contradictory but nonetheless possible scenarios: a subsystem can be normal despite the faulty behavior of a number of its components (positive paradox); also, it can be faulty despite the normal behavior of all its components (negative paradox).
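The two paradoxes can be illustrated with a toy subsystem rule written as a regular expression over component-level events. Everything in the sketch is hypothetical: the event alphabet, the rule `SUBSYSTEM_FAULT` and the traces are assumptions chosen for illustration and are not taken from the paper.

```python
import re

# Hypothetical alphabet of component-level events:
#   "o" = breaker opens (normal), "c" = breaker closes (normal),
#   "x" = breaker fails to open (component fault)
# Hypothetical subsystem rule: the subsystem is faulty if two breakers fail
# within the same evolution, or if two breakers open back to back.
SUBSYSTEM_FAULT = re.compile(r"x.*x|oo")

def component_faulty(trace):
    """Context-free view: any component fault in the trace counts."""
    return "x" in trace

def subsystem_faulty(trace):
    """Context-sensitive view: only the subsystem's own fault pattern counts."""
    return SUBSYSTEM_FAULT.search(trace) is not None

positive_paradox = "oxc"  # one component fault, subsystem pattern not matched
negative_paradox = "ooc"  # no component fault, subsystem pattern matched
```

The trace "oxc" contains a component fault yet leaves the subsystem normal, and "ooc" contains only normal component events yet makes the subsystem faulty, mirroring the positive and negative paradoxes described above.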