Pattern Recognition
Learning and Detecting Patterns in Multi-Attributed Network Data
Levchuk, Georgiy (Aptima, Inc.) | Roberts, Jennifer (Aptima, Inc.) | Freeman, Jared (Aptima, Inc.)
Network analysis is a growing field across many domains, including computer vision, social media marketing, transportation networks, and intelligence analysis. The growing use of digital communication devices and platforms, as well as persistent surveillance sensors, has resulted in an explosion in the quantity of data and has stretched the abilities of current technologies to process this data and draw meaningful conclusions. Current tools either require significant levels of manual intervention (e.g., to prepare the data, to define patterns, or to draw conclusions from data) or are unable to generalize to new data sources and analysis needs. In this paper, we present automated solutions to two major problems in network analysis: (a) finding patterns in network data that contain high levels of noise and irrelevant information; and (b) learning repetitive patterns and dependencies between entities and attributes. Our modeling framework represents network data using multi-attributed graphs that can encode various discrete and continuous features and relationships between network entities. The pattern search and learning model is based on probabilistic multi-attributed graph matching and is implemented using distributed message passing algorithms. Our algorithms achieved high accuracy rates in learning and finding patterns in the data, are flexible to new domains and data types, and scale to large datasets using the Map-Reduce framework.
A General Methodology for the Determination of 2D Bodies Elastic Deformation Invariants. Application to the Automatic Identification of Parasites
Arabadjis, Dimitris, Rousopoulos, Panayiotis, Papaodysseus, Constantin, Panagopoulos, Michalis, Loumou, Panayiota, Theodoropoulos, Georgios
A novel methodology is introduced here that exploits 2D images of arbitrary elastic body deformation instances, so as to quantify mechano-elastic characteristics that are deformation invariant. Determination of such characteristics allows for developing methods offering an image of the undeformed body. General assumptions about the mechano-elastic properties of the bodies are stated, which lead to two different approaches for obtaining the bodies' deformation invariants. One was developed to spot the deformed body's neutral line and its cross sections, while the other solves deformation PDEs by performing a set of equivalent image operations on the deformed body images. Both these processes may furnish an undeformed version of a body from its deformed image. This was confirmed by obtaining the undeformed shape of deformed parasites, cells (protozoa), fibers and human lips. In addition, the method has been applied to the important problem of automatic parasite classification from microscopic images. To achieve this, we first apply the previous method to straighten the highly deformed parasites and then we apply a dedicated curve classification method to the straightened parasite contours. It is demonstrated that essentially different deformations of the same parasite give rise to practically the same undeformed shape, thus confirming the consistency of the introduced methodology. Finally, the developed pattern recognition method classifies the unwrapped parasites into 6 families, with an accuracy rate of 97.6%. Index Terms: deformation invariant elastic properties, automatic curve classification, parasite automatic identification, straightening deformed objects, image analysis, elastic deformation, pattern classification techniques.
In these cases, one frequently encounters two important problems: a) to make a consistent and reliable estimation of the body's undeformed shape from images of random instances of body deformation and b) to identify the deformed body from these images. We would like to emphasize that, as a rule, identification of bodies on the basis of images of their deformation is practically prohibited by the randomness of the deformation.
Mining Permission Request Patterns from Android and Facebook Applications (extended author version)
Frank, Mario, Dong, Ben, Felt, Adrienne Porter, Song, Dawn
Android and Facebook provide third-party applications with access to users' private data and the ability to perform potentially sensitive operations (e.g., post to a user's wall or place phone calls). As a security measure, these platforms restrict applications' privileges with permission systems: users must approve the permissions requested by applications before the applications can make privacy- or security-relevant API calls. However, recent studies have shown that users often do not understand permission requests and lack a notion of typicality of requests. As a first step towards simplifying permission systems, we cluster a corpus of 188,389 Android applications and 27,029 Facebook applications to find patterns in permission requests. Using a method for Boolean matrix factorization for finding overlapping clusters, we find that Facebook permission requests follow a clear structure that exhibits high stability when fitted with only five clusters, whereas Android applications demonstrate more complex permission requests. We also find that low-reputation applications often deviate from the permission request patterns that we identified for high-reputation applications, suggesting that permission request patterns are indicative of user satisfaction or application quality.
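To make the overlapping-cluster idea concrete, the following minimal Python sketch shows how a Boolean matrix product reconstructs an application-by-permission matrix from cluster assignments. It illustrates only the model form, not the authors' fitting method; the permission names, the matrices Z, U and X, and the helper names are hypothetical toy values chosen for illustration.

```python
def boolean_product(z, u):
    """Boolean matrix product: x[i][j] = OR over k of (z[i][k] AND u[k][j])."""
    n_clusters = len(u)
    n_cols = len(u[0])
    return [
        [int(any(row[k] and u[k][j] for k in range(n_clusters)))
         for j in range(n_cols)]
        for row in z
    ]


def hamming_error(x, x_hat):
    """Number of cells where the reconstruction disagrees with the data."""
    return sum(a != b for ra, rb in zip(x, x_hat) for a, b in zip(ra, rb))


# Hypothetical toy data (not from the paper): 4 apps, 4 permissions.
PERMISSIONS = ["internet", "location", "contacts", "camera"]

# U: cluster -> permissions requested by that cluster.
U = [
    [1, 1, 0, 0],  # cluster 0: internet + location
    [0, 0, 1, 1],  # cluster 1: contacts + camera
]

# Z: app -> clusters; an app may belong to several clusters (overlap).
Z = [
    [1, 0],
    [0, 1],
    [1, 1],  # app 2 belongs to both clusters
    [0, 0],  # app 3 belongs to no cluster
]

# X: observed permission requests.
X = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],  # requests camera only, which no cluster explains
]

X_hat = boolean_product(Z, U)
error = hamming_error(X, X_hat)
print(X_hat, error)
```

In this toy setting, an application whose requests are poorly reconstructed (a high per-row error) is one that deviates from the cluster patterns.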
Distributed High Dimensional Information Theoretical Image Registration via Random Projections
Szabo, Zoltan, Lorincz, Andras
However, the estimation of these quantities is computationally intensive in high dimensions. On the other hand, consistent estimation from pairwise distances of the sample points is possible, which suits random projection (RP) based low dimensional embeddings. We adapt the RP technique to this task by means of a simple ensemble method. To the best of our knowledge, this is the first distributed, RP based information theoretical image registration approach. The efficiency of the method is demonstrated through numerical examples.
Keywords: random projection, information theoretical image registration, high dimensional features, distributed solution
1. Introduction
Machine learning methods are notoriously limited by the high dimensional nature of the data. This problem may be alleviated via the random projection (RP) technique, which has been successfully applied, e.g., in the fields of
Examples of Artificial Perceptions in Optical Character Recognition and Iris Recognition
Noaica, Cristina M., Badea, Robert, Motoc, Iulia M., Ghica, Claudiu G., Rosoiu, Alin C., Popescu-Bodorin, Nicolaie
This paper assumes the hypothesis that human learning is perception based, and consequently, that the learning process and perceptions should not be represented and investigated independently or modeled in different simulation spaces. In order to keep the analogy between artificial and human learning, the former is assumed here to be based on artificial perception. Hence, instead of choosing to apply or develop a Computational Theory of (human) Perceptions, we choose to mirror human perceptions in a numeric (computational) space as artificial perceptions and to analyze the interdependence between artificial learning and artificial perception in the same numeric space, using one of the simplest tools of Artificial Intelligence and Soft Computing, namely perceptrons. As practical applications, we choose to work around two examples: Optical Character Recognition and Iris Recognition. In both cases a simple Turing test shows that artificial perceptions of the difference between two characters and between two irides are fuzzy, whereas the corresponding human perceptions are, in fact, crisp.
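As a reminder of the tool named above, here is a minimal Python sketch of the classic perceptron learning rule applied to two toy 3x3 binary glyphs. It is not the authors' experimental setup: the glyph bitmaps, the labels, and the function names are hypothetical and chosen only to show a perceptron separating two characters.

```python
def score(weights, bias, x):
    """Raw (real-valued) perceptron activation for input vector x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    """Crisp decision: +1 or -1 depending on the sign of the activation."""
    return 1 if score(weights, bias, x) > 0 else -1


def train_perceptron(samples, epochs=10, lr=1.0):
    """Classic perceptron rule: update only on misclassified samples."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            if y * score(weights, bias, x) <= 0:
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


# Hypothetical 3x3 glyphs, flattened row by row (1 = ink, 0 = background).
GLYPH_I = [0, 1, 0,
           0, 1, 0,
           0, 1, 0]
GLYPH_L = [1, 0, 0,
           1, 0, 0,
           1, 1, 1]

samples = [(GLYPH_I, 1), (GLYPH_L, -1)]
weights, bias = train_perceptron(samples)
print(predict(weights, bias, GLYPH_I), predict(weights, bias, GLYPH_L))
```

The raw activation returned by `score` is a graded value, while `predict` thresholds it into a crisp label; the two functions are kept separate so that both views of the same input can be inspected.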
Towards Bridging the Gap Between Pattern Recognition and Symbolic Representation Within Neural Networks
Achler, Tsvi (Los Alamos National Labs)
Underlying symbolic representations are opaque within neural networks that perform pattern recognition. Neural network weights are sub-symbolic: they commonly do not have direct symbolic correlates. This work shows that by implementing network dynamics differently, during the testing phase instead of the training phase, pattern recognition can be performed using symbolically relevant weights. This advancement is an important step towards the merging of neural-symbolic representation, memory, and reasoning with pattern recognition.
Modeling Images using Transformed Indian Buffet Processes
Zhai, Ke, Hu, Yuening, Williamson, Sinead, Boyd-Graber, Jordan
Latent feature models are attractive for image modeling, since images generally contain multiple objects. However, many latent feature models ignore that objects can appear at different locations or require pre-segmentation of images. While the transformed Indian buffet process (tIBP) provides a method for modeling transformation-invariant features in unsegmented binary images, its current form is inappropriate for real images because of its computational cost and modeling assumptions. We combine the tIBP with likelihoods appropriate for real images and develop an efficient inference method, using the cross-correlation between images and features, that is theoretically and empirically faster than existing inference techniques. Our method discovers reasonable components and achieves effective image reconstruction in natural images.
Using Frequent Pattern Mining To Identify Behaviors In A Naked Mole Rat Colony
Imberman, Susan P. (College of Staten Island, Graduate Center, City University of New York) | Kress, Michael E. (College of Staten Island, Graduate Center, City University of New York) | McCloskey, Dan P. (College of Staten Island, CSI/IBR Center for Developmental Neuroscience)
Animal behavior analysis has, in the past, taken a very low tech approach, with direct observer surveillance and automated video surveillance as the norm. These methods are insufficient when one wants to study interactions between large numbers of animals in their housing environment. In this paper we use a housing environment that has been equipped with a system of RFID sensors. RFID transponders were implanted into the study animal, the naked mole rat. The resulting data was analyzed using principal component analysis and frequent pattern mining. Results showed that these methods can distinguish time periods of high behavioral activity from those of low activity, and can identify which groups of animals interacted with one another.
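The frequent pattern mining step can be illustrated with a minimal Python sketch that counts which groups of animals are detected together often enough. This is a brute-force enumeration rather than an optimized algorithm such as Apriori or FP-growth, and it is not the authors' pipeline; the animal IDs, the time windows, and the support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return itemsets (as sorted tuples) seen in at least min_support transactions.

    Brute force: every subset of up to max_size items is counted per transaction.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


# Hypothetical data: each set lists the animals detected at the same
# sensor within one time window.
windows = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"C", "D"},
    {"A", "C"},
]

result = frequent_itemsets(windows, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With these toy windows, the only frequent group of more than one animal is the pair A and B, which co-occurs in three of the five windows.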
Learning in Riemannian Orbifolds
Jain, Brijnesh J., Obermayer, Klaus
Statistical data analysis and learning in Riemannian orbifolds is motivated by applications where the data we want to learn on are naturally represented by finite combinatorial structures such as point patterns, trees, and graphs. Examples from structural pattern recognition that learn on structured data include estimating central points of a distribution on graphs such as the mean and median [9, 16, 15, 21], central clustering of graphs [10, 12, 13, 14, 19, 15, 23], learning graph quantization [17], and multilayer perceptrons for graphs [20]. In retrospect, the structure space framework proposed by [18] theoretically justifies the above approaches in the sense that they actually minimize an empirical risk function on structures. Since minimizing an empirical risk function is usually computationally intractable, the ultimate challenge consists in constructing efficient algorithms that are capable of returning optimal or at least suboptimal solutions. From the point of view of statistical pattern recognition, however, the ultimate goal is not to determine a good solution of an empirical risk function, but rather to discover the true but unknown structure of the data with respect to its distribution.
Mouse Simulation Using Two Coloured Tapes
Kumar, Vikram, Niyazi, Kamran, Mahe, Swapnil, Vyawahare, Swapnil
In this paper, we present a novel approach for Human Computer Interaction (HCI) in which we control cursor movement using a real-time camera. Current methods involve changing mouse parts, such as adding more buttons or changing the position of the tracking ball. Instead, our method uses a camera and computer vision technology, such as image segmentation and gesture recognition, to control mouse tasks (left and right clicking, double-clicking, and scrolling), and we show how it can perform everything that current mouse devices can. The software will be developed in the Java language. Recognition and pose estimation in this system are user independent and robust, as we will be using coloured tapes on our fingers to perform actions. The software can be used as an intuitive input interface to applications that require multi-dimensional control, e.g., computer games.
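The colour-based tracking idea can be sketched as a simple colour threshold followed by a centroid computation. The abstract states that the authors' software will be written in Java; the sketch below uses Python for brevity and is not their implementation. The frame contents, the target colour, the tolerance, and the function name are hypothetical.

```python
def colour_centroid(frame, target, tolerance):
    """Return the (x, y) centroid of pixels within `tolerance` of `target`.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    Returns None when no pixel matches.
    """
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if all(abs(c - t) <= tolerance for c, t in zip(pixel, target)):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Hypothetical 4x4 frame: black background with a 2x2 patch of red tape.
BLACK, RED = (0, 0, 0), (250, 10, 10)
frame = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, RED,   RED,   BLACK],
    [BLACK, RED,   RED,   BLACK],
    [BLACK, BLACK, BLACK, BLACK],
]

red_position = colour_centroid(frame, target=(255, 0, 0), tolerance=20)
blue_position = colour_centroid(frame, target=(0, 0, 255), tolerance=20)
print(red_position, blue_position)
```

In a two-tape setup, one centroid could drive the cursor position while the distance between the two centroids could signal a click; that mapping is an assumption here, since the abstract does not specify it.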