Embedding agents in business applications using enterprise integration patterns

arXiv.org Artificial Intelligence

This paper addresses the issue of integrating agents with a variety of external resources and services, as found in enterprise computing environments. We propose an approach for interfacing agents and existing message routing and mediation engines based on the endpoint concept from the enterprise integration patterns of Hohpe and Woolf. A design for agent endpoints is presented, and an architecture for connecting the Jason agent platform to the Apache Camel enterprise integration framework using this type of endpoint is described. The approach is illustrated by means of a business process use case, and a number of Camel routes are presented. These demonstrate the benefits of interfacing agents to external services via a specialised message routing tool that supports enterprise integration patterns.


Feature Selection for Microarray Gene Expression Data using Simulated Annealing guided by the Multivariate Joint Entropy

arXiv.org Machine Learning

In cancer diagnosis, classification of the different tumor types is of great importance. An accurate prediction of the tumor type enables better treatment and minimizes toxicity for patients. Traditional methods of tackling this situation are primarily based on morphological characteristics of tumorous tissue [1]. These conventional methods are reported to have several diagnostic limitations. In order to analyze the problem of cancer classification using gene expression data, more systematic approaches have been developed [2]. Pioneering work in cancer classification by gene expression using DNA microarrays showed that Machine Learning or, more generally, Data Mining methods can help the diagnosis [3], and these methods are now extensively used for this task [4]. However, in this setting gene expression data analysis consumes heavy computational resources, due to the extreme sparseness compared to standard data sets in classification tasks [5]. Typically, a gene expression data set may consist of dozens of observations but thousands or even tens of thousands of genes.


Possible and Necessary Winner Problem in Social Polls

arXiv.org Artificial Intelligence

Social networks are increasingly being used to conduct polls. We introduce a simple model of such social polling. We suppose agents vote sequentially, but the order in which agents choose to vote is not necessarily fixed. We also suppose that an agent's vote is influenced by the votes of their friends who have already voted. Despite its simplicity, this model provides useful insights into a number of areas including social polling, sequential voting, and manipulation. We prove that the number of candidates and the network structure affect the computational complexity of computing which candidate necessarily or possibly can win in such a social poll. For social networks with bounded treewidth and a bounded number of candidates, we provide polynomial-time algorithms for both problems. In other cases, we prove that computing which candidates necessarily or possibly win is computationally intractable.


Constraint Propagation as Information Maximization

arXiv.org Artificial Intelligence

This paper draws on diverse areas of computer science to develop a unified view of computation: optimization in operations research, where a numerical objective function is maximized under constraints, is generalized from the numerical total order to a non-numerical partial order that can be interpreted in terms of information. The distinction is essential in our definition of constraint satisfaction problems. As an application we treat constraint satisfaction problems over reals. The chaotic algorithm analyzed in the paper combines the efficiency of floating-point computation with the correctness guarantees arising from our logico-mathematical model of constraint-satisfaction problems. The early history of constraint processing is written in three MIT theses: Sutherland's, Waltz's, and Steele's [16, 20, 14]. Already in this small selection one can discern two radically different approaches. Sutherland and Steele use relaxation: starting from a guessed assignment of values to variables, constraints are successively used to adjust variables in such a way as to better satisfy the constraint under consideration. These authors followed an old idea brought into prominence under the name of relaxation by Southwell [15]. Waltz, by contrast, associated with each of the problem's variables a domain; that is, the set of all values that are not a priori impossible. Each constraint is then used to eliminate values from the domains of one or more variables affected by the constraint that are incompatible with that constraint. In this paper we are concerned with the latter method, which we call the domain reduction method.
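The domain reduction idea described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes interval domains given as `(low, high)` pairs, supports only constraints of the form `a + b = c`, and uses hypothetical helper names (`intersect`, `reduce_sum`, `propagate`). It also ignores the outward rounding that a correct floating-point implementation would need, so the example uses integer bounds.

```python
def intersect(a, b):
    """Intersect two closed intervals given as (low, high) pairs."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("empty domain: the constraints are unsatisfiable")
    return (lo, hi)


def reduce_sum(x, y, z):
    """Shrink the domains of x, y, z using the constraint x + y = z.

    Values that cannot take part in any solution of the constraint
    are removed; no solution is ever lost.
    """
    z = intersect(z, (x[0] + y[0], x[1] + y[1]))
    x = intersect(x, (z[0] - y[1], z[1] - y[0]))
    y = intersect(y, (z[0] - x[1], z[1] - x[0]))
    return x, y, z


def propagate(domains, constraints, max_rounds=1000):
    """Apply each constraint repeatedly until no domain changes.

    `constraints` is a list of (a, b, c) variable-name triples, each
    meaning a + b = c. `max_rounds` is an illustrative safety cap,
    since narrowing over the reals may converge only in the limit.
    """
    domains = dict(domains)
    for _ in range(max_rounds):
        changed = False
        for a, b, c in constraints:
            new = reduce_sum(domains[a], domains[b], domains[c])
            for name, dom in zip((a, b, c), new):
                if dom != domains[name]:
                    domains[name] = dom
                    changed = True
        if not changed:
            break
    return domains


start = {"x": (0, 10), "y": (0, 10), "z": (15, 20), "w": (0, 6)}
result = propagate(start, [("x", "y", "z"), ("x", "w", "y")])
print(result)
```

In this example the constraint `x + y = z` forces both `x` and `y` up to at least 5, and that narrowing then lets `x + w = y` cap `w` at 5. The order in which constraints are applied does not affect the fixpoint reached here.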


Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks

arXiv.org Artificial Intelligence

Deep Max-Pooling Convolutional Neural Networks are Deep Neural Networks (DNN) with convolutional and max-pooling layers. Convolutional Neural Networks (CNN) can be traced back to the Neocognitron [1] in 1980. They were first successfully applied to relatively small tasks such as digit recognition [2], image interpretation [3] and object recognition [4]. Back then their size was greatly limited by the low computational power of available hardware. Since 2010, however, DNN have greatly profited from Graphics Processing Units (GPU). Simple GPU-based multilayer perceptrons (MLP) established new state-of-the-art results [5] on the MNIST handwritten digit dataset [4] when made both deep and large (augmenting the training set by artificial samples helped to avoid overfitting).


Probabilistic Acceptance

arXiv.org Artificial Intelligence

The idea of fully accepting statements when the evidence has rendered them probable enough faces a number of difficulties. We leave the interpretation of probability largely open, but attempt to suggest a contextual approach to full belief. We show that the difficulties of probabilistic acceptance are not as severe as they are sometimes painted, and that though there are oddities associated with probabilistic acceptance they are in some instances less awkward than the difficulties associated with other nonmonotonic formalisms. We show that the structure at which we arrive provides a natural home for statistical inference.


Support and Plausibility Degrees in Generalized Functional Models

arXiv.org Artificial Intelligence

By discussing several examples, the theory of generalized functional models is shown to be very natural for modeling some situations of reasoning under uncertainty. A generalized functional model is a pair (f, P) where f is a function describing the interactions between a parameter variable, an observation variable and a random source, and P is a probability distribution for the random source. Unlike traditional functional models, generalized functional models do not require that there is only one value of the parameter variable that is compatible with an observation and a realization of the random source. As a consequence, the results of the analysis of a generalized functional model are not expressed in terms of probability distributions but rather by support and plausibility functions. The analysis of a generalized functional model is very logical and is inspired by ideas already put forward by R.A. Fisher in his theory of fiducial probability.


An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering

arXiv.org Machine Learning

Assignment methods are at the heart of many algorithms for unsupervised learning and clustering, in particular the well-known K-means and Expectation-Maximization (EM) algorithms. In this work, we study several different methods of assignment, including the "hard" assignments used by K-means and the "soft" assignments used by EM. While it is known that K-means minimizes the distortion on the data and EM maximizes the likelihood, little is known about the systematic differences of behavior between the two algorithms. Here we shed light on these differences via an information-theoretic analysis. The cornerstone of our results is a simple decomposition of the expected distortion, showing that K-means (and its extension for inferring general parametric densities from unlabeled sample data) must implicitly manage a trade-off between how similar the data assigned to each cluster are, and how the data are balanced among the clusters. How well the data are balanced is measured by the entropy of the partition defined by the hard assignments. In addition to letting us predict and verify systematic differences between K-means and EM on specific examples, the decomposition allows us to give a rather general argument showing that K-means will consistently find densities with less "overlap" than EM. We also study a third natural assignment method that we call posterior assignment, that is close in spirit to the soft assignments of EM, but leads to a surprisingly different algorithm.
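The difference between hard and soft assignment, and the partition entropy mentioned above, can be sketched in a few lines of Python. This is an illustration only, not the paper's analysis: it assumes one-dimensional data, equal-variance Gaussian components with equal mixing weights, and hypothetical function names (`hard_assign`, `soft_assign`, `partition_entropy`). Entropy is reported in nats.

```python
import math


def hard_assign(x, means):
    """K-means style: return the index of the nearest cluster mean."""
    return min(range(len(means)), key=lambda k: (x - means[k]) ** 2)


def soft_assign(x, means, sigma=1.0):
    """EM style: posterior probability of each component for point x.

    Assumes Gaussian components with a shared standard deviation
    `sigma` and equal mixing weights (an illustrative simplification).
    """
    scores = [math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) for m in means]
    total = sum(scores)
    return [s / total for s in scores]


def partition_entropy(points, means):
    """Entropy (in nats) of the partition induced by hard assignment."""
    counts = [0] * len(means)
    for x in points:
        counts[hard_assign(x, means)] += 1
    n = len(points)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


means = [0.0, 1.0]
points = [0.1, 0.2, 0.9, 1.1]

print(hard_assign(0.4, means))           # one cluster index
print(soft_assign(0.4, means))           # one probability per cluster
print(partition_entropy(points, means))  # balance of the hard partition
```

A point at 0.4 is assigned wholly to the first cluster by the hard rule, while the soft rule splits its weight between both clusters. The entropy is largest when the hard assignments spread the points evenly across clusters.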


Models and Selection Criteria for Regression and Classification

arXiv.org Machine Learning

When performing regression or classification, we are interested in the conditional probability distribution for an outcome or class variable Y given a set of explanatory or input variables X. We consider Bayesian models for this task. In particular, we examine a special class of models, which we call Bayesian regression/classification (BRC) models, that can be factored into independent conditional (y|x) and input (x) models. These models are convenient, because the conditional model (the portion of the full model that we care about) can be analyzed by itself. We examine the practice of transforming arbitrary Bayesian models to BRC models, and argue that this practice is often inappropriate because it ignores prior knowledge that may be important for learning. In addition, we examine Bayesian methods for learning models from data. We discuss two criteria for Bayesian model selection that are appropriate for regression/classification: one described by Spiegelhalter et al. (1993), and another by Buntine (1993). We contrast these two criteria using the prequential framework of Dawid (1984), and give sufficient conditions under which the criteria agree.


Update Rules for Parameter Estimation in Bayesian Networks

arXiv.org Machine Learning

This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [Kivinen & Warmuth, 1994]. We provide a unified framework for parameter estimation that encompasses both on-line learning, where the model is continuously adapted to new data cases as they arrive, and the more traditional batch learning, where a pre-accumulated set of samples is used in a one-time model selection process. In the batch case, our framework encompasses both the gradient projection algorithm and the EM algorithm for Bayesian networks. The framework also leads to new on-line and batch parameter update schemes, including a parameterized version of EM. We provide both empirical and theoretical results indicating that parameterized EM allows faster convergence to the maximum likelihood parameters than does standard EM.