Information Technology
Incorporating characteristics of human creativity into an evolutionary art algorithm
A perceived limitation of evolutionary art and design algorithms is that they rely on human intervention; the artist selects the most aesthetically pleasing variants of one generation to produce the next. This paper discusses how computer-generated art and design can become more creatively human-like with respect to both process and outcome. As an example of a step in this direction, we present an algorithm that overcomes the above limitation by employing an automatic fitness function. The goal is to evolve abstract portraits of Darwin using our second-generation fitness function, which rewards genomes that not only produce a likeness of Darwin but also exhibit certain strategies characteristic of human artists. We note that in human creativity, change is less a matter of choosing among randomly generated variants and more a matter of capitalizing on the associative structure of a conceptual network to home in on a vision. We discuss how to achieve this fluidity algorithmically.
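A minimal sketch, assuming a toy real-valued genome and a made-up target, of how an automatic fitness function can replace interactive selection in such an evolutionary loop (this is not the authors' system or fitness function):

```python
import random

# Toy sketch: an evolutionary loop driven by an automatic fitness function
# instead of interactive human selection. The genome, target, and fitness
# are illustrative placeholders, not the paper's representation.

TARGET = [0.2, 0.8, 0.5, 0.1]          # hypothetical target description

def fitness(genome):
    # Reward closeness to the target; a real system would score a rendered
    # image and could add terms rewarding artist-like strategies.
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.1):
    return [g + random.gauss(0, 0.05) if random.random() < rate else g
            for g in genome]

def evolve(pop_size=50, generations=200):
    population = [[random.random() for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]          # keep the fitter half
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve())
```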
Decisional Processes with Boolean Neural Network: the Emergence of Mental Schemes
Barnabei, Graziano, Bagnoli, Franco, Conversano, Ciro, Lensi, Elena
Human decisional processes result from the use of selected quantities of relevant information, generally synthesized from incoming environmental data and stored memories. Their main goal is the production of an appropriate and adaptive response to a cognitive or behavioral task. Different strategies of response production can be adopted, among which are haphazard trials, the formation of mental schemes, and heuristics. In this paper, we propose a Boolean neural network model that incorporates these strategies by resorting to global optimization strategies during the learning session. The model also characterizes the transition from an unstructured/chaotic attractor neural network, typical of data-driven processes, to a faster, forward-only one representative of schema-driven processes. Moreover, a simplified version of the Iowa Gambling Task (IGT) is introduced in order to test the model. Our results match experimental data and highlight relevant findings from the psychological domain.
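For concreteness, a toy sketch of an IGT-like environment with a simple reward-tracking chooser; the deck payoffs are illustrative rather than the paper's parameters, and the chooser is not the proposed Boolean neural network model:

```python
import random

# Toy IGT-like task: two "bad" decks (high gains, larger losses) and two
# "good" decks. The epsilon-greedy chooser mixes haphazard trials with
# exploitation of a learned value scheme; values are illustrative only.

DECKS = {
    "A": lambda: 100 - (250 if random.random() < 0.5 else 0),   # high gain, high loss
    "B": lambda: 100 - (1250 if random.random() < 0.1 else 0),
    "C": lambda: 50 - (50 if random.random() < 0.5 else 0),     # low gain, low loss
    "D": lambda: 50 - (250 if random.random() < 0.1 else 0),
}

def play(trials=200, epsilon=0.1):
    estimates = {d: 0.0 for d in DECKS}
    counts = {d: 0 for d in DECKS}
    total = 0
    for _ in range(trials):
        if random.random() < epsilon:                  # haphazard trial
            deck = random.choice(list(DECKS))
        else:                                          # exploit the current scheme
            deck = max(estimates, key=estimates.get)
        payoff = DECKS[deck]()
        counts[deck] += 1
        estimates[deck] += (payoff - estimates[deck]) / counts[deck]
        total += payoff
    return total, estimates

print(play())
```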
An Empirical Evaluation of Four Algorithms for Multi-Class Classification: Mart, ABC-Mart, Robust LogitBoost, and ABC-LogitBoost
This empirical study is mainly devoted to comparing four tree-based boosting algorithms: mart, abc-mart, robust logitboost, and abc-logitboost, for multi-class classification on a variety of publicly available datasets. Some of those datasets have been thoroughly tested in prior studies using a broad range of classification algorithms including SVM, neural nets, and deep learning. In terms of the empirical classification errors, our experimental results demonstrate: 1. Abc-mart considerably improves mart. 2. Abc-logitboost considerably improves (robust) logitboost. 3. (Robust) logitboost considerably improves mart on most datasets. 4. Abc-logitboost considerably improves abc-mart on most datasets. 5. These four boosting algorithms (especially abc-logitboost) outperform SVM on many datasets. 6. Compared to the best deep learning methods, these four boosting algorithms (especially abc-logitboost) are competitive.
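As a hedged illustration of the kind of tree-based boosting being compared (not the abc-mart or abc-logitboost implementations evaluated in the paper), scikit-learn's MART-style GradientBoostingClassifier can be run on a small multi-class dataset:

```python
# Illustration only: a MART-style boosted-tree classifier from scikit-learn,
# used as a stand-in for the algorithms compared in the paper.
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Typical tuning axes in such comparisons: number of boosting iterations,
# tree size, and shrinkage (learning rate).
clf = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                 learning_rate=0.1, random_state=0)
clf.fit(X_tr, y_tr)
print("multi-class test error:", 1 - accuracy_score(y_te, clf.predict(X_te)))
```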
Abstract Answer Set Solvers with Learning
Nieuwenhuis, Oliveras, and Tinelli (2006) showed how to describe enhancements of the Davis-Putnam-Logemann-Loveland algorithm using transition systems, instead of pseudocode. We design a similar framework for several algorithms that generate answer sets for logic programs: Smodels, Smodels-cc, Asp-Sat with Learning (Cmodels), and a newly designed and implemented algorithm Sup. This approach to describing answer set solvers makes it easier to prove their correctness, to compare them, and to design new systems.
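As a rough illustration of the transition-system style of description (not the paper's formal framework), the sketch below encodes the classical DPLL rules Unit Propagate, Decide, Backtrack, and Fail over a trail of literals; the answer set solvers covered in the paper add further rules, e.g. unfounded-set-based propagation and, with learning, Backjump and Learn.

```python
# Minimal DPLL as a transition system: the state is a trail of
# (literal, is_decision) pairs, and each loop iteration applies one rule.
# Atoms are positive integers; a clause is a list of literals.

def value(lit, trail):
    assigned = {l for l, _ in trail}
    if lit in assigned: return True
    if -lit in assigned: return False
    return None

def dpll(clauses, atoms):
    trail = []                                      # current state
    while True:
        # Conflict: some clause has all literals false.
        if any(all(value(l, trail) is False for l in c) for c in clauses):
            decisions = [i for i, (_, d) in enumerate(trail) if d]
            if not decisions:
                return None                         # rule Fail
            i = decisions[-1]
            lit = trail[i][0]
            trail = trail[:i] + [(-lit, False)]     # rule Backtrack
            continue
        # Rule Unit Propagate: a clause with one unassigned literal, rest false.
        unit = next((l for c in clauses for l in c
                     if value(l, trail) is None
                     and all(value(m, trail) is False for m in c if m != l)), None)
        if unit is not None:
            trail.append((unit, False))
            continue
        # Rule Decide, or stop when the assignment is total.
        free = next((a for a in atoms if value(a, trail) is None), None)
        if free is None:
            return [l for l, _ in trail]            # all clauses satisfied
        trail.append((free, True))

# Example: (a or b) and (not a or b)
print(dpll([[1, 2], [-1, 2]], [1, 2]))
```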
K-tree: Large Scale Document Clustering
De Vries, Christopher M., Geva, Shlomo
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk-based implementations when space requirements exceed the capacity of main memory.
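As a rough illustration only (the actual K-tree is built incrementally, in the style of a B+-tree, splitting nodes with k-means as vectors arrive), the sketch below builds a top-down k-means hierarchy and searches it by descending through nearest cluster means:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative hierarchy of cluster means built by recursive k-means;
# not the K-tree's online construction, but it conveys the tree-of-means idea.

def build_tree(vectors, branch=2, leaf_size=10):
    if len(vectors) <= leaf_size:
        return {"mean": vectors.mean(axis=0), "docs": vectors}
    km = KMeans(n_clusters=branch, n_init=10, random_state=0).fit(vectors)
    children = [build_tree(vectors[km.labels_ == c], branch, leaf_size)
                for c in range(branch)]
    return {"mean": vectors.mean(axis=0), "children": children}

def nearest_leaf(tree, query):
    # Descend by choosing the child whose mean is closest to the query vector.
    while "children" in tree:
        tree = min(tree["children"],
                   key=lambda c: np.linalg.norm(c["mean"] - query))
    return tree

docs = np.random.rand(200, 8)            # stand-in for document vectors
tree = build_tree(docs)
print(nearest_leaf(tree, np.random.rand(8))["docs"].shape)
```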
Graph Quantization
Jain, Brijnesh J., Obermayer, Klaus
Vector quantization (VQ) is a lossy data compression technique from signal processing, which is restricted to feature vectors and therefore inapplicable to combinatorial structures. This contribution presents a theoretical foundation of graph quantization (GQ) that extends VQ to the domain of attributed graphs. We present the necessary Lloyd-Max conditions for optimality of a graph quantizer and consistency results for optimal GQ design based on empirical distortion measures and stochastic optimization. These results statistically justify existing clustering algorithms in the domain of graphs. The proposed approach provides a template for linking structural pattern recognition methods other than GQ to statistical pattern recognition.
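For reference, the classical Lloyd-Max necessary conditions in generic VQ notation; the paper's GQ statement replaces vectors and Euclidean distance with attributed graphs and a graph distortion measure (the notation below is not necessarily the paper's):

```latex
% Codebook {y_1,...,y_k}, distortion d. Condition 1: minimum-distortion
% (nearest-neighbour) partition into cells R_i.
R_i = \{\, x \in X : d(x, y_i) \le d(x, y_j) \ \text{for all } j \ne i \,\}

% Condition 2 (centroid condition): each codeword minimizes the expected
% distortion over its cell.
y_i = \arg\min_{y} \; \mathbb{E}\!\left[ d(X, y) \mid X \in R_i \right]
```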
Document Clustering with K-tree
De Vries, Christopher M., Geva, Shlomo
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Document clustering poses many large-scale problems. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
The ILIUM forward modelling algorithm for multivariate parameter estimation and its application to derive stellar parameters from Gaia spectrophotometry
I introduce an algorithm for estimating parameters from multidimensional data based on forward modelling. In contrast to many machine learning approaches, it avoids fitting an inverse model and the problems associated with doing so. The algorithm makes explicit use of the sensitivities of the data to the parameters, with the goal of better treating parameters which have only a weak impact on the data. The forward modelling approach provides uncertainty (full covariance) estimates in the predicted parameters as well as a goodness-of-fit for observations. I demonstrate the algorithm, ILIUM, with the estimation of stellar astrophysical parameters (APs) from simulations of the low resolution spectrophotometry to be obtained by Gaia. The AP accuracy is competitive with that obtained by a support vector machine. For example, for zero extinction stars covering a wide range of metallicity, surface gravity and temperature, ILIUM can estimate Teff to an accuracy of 0.3% at G=15 and to 4% for (lower signal-to-noise ratio) spectra at G=20. [Fe/H] and logg can be estimated to accuracies of 0.1-0.4 dex for stars with G<=18.5. If extinction varies a priori over a wide range (Av = 0-10 mag), then Teff and Av can be estimated quite accurately (3-4% and 0.1-0.2 mag respectively at G=15), but there is a strong and ubiquitous degeneracy in these parameters which limits our ability to estimate either accurately at faint magnitudes. Using the forward model we can map these degeneracies (in advance), and thus provide a complete probability distribution over solutions. (Abridged)
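A generic Gauss-Newton sketch of forward-model parameter estimation that uses the sensitivities (Jacobian) of the model, in the spirit of the abstract; it is not the ILIUM algorithm itself, and the two-parameter toy forward model below is made up for illustration:

```python
import numpy as np

# Fit parameters by iteratively linearizing a forward model around the
# current estimate, using numerical sensitivities; the (J^T J)^{-1} term
# gives an (unscaled) covariance estimate for the fitted parameters.

GRID = np.linspace(0.0, 1.0, 5)                  # hypothetical wavelength grid

def forward(theta):
    # Toy forward model: amplitude * exp(-rate * wavelength).
    return theta[0] * np.exp(-theta[1] * GRID)

def sensitivities(theta, eps=1e-6):
    # Numerical derivatives of the prediction w.r.t. each parameter.
    base = forward(theta)
    J = np.empty((base.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (forward(theta + step) - base) / eps
    return J

def fit(observed, theta0, n_iter=20):
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        r = observed - forward(theta)                 # residual / goodness of fit
        J = sensitivities(theta)
        theta += np.linalg.solve(J.T @ J, J.T @ r)    # Gauss-Newton update
    J = sensitivities(theta)
    cov = np.linalg.inv(J.T @ J)                      # covariance up to noise scale
    return theta, cov

true = np.array([2.0, 2.0])
obs = forward(true) + 1e-3 * np.random.randn(GRID.size)
print(fit(obs, [1.0, 1.0]))
```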
Bayesian orthogonal component analysis for sparse representation
Dobigeon, Nicolas, Tourneret, Jean-Yves
This paper addresses the problem of identifying a lower dimensional space where observed data can be sparsely represented. This under-complete dictionary learning task can be formulated as a blind separation problem of sparse sources linearly mixed with an unknown orthogonal mixing matrix. This issue is formulated in a Bayesian framework. First, the unknown sparse sources are modeled as Bernoulli-Gaussian processes. To promote sparsity, a weighted mixture of an atom at zero and a Gaussian distribution is proposed as the prior distribution for the unobserved sources. A non-informative prior distribution defined on an appropriate Stiefel manifold is selected for the mixing matrix. The Bayesian inference on the unknown parameters is conducted using a Markov chain Monte Carlo (MCMC) method. A partially collapsed Gibbs sampler is designed to generate samples asymptotically distributed according to the joint posterior distribution of the unknown model parameters and hyperparameters. These samples are then used to approximate the joint maximum a posteriori estimator of the sources and mixing matrix. Simulations conducted on synthetic data are reported to illustrate the performance of the method for recovering sparse representations. An application to sparse coding with an under-complete dictionary is finally investigated.
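In generic notation (not necessarily the paper's symbols), the Bernoulli-Gaussian "spike-and-slab" prior described above takes the form:

```latex
% w is the mixture weight (probability of a non-zero source), \delta the
% Dirac atom at zero, and \sigma^2 the variance of the Gaussian slab.
p(s_i \mid w, \sigma^2) \;=\; (1 - w)\,\delta(s_i) \;+\; w\,\mathcal{N}\!\left(s_i \mid 0, \sigma^2\right)
```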
Learning Gaussian Tree Models: Analysis of Error Exponents and Extremal Structures
Tan, Vincent Y. F., Anandkumar, Animashree, Willsky, Alan S.
The problem of learning tree-structured Gaussian graphical models from independent and identically distributed (i.i.d.) samples is considered. The influence of the tree structure and the parameters of the Gaussian distribution on the learning rate as the number of samples increases is discussed. Specifically, the error exponent corresponding to the event that the estimated tree structure differs from the actual unknown tree structure of the distribution is analyzed. Finding the error exponent reduces to a least-squares problem in the very noisy learning regime. In this regime, it is shown that the extremal tree structure that minimizes the error exponent is the star for any fixed set of correlation coefficients on the edges of the tree. If the magnitudes of all the correlation coefficients are less than 0.63, it is also shown that the tree structure that maximizes the error exponent is the Markov chain. In other words, the star and the chain graphs represent the hardest and the easiest structures to learn in the class of tree-structured Gaussian graphical models. This result can also be intuitively explained by correlation decay: pairs of nodes which are far apart, in terms of graph distance, are unlikely to be mistaken as edges by the maximum-likelihood estimator in the asymptotic regime.
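In generic large-deviations notation (not necessarily the paper's symbols), the quantity analyzed is the exponential rate at which the probability of recovering a wrong tree structure decays with the number of samples n:

```latex
% T_P is the true tree structure of the distribution P and \hat{T}_{ML} the
% structure returned by the maximum-likelihood estimator from n samples.
K_P \;=\; \lim_{n \to \infty} -\frac{1}{n} \log \Pr\!\left[\, \hat{T}_{\mathrm{ML}}(x_1,\dots,x_n) \neq T_P \,\right]
```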