Genre
Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes
I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful. Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional sense of Boltzmann and Shannon but in the sense that it allows for compression progress because its regularity was not yet known. This drive maximizes interestingness, the first derivative of subjective beauty or compressibility, that is, the steepness of the learning curve. It motivates exploring infants, pure mathematicians, composers, artists, dancers, comedians, yourself, and (since 1990) artificial systems.
An Investigation Report on Auction Mechanism Design
Auctions are markets with strict regulations governing the information available to traders in the market and the possible actions they can take. Since well designed auctions achieve desirable economic outcomes, they have been widely used in solving real-world optimization problems, and in structuring stock or futures exchanges. Auctions also provide a very valuable testing-ground for economic theory, and they play an important role in computer-based control systems. Auction mechanism design aims to manipulate the rules of an auction in order to achieve specific goals. Economists traditionally use mathematical methods, mainly game theory, to analyze auctions and design new auction forms. However, due to the high complexity of auctions, the mathematical models are typically simplified to obtain results, and this makes it difficult to apply results derived from such models to market environments in the real world. As a result, researchers are turning to empirical approaches. This report aims to survey the theoretical and empirical approaches to designing auction mechanisms and trading strategies with more weights on empirical ones, and build the foundation for further research in the field.
KiWi: A Scalable Subspace Clustering Algorithm for Gene Expression Analysis
Griffith, Obi L., Gao, Byron J., Bilenky, Mikhail, Prichyna, Yuliya, Ester, Martin, Jones, Steven J. M.
Numerous studies have used coexpression of large expression datasets to infer functional associations between genes [1], to identify groups of related genes that are important in specific cancers or represent common tumour progression mechanisms [2], to study evolutionary change [3], for integration with other large-scale datasets [4][5], [6], and for the generation of high-quality biological interaction networks [7][8][9] [10]. A number of studies have also attempted to use coexpression to identify coregulation with the hypothesis that if two or more genes are expressed at the same time and location and at similar levels then they may be regulated by the same transcription factors and regulatory elements. This approach has shown promise particularly in simpler model organisms such as A. thaliana and S. cerevisiae [11] [12][13] [14] and many groups are currently working on implementing this idea in mammalian systems. However, traditional clustering methods have not worked particularly well on large datasets for this problem. Most methods assign each gene to only one cluster while in reality many genes likely take part in multiple processes. Also, global coexpression is measured across all conditions, whereas, it is probable that most genes are only tightly coregulated under certain conditions or locations. In recent years, a new field of clustering analysis termed subspace clustering (or biclustering) has gained increasing popularity in the analysis of gene expression data and other biological data [15][16][17][18] [19]. In contrast to traditional clustering methods such as hierarchical clustering, subspace clustering methods do not require expression to be correlated across all conditions for genes to be assigned to the same cluster. This has several advantages for data in which biologically relevant subsets exist (e.g.
A Fast Algorithm and Datalog Inexpressibility for Temporal Reasoning
We introduce a new tractable temporal constraint language, which strictly contains the Ord-Horn language of Buerkert and Nebel and the class of AND/OR precedence constraints. The algorithm we present for this language decides whether a given set of constraints is consistent in time that is quadratic in the input size. We also prove that (unlike Ord-Horn) this language cannot be solved by Datalog or by establishing local consistency.
The Redundancy of a Computable Code on a Noncomputable Distribution
We introduce new definitions of universal and superuniversal computable codes, which are based on a code's ability to approximate Kolmogorov complexity within the prescribed margin for all individual sequences from a given set. Such sets of sequences may be singled out almost surely with respect to certain probability measures. Consider a measure parameterized with a real parameter and put an arbitrary prior on the parameter. The Bayesian measure is the expectation of the parameterized measure with respect to the prior. It appears that a modified Shannon-Fano code for any computable Bayesian measure, which we call the Bayesian code, is superuniversal on a set of parameterized measure-almost all sequences for prior-almost every parameter. According to this result, in the typical setting of mathematical statistics no computable code enjoys redundancy which is ultimately much less than that of the Bayesian code. Thus we introduce another characteristic of computable codes: The catch-up time is the length of data for which the code length drops below the Kolmogorov complexity plus the prescribed margin. Some codes may have smaller catch-up times than Bayesian codes.
CP-logic: A Language of Causal Probabilistic Events and Its Relation to Logic Programming
Vennekens, Joost, Denecker, Marc, Bruynooghe, Maurice
This papers develops a logical language for representing probabilistic causal laws. Our interest in such a language is twofold. First, it can be motivated as a fundamental study of the representation of causal knowledge. Causality has an inherent dynamic aspect, which has been studied at the semantical level by Shafer in his framework of probability trees. In such a dynamic context, where the evolution of a domain over time is considered, the idea of a causal law as something which guides this evolution is quite natural. In our formalization, a set of probabilistic causal laws can be used to represent a class of probability trees in a concise, flexible and modular way. In this way, our work extends Shafer's by offering a convenient logical representation for his semantical objects. Second, this language also has relevance for the area of probabilistic logic programming. In particular, we prove that the formal semantics of a theory in our language can be equivalently defined as a probability distribution over the well-founded models of certain logic programs, rendering it formally quite similar to existing languages such as ICL or PRISM. Because we can motivate and explain our language in a completely self-contained way as a representation of probabilistic causal laws, this provides a new way of explaining the intuitions behind such probabilistic logic programs: we can say precisely which knowledge such a program expresses, in terms that are equally understandable by a non-logician. Moreover, we also obtain an additional piece of knowledge representation methodology for probabilistic logic programs, by showing how they can express probabilistic causal laws.
Intent expression using eye robot for mascot robot system
Yamazaki, Yoichi, Dong, Fangyan, Masuda, Yuta, Uehara, Yukiko, Kormushev, Petar, Vu, Hai An, Le, Phuc Quang, Hirota, Kaoru
An intent expression system using eye robots is proposed for a mascot robot system from a viewpoint of humatronics. The eye robot aims at providing a basic interface method for an information terminal robot system. To achieve better understanding of the displayed information, the importance and the degree of certainty of the information should be communicated along with the main content. The proposed intent expression system aims at conveying this additional information using the eye robot system. Eye motions are represented as the states in a pleasure-arousal space model. Changes in the model state are calculated by fuzzy inference according to the importance and degree of certainty of the displayed information. These changes influence the arousal-sleep coordinates in the space that corresponds to levels of liveliness during communication. The eye robot provides a basic interface for the mascot robot system that is easy to be understood as an information terminal for home environments in a humatronics society.
Fuzzy inference based mentality estimation for eye robot agent
Yamazaki, Yoichi, Dong, Fangyan, Masuda, Yuta, Uehara, Yukiko, Kormushev, Petar, Vu, Hai An, Le, Phuc Quang, Hirota, Kaoru
Household robots need to communicate with human beings in a friendly fashion. To achieve better understanding of displayed information, an importance and a certainty of the information should be communicated together with the main information. The proposed intent expression system aims to convey this additional information using an eye robot. The eye motions are represented as states in a pleasure-arousal space model. Change of the model state is calculated by fuzzy inference according to the importance and certainty of the displayed information. This change influences the arousal-sleep coordinate in the space which corresponds to activeness in communication. The eye robot provides a basic interface for the mascot robot system which is an easy to understand information terminal for home environments in a humatronics society.
Online prediction of ovarian cancer
Zhdanov, Fedor, Vovk, Vladimir, Burford, Brian, Devetyarov, Dmitry, Nouretdinov, Ilia, Gammerman, Alex
In this paper we apply computer learning methods to diagnosing ovarian cancer using the level of the standard biomarker CA125 in conjunction with information provided by mass-spectrometry. We are working with a new data set collected over a period of 7 years. Using the level of CA125 and mass-spectrometry peaks, our algorithm gives probability predictions for the disease. To estimate classification accuracy we convert probability predictions into strict predictions. Our algorithm makes fewer errors than almost any linear combination of the CA125 level and one peak's intensity (taken on the log scale). To check the power of our algorithm we use it to test the hypothesis that CA125 and the peaks do not contain useful information for the prediction of the disease at a particular time before the diagnosis. Our algorithm produces $p$-values that are better than those produced by the algorithm that has been previously applied to this data set. Our conclusion is that the proposed algorithm is more reliable for prediction on new data.
Bayesian MAP Model Selection of Chain Event Graphs
The class of chain event graph models is a generalisation of the class of discrete Bayesian networks, retaining most of the structural advantages of the Bayesian network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. In this paper we demonstrate how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible, and prove that suitable homogeneity assumptions characterise the product Dirichlet prior on this class of models. We demonstrate our techniques using two educational examples.