Europe
Algebraic Comparison of Partial Lists in Bioinformatics
Jurman, Giuseppe, Riccadonna, Samantha, Visintainer, Roberto, Furlanello, Cesare
The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.
On Tsallis Entropy Bias and Generalized Maximum Entropy Models
Hou, Yuexian, Yan, Tingxu, Zhang, Peng, Song, Dawei, Li, Wenjie
In density estimation task, maximum entropy model (Maxent) can effectively use reliable prior information via certain constraints, i.e., linear constraints without empirical parameters. However, reliable prior information is often insufficient, and the selection of uncertain constraints becomes necessary but poses considerable implementation complexity. Improper setting of uncertain constraints can result in overfitting or underfitting. To solve this problem, a generalization of Maxent, under Tsallis entropy framework, is proposed. The proposed method introduces a convex quadratic constraint for the correction of (expected) Tsallis entropy bias (TEB). Specifically, we demonstrate that the expected Tsallis entropy of sampling distributions is smaller than the Tsallis entropy of the underlying real distribution. This expected entropy reduction is exactly the (expected) TEB, which can be expressed by a closed-form formula and act as a consistent and unbiased correction. TEB indicates that the entropy of a specific sampling distribution should be increased accordingly. This entails a quantitative re-interpretation of the Maxent principle. By compensating TEB and meanwhile forcing the resulting distribution to be close to the sampling distribution, our generalized TEBC Maxent can be expected to alleviate the overfitting and underfitting. We also present a connection between TEB and Lidstone estimator. As a result, TEB-Lidstone estimator is developed by analytically identifying the rate of probability correction in Lidstone. Extensive empirical evaluation shows promising performance of both TEBC Maxent and TEB-Lidstone in comparison with various state-of-the-art density estimation methods.
Ontology-supported processing of clinical text using medical knowledge integration for multi-label classification of diagnosis coding
Waraporn, Phanu, Meesad, Phayung, Clayton, Gareth
Abstract--This paper discusses the knowledge integration of clinical information extracted from distributed medical ontology in order to ameliorate a machine learning-based multi-label coding assignment system. The proposed approach is implemented using a decision tree based cascade hierarchical technique on the university hospital data for patients with Coronary Heart Disease (CHD). The preliminary results obtained show a satisfactory finding. An ontology is a specification of a conceptualization that defines and/or specifies the concepts, relationships, and other distinctions that are relevant for modeling a domain. Such specification takes the form of the definitions of representational vocabulary (classes, relations, and so on), which provide meanings to the vocabulary and formal constraints on its coherent use [3].
Exploratory Analysis of Functional Data via Clustering and Optimal Segmentation
Hébrail, Georges, Hugueney, Bernard, Lechevallier, Yves, Rossi, Fabrice
We propose in this paper an exploratory analysis algorithm for functional data. The method partitions a set of functions into $K$ clusters and represents each cluster by a simple prototype (e.g., piecewise constant). The total number of segments in the prototypes, $P$, is chosen by the user and optimally distributed among the clusters via two dynamic programming algorithms. The practical relevance of the method is shown on two real world datasets.
On the Schoenberg Transformations in Data Analysis: Theory and Illustrations
The class of Schoenberg transformations, embedding Euclidean distances into higher dimensional Euclidean spaces, is presented, and derived from theorems on positive definite and conditionally negative definite matrices. Original results on the arc lengths, angles and curvature of the transformations are proposed, and visualized on artificial data sets by classical multidimensional scaling. A simple distance-based discriminant algorithm illustrates the theory, intimately connected to the Gaussian kernels of Machine Learning.
Multiattribute Auctions Based on Generalized Additive Independence
We develop multiattribute auctions that accommodate generalized additive independent (GAI) preferences. We propose an iterative auction mechanism that maintains prices on potentially overlapping GAI clusters of attributes, thus decreases elicitation and computational burden, and creates an open competition among suppliers over a multidimensional domain. Most significantly, the auction is guaranteed to achieve surplus which approximates optimal welfare up to a small additive factor, under reasonable equilibrium strategies of traders. The main departure of GAI auctions from previous literature is to accommodate non-additive trader preferences, hence allowing traders to condition their evaluation of specific attributes on the value of other attributes. At the same time, the GAI structure supports a compact representation of prices, enabling a tractable auction process. We perform a simulation study, demonstrating and quantifying the significant efficiency advantage of more expressive preference modeling. We draw random GAI-structured utility functions with various internal structures, generate additive functions that approximate the GAI utility, and compare the performance of the auctions using the two representations. We find that allowing traders to express existing dependencies among attributes improves the economic efficiency of multiattribute auctions.
Reasoning About the Transfer of Control
van der Hoek, W., Walther, D., Wooldridge, M.
We present DCL-PC: a logic for reasoning about how the abilities of agents and coalitions of agents are altered by transferring control from one agent to another. The logical foundation of DCL-PC is CL-PC, a logic for reasoning about cooperation in which the abilities of agents and coalitions of agents stem from a distribution of atomic Boolean variables to individual agents -- the choices available to a coalition correspond to assignments to the variables the coalition controls. The basic modal constructs of DCL-PC are of the form `coalition C can cooperate to bring about phi'. DCL-PC extends CL-PC with dynamic logic modalities in which atomic programs are of the form `agent i gives control of variable p to agent j'; as usual in dynamic logic, these atomic programs may be combined using sequence, iteration, choice, and test operators to form complex programs. By combining such dynamic transfer programs with cooperation modalities, it becomes possible to reason about how the power of agents and coalitions is affected by the transfer of control. We give two alternative semantics for the logic: a `direct' semantics, in which we capture the distributions of Boolean variables to agents; and a more conventional Kripke semantics. We prove that these semantics are equivalent, and then present an axiomatization for the logic. We investigate the computational complexity of model checking and satisfiability for DCL-PC, and show that both problems are PSPACE-complete (and hence no worse than the underlying logic CL-PC). Finally, we investigate the characterisation of control in DCL-PC. We distinguish between first-order control -- the ability of an agent or coalition to control some state of affairs through the assignment of values to the variables under the control of the agent or coalition -- and second-order control -- the ability of an agent to exert control over the control that other agents have by transferring variables to other agents. We give a logical characterisation of second-order control.
Training a Multilingual Sportscaster: Using Perceptual Context to Learn Language
Chen, D. L., Kim, J., Mooney, R. J.
We present a novel framework for learning to interpret and generate language using only perceptual context as supervision. We demonstrate its capabilities by developing a system that learns to sportscast simulated robot soccer games in both English and Korean without any language-specific prior knowledge. Training employs only ambiguous supervision consisting of a stream of descriptive textual comments and a sequence of events extracted from the simulation trace. The system simultaneously establishes correspondences between individual comments and the events that they describe while building a translation model that supports both parsing and generation. We also present a novel algorithm for learning which events are worth describing. Human evaluations of the generated commentaries indicate they are of reasonable quality and in some cases even on par with those produced by humans for our limited domain.
An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs
Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs). Although DEC-POMDPS are a general and powerful modeling tool, solving them is a task with an overwhelming complexity that can be doubly exponential. In this paper, we study an alternate formulation of DEC-POMDPs relying on a sequence-form representation of policies. From this formulation, we show how to derive Mixed Integer Linear Programming (MILP) problems that, once solved, give exact optimal solutions to the DEC-POMDPs. We show that these MILPs can be derived either by using some combinatorial characteristics of the optimal solutions of the DEC-POMDPs or by using concepts borrowed from game theory. Through an experimental validation on classical test problems from the DEC-POMDP literature, we compare our approach to existing algorithms. Results show that mathematical programming outperforms dynamic programming but is less efficient than forward search, except for some particular problems. The main contributions of this work are the use of mathematical programming for DEC-POMDPs and a better understanding of DEC-POMDPs and of their solutions. Besides, we argue that our alternate representation of DEC-POMDPs could be helpful for designing novel algorithms looking for approximate solutions to DEC-POMDPs.
Conceptual Ternary Diagrams for Shape Perception: A Preliminary Step
Rudduck, Sylvan Grenfell (University of Technology, Sydney) | Williams, Mary-Anne (University of Technology, Sydney)
This work-in-progress provides a preliminary cognitive investigation of how the external visualization of the Ternary diagram (TD) might be used as an underlying model for exploring the representation of simple 3D cuboids according to the theory of Conceptual Spaces. Gärdenfors introduced geometrical entities, known as conceptual spaces, for modeling concepts. He considered multidimensional spaces equipped with a range of similarity measures (such as metrics) and guided by criteria and mechanisms as a geometrical model for concept formation and management. Our work is inspired by the conceptual spaces approach and takes ternary diagrams as its underlying conceptual model. The main motivation for our work is twofold. First, Ternary Diagrams are powerful conceptual representations that have a solid historical and mathematical foundation. Second, the notion of overlaying an Information- Entropy function on a ternary diagram can lead to new insights into applications of reasoning about shape and other cognitive processes.