Partial Gaussian Graphical Model Estimation
For such Gaussian graphical models (GGMs), it is usually assumed that a given variable can be predicted by a small number of other variables. This assumption implies that the precision matrix is sparse. Therefore, estimating a Gaussian graphical model reduces to estimating a sparse precision matrix. One approach to sparse precision matrix estimation is covariance selection or neighborhood selection (Dempster, 1972; Meinshausen & Bühlmann, 2006), which estimates each row (or column) of the precision matrix by predicting the corresponding variable using a sparse linear combination of the other variables. An alternative formulation is maximum-likelihood estimation, which directly estimates the full precision matrix.
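The link between a sparse precision matrix and neighborhood regression can be illustrated with a small sketch. It uses a hypothetical 3-variable chain (the matrix values and all function names are assumptions made for illustration, not taken from the paper) and checks the standard Gaussian identity that the population regression coefficient of variable i on variable j equals -Θ_ij / Θ_ii. No sparse estimator such as the lasso is fitted here; the sketch only shows why a zero in the precision matrix corresponds to a zero regression coefficient.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination (exact arithmetic).

    Raises StopIteration if the matrix is singular.
    """
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate the column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def neighborhood_coefficients(theta, i):
    """Regression coefficients of variable i on the others, read off the precision matrix."""
    return {j: Fraction(-theta[i][j], theta[i][i])
            for j in range(len(theta)) if j != i}


def regression_from_covariance(sigma, i):
    """Same coefficients computed the usual way: inv(Sigma_rest) @ Sigma_rest,i."""
    others = [j for j in range(len(sigma)) if j != i]
    sub_inv = invert([[sigma[a][b] for b in others] for a in others])
    cross = [sigma[a][i] for a in others]
    return {j: sum(sub_inv[r][c] * cross[c] for c in range(len(others)))
            for r, j in enumerate(others)}


# Hypothetical precision matrix of a chain x0 - x1 - x2 (no direct x0-x2 edge).
precision = [[2, -1, 0],
             [-1, 2, -1],
             [0, -1, 2]]
covariance = invert(precision)  # dense, even though the precision matrix is sparse

from_precision = neighborhood_coefficients(precision, 0)
from_covariance = regression_from_covariance(covariance, 0)
```

In this toy example the covariance matrix has no zero entries, yet the regression of x0 on the other two variables puts zero weight on x2, matching the zero in the precision matrix.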
Improving accuracy and power with transfer learning using a meta-analytic database
Schwartz, Yannick, Varoquaux, Gaël, Pallier, Christophe, Pinel, Philippe, Poline, Jean-Baptiste, Thirion, Bertrand
Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the basis of the coordinates of previously detected effects. In this paper, we propose to use a database of images, rather than coordinates, and frame the problem as transfer learning: learning a discriminant model on a reference task to apply it to a different but related new task. To facilitate statistical analysis of small cohorts, we use a sparse discriminant model that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs. The benefits of our approach are twofold. First, it uses the reference database for prediction, i.e., to provide potential biomarkers in a clinical setting. Second, it increases statistical power on the new task. We demonstrate on a set of 18 pairs of functional MRI experimental conditions that our approach gives good prediction. In addition, on a specific transfer situation involving different scanners at different locations, we show that voxel selection based on transfer learning leads to higher detection power on small cohorts.
Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping
We consider the problem of estimating a sparse multi-response regression function, with an application to expression quantitative trait locus (eQTL) mapping, where the goal is to discover genetic variations that influence gene-expression levels. In particular, we investigate a shrinkage technique capable of capturing a given hierarchical structure over the responses, such as a hierarchical clustering tree with leaf nodes for responses and internal nodes for clusters of related responses at multiple granularities, and we seek to leverage this structure to recover covariates relevant to each hierarchically defined cluster of responses. We propose a tree-guided group lasso, or tree lasso, for estimating such structured sparsity under multi-response regression by employing a novel penalty function constructed from the tree. We describe a systematic weighting scheme for the overlapping groups in the tree penalty such that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships of the regression coefficients due to overlaps among groups. For efficient optimization, we employ a smoothing proximal gradient method that was originally developed for a general class of structured-sparsity-inducing penalties. Using simulated and yeast data sets, we demonstrate that our method achieves superior performance in terms of both prediction error and recovery of true sparsity patterns, compared to other methods for learning a multivariate-response regression.
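To make the shape of a tree-structured penalty concrete, the following minimal sketch evaluates a weighted sum of L2 norms over overlapping groups, one group per tree node, for the coefficients of a single covariate across all responses. The tree, the coefficients, and in particular the weights are hypothetical placeholders: the balanced weighting scheme described in the abstract is not specified here, so the weights below should not be read as the authors' scheme.

```python
import math


def tree_penalty(coefs, groups):
    """Weighted sum of L2 norms over (possibly overlapping) groups of responses.

    coefs  : coefficients of ONE covariate, one entry per response
    groups : list of (weight, response_indices) pairs, one pair per tree node
    """
    return sum(weight * math.sqrt(sum(coefs[k] ** 2 for k in indices))
               for weight, indices in groups)


# Hypothetical tree over 3 responses:
# leaves {0}, {1}, {2}; internal node {0, 1}; root {0, 1, 2}.
# The weights are arbitrary placeholders, not the paper's weighting scheme.
groups = [
    (1.0, [0]),
    (1.0, [1]),
    (1.0, [2]),
    (0.5, [0, 1]),
    (0.5, [0, 1, 2]),
]

coefs = [3.0, 4.0, 0.0]  # covariate affects responses 0 and 1 only
penalty = tree_penalty(coefs, groups)
```

For a full coefficient matrix, the same function would be applied to each covariate's row and the results summed. Because groups overlap, a coefficient near the bottom of the tree appears in several norms, which is why the choice of weights matters.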
Geometric lattice structure of covering-based rough sets through matroids
Covering-based rough set theory is a useful tool for dealing with inexact, uncertain or vague knowledge in information systems. Geometric lattices have been widely used in diverse fields, especially in search algorithm design, which plays an important role in covering reductions. In this paper, we construct four geometric lattice structures of covering-based rough sets through matroids, and compare their relationships. First, a geometric lattice structure of covering-based rough sets is established through the transversal matroid induced by the covering, and its characteristics, including atoms, modular elements and modular pairs, are studied. We also construct a one-to-one correspondence between this type of geometric lattice and transversal matroids in the context of covering-based rough sets. Second, necessary and sufficient conditions for three types of covering upper approximation operators to be closure operators of matroids are presented. We exhibit three types of matroids through closure axioms, and then obtain three geometric lattice structures of covering-based rough sets. Third, these four geometric lattice structures are compared. Some core concepts, such as reducible elements in covering-based rough sets, are investigated with geometric lattices. In short, this work offers an interesting viewpoint, namely geometric lattices, from which to study covering-based rough sets.
Topological characterizations to three types of covering approximation operators
Covering-based rough set theory is a useful tool for dealing with inexact, uncertain or vague knowledge in information systems. Topology, one of the most important subjects in mathematics, provides mathematical tools and interesting topics for studying information systems and rough sets. In this paper, we present topological characterizations of three types of covering approximation operators. First, we study the properties of the topology induced by the sixth type of covering lower approximation operator. Second, topological characterizations for the covering lower approximation operator to be an interior operator are established. We find that the topologies induced by this operator and by the sixth type of covering lower approximation operator are the same. Third, we study the conditions under which the first type of covering upper approximation operator is a closure operator, and find that the topology induced by this operator is the same as the topology induced by the fifth type of covering upper approximation operator. Fourth, the conditions for the second type of covering upper approximation operator to be a closure operator, and the properties of the topology induced by it, are established. Finally, these three topological spaces are compared. In short, topology provides a useful method for studying covering-based rough sets.
Examples of Artificial Perceptions in Optical Character Recognition and Iris Recognition
Noaica, Cristina M., Badea, Robert, Motoc, Iulia M., Ghica, Claudiu G., Rosoiu, Alin C., Popescu-Bodorin, Nicolaie
This paper assumes the hypothesis that human learning is perception based, and consequently, the learning process and perceptions should not be represented and investigated independently or modeled in different simulation spaces. In order to keep the analogy between the artificial and human learning, the former is assumed here as being based on the artificial perception. Hence, instead of choosing to apply or develop a Computational Theory of (human) Perceptions, we choose to mirror the human perceptions in a numeric (computational) space as artificial perceptions and to analyze the interdependence between artificial learning and artificial perception in the same numeric space, using one of the simplest tools of Artificial Intelligence and Soft Computing, namely the perceptrons. As practical applications, we choose to work around two examples: Optical Character Recognition and Iris Recognition. In both cases a simple Turing test shows that artificial perceptions of the difference between two characters and between two irides are fuzzy, whereas the corresponding human perceptions are, in fact, crisp.
Sparse Ising Models with Covariates
Cheng, Jie, Levina, Elizaveta, Wang, Pei, Zhu, Ji
There has been a lot of work fitting Ising models to multivariate binary data in order to understand the conditional dependency relationships between the variables. However, additional covariates are frequently recorded together with the binary data, and may influence the dependence relationships. Motivated by such a dataset on genomic instability collected from tumor samples of several types, we propose a sparse covariate dependent Ising model to study both the conditional dependency within the binary data and its relationship with the additional covariates. This results in subject-specific Ising models, where the subject's covariates influence the strength of association between the genes. As in all exploratory data analysis, interpretability of results is important, and we use L1 penalties to induce sparsity in the fitted graphs and in the number of selected covariates. Two algorithms to fit the model are proposed and compared on a set of simulated data, and asymptotic results are established. The results on the tumor dataset and their biological significance are discussed in detail.
Mirror Descent Meets Fixed Share (and feels no regret)
Cesa-Bianchi, Nicolò, Gaillard, Pierre, Lugosi, Gabor, Stoltz, Gilles
Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done either by a carefully designed projection or by a weight sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other related regrets. Our analysis also captures and extends the generalized weight sharing technique of Bousquet and Warmuth, and can be refined in several ways, including improvements for small losses and adaptive tuning of parameters.
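As a rough illustration of the weight sharing idea, the sketch below implements one round of the classical fixed-share update: an exponential-weights (entropic mirror descent) step followed by mixing a fraction of the mass back uniformly over all experts. The learning rate `eta`, the sharing rate `alpha`, and the loss sequence are arbitrary values chosen for illustration; this is the textbook update, not the generalized scheme analyzed in the paper.

```python
import math


def fixed_share_update(weights, losses, eta, alpha):
    """One fixed-share round: exponential-weights step, then uniform sharing."""
    n = len(weights)
    # Exponential-weights step: down-weight each expert by exp(-eta * loss).
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    posterior = [s / total for s in scaled]
    # Sharing step: keep (1 - alpha) of the mass, spread alpha uniformly.
    return [(1 - alpha) * v + alpha / n for v in posterior]


# Three experts; the best expert switches from index 0 to index 1 (illustrative losses).
weights = [1 / 3, 1 / 3, 1 / 3]
for losses in [[0, 1, 1], [0, 1, 1], [1, 0, 1]]:
    weights = fixed_share_update(weights, losses, eta=1.0, alpha=0.1)
```

The sharing step keeps every weight at or above `alpha / n`, so an expert that performed badly in the past can regain mass quickly once it becomes the best one, which is what makes shifting regret bounds possible.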
Multi-Agents Dynamic Case Based Reasoning and The Inverse Longest Common Sub-Sequence And Individualized Follow-up of Learners in The CEHL
Zouhair, Abdelhamid, En-Naimi, El Mokhtar, Amami, Benaissa, Boukachour, Hadhoum, Person, Patrick, Bertelle, Cyrille
In e-learning, ensuring individualized and continuous follow-up of the learner during the learning process remains an open problem: among the numerous tools proposed, very few systems concentrate on real-time learner follow-up. Our work in this field covers the design and implementation of a multi-agent system based on Dynamic Case-Based Reasoning, which can initiate learning and provide individualized follow-up of the learner. When interacting with the platform, every learner leaves traces in the machine. These traces are stored in a base in the form of scenarios, which enrich the collective past experience. The system monitors, compares and analyses these traces to keep a constant intelligent watch, and can therefore detect difficulties hindering progress and help prevent dropping out. The system can support any learning subject. The success of a case-based reasoning system depends critically on the performance of the retrieval step and, more specifically, on the similarity measure used to retrieve scenarios that are similar to the course of the learner (traces in progress). We propose a complementary similarity measure, named Inverse Longest Common Sub-Sequence (ILCSS). To help and guide the learner, the system is equipped with combined virtual and human tutors.
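The abstract does not define ILCSS, so the sketch below shows only the standard longest common subsequence length on which such a measure would presumably build, computed by dynamic programming. The trace contents and variable names are hypothetical, and how the paper inverts or normalizes this quantity is not reproduced here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical traces: a stored scenario and a learner's trace in progress.
stored_scenario = ["login", "read", "quiz", "forum", "quiz"]
current_trace = ["login", "quiz", "read", "quiz"]
shared_steps = lcs_length(stored_scenario, current_trace)
```

A retrieval step could rank stored scenarios by such a score against the trace in progress; the rolling-row formulation keeps memory linear in the length of the shorter input when the arguments are ordered accordingly.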
Towards Unsupervised Learning of Temporal Relations between Events
Mirroshandel, S.A., Ghassem-Sani, G.
Automatic extraction of temporal relations between event pairs is an important task for several natural language processing applications such as Question Answering, Information Extraction, and Summarization. Since most existing methods are supervised and require large corpora, which for many languages do not exist, we have concentrated our efforts on reducing the need for annotated data as much as possible. This paper presents two different algorithms towards this goal. The first algorithm is a weakly supervised machine learning approach for classification of temporal relations between events. In the first stage, the algorithm learns a general classifier from an annotated corpus. Then, inspired by the hypothesis of "one type of temporal relation per discourse", it extracts useful information from a cluster of topically related documents. We show that by combining the global information of such a cluster with local decisions of a general classifier, a bootstrapping cross-document classifier can be built to extract temporal relations between events. Our experiments show that without any additional annotated data, the accuracy of the proposed algorithm is higher than that of several previous successful systems. The second proposed method for temporal relation extraction is based on the expectation maximization (EM) algorithm. Within EM, we used different techniques such as a greedy best-first search and integer linear programming for temporal inconsistency removal. We think that the experimental results of our EM-based algorithm, as a first step toward a fully unsupervised temporal relation extraction method, are encouraging.