XML Representation of Constraint Networks: Format XCSP 2.1

arXiv.org Artificial Intelligence

We propose a new extended format to represent constraint networks using XML. This format allows us to represent constraints defined either in extension or in intension. It also allows us to reference global constraints. Any instance of the problems CSP (Constraint Satisfaction Problem), QCSP (Quantified CSP) and WCSP (Weighted CSP) can be represented using this format.


Topological Centrality and Its Applications

arXiv.org Artificial Intelligence

Recent developments in network structure analysis show that it plays an important role in characterizing complex systems in many branches of science. Departing from previous network centrality measures, this paper proposes the notion of topological centrality (TC), which reflects the topological positions of nodes and edges in general networks, and proposes an approach to calculating it. The proposed topological centrality is then used to discover communities and build the backbone network. Experiments and applications on a research network show the significance of the proposed approach.


Sparse Conformal Predictors

arXiv.org Machine Learning

Conformal predictors, introduced by Vovk et al. (2005), serve to build prediction intervals by exploiting a notion of conformity of the new data point with previously observed data. In the present paper, we propose a novel method for constructing prediction intervals for the response variable in multivariate linear models. The main emphasis is on sparse linear models, where only a few of the covariates have significant influence on the response variable even if their number is very large. Our approach is based on combining the principle of conformal prediction with the $\ell_1$ penalized least squares estimator (LASSO). The resulting confidence set depends on a parameter $\epsilon>0$ and has a coverage probability larger than or equal to $1-\epsilon$. The numerical experiments reported in the paper show that the length of the confidence set is small. Furthermore, as a by-product of the proposed approach, we provide a data-driven procedure for choosing the LASSO penalty. The selection power of the method is illustrated on simulated data.
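
The coverage idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the paper's construction: it uses the split (inductive) variant of conformal prediction and an ordinary one-dimensional least squares fit in place of the LASSO, and the function name and arguments are hypothetical. Under exchangeable data, the split variant also yields an interval with coverage at least $1-\epsilon$.

```python
import math

def split_conformal_interval(x_train, y_train, x_calib, y_calib, x_new, eps):
    """Prediction interval for y at x_new with target coverage >= 1 - eps.

    Assumption: split conformal prediction with a 1-D least squares fit,
    used here as a stand-in for the LASSO-based method of the paper.
    """
    # Fit y = intercept + slope * x on the training half only.
    n = len(x_train)
    mean_x = sum(x_train) / n
    mean_y = sum(y_train) / n
    sxx = sum((x - mean_x) ** 2 for x in x_train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_train, y_train))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(x):
        return intercept + slope * x

    # Nonconformity scores: absolute residuals on the held-out calibration half.
    scores = sorted(abs(y - predict(x)) for x, y in zip(x_calib, y_calib))
    m = len(scores)

    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((m + 1) * (1 - eps))
    if k > m:
        # Too few calibration points for this eps: only the trivial interval is valid.
        return (float("-inf"), float("inf"))
    q = scores[k - 1]
    centre = predict(x_new)
    return (centre - q, centre + q)
```

Smaller values of `eps` push the rank `k` towards the largest calibration residual, so the interval widens; once `k` exceeds the number of calibration points, the sketch falls back to the whole real line.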


Improvements of real coded genetic algorithms based on differential operators preventing premature convergence

arXiv.org Artificial Intelligence

This paper presents several types of evolutionary algorithms (EAs) used for global optimization on real domains. The focus is on multimodal problems, where premature convergence usually occurs. First, the standard genetic algorithm (SGA) using binary encoding of real values and its unsatisfactory behavior on multimodal problems are briefly reviewed, together with some improvements for fighting premature convergence. Two types of real-encoded methods based on differential operators are examined in detail: differential evolution (DE), a very modern and effective method first published by R. Storn and K. Price, and the simplified real-coded differential genetic algorithm SADE proposed by the authors. In addition, an improvement of the SADE method, called CERAF technology, which enables the population of solutions to escape from local extremes, is examined. All methods are tested on an identical set of objective functions and a systematic comparison based on a reliable methodology is presented. It is confirmed that real-coded methods generally exhibit better behavior on real domains than the binary algorithms, even when the latter are extended by several improvements. Furthermore, the positive influence of the differential operators, due to their capacity for self-adaptation, is demonstrated. From the reliability point of view, it seems that the real-encoded differential algorithm, improved by the technology described in this paper, is a universal and reliable method capable of solving all proposed test problems.
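
For readers unfamiliar with the differential operator mentioned above, the following is a minimal sketch of the classic DE/rand/1/bin scheme of Storn and Price. It does not reproduce SADE or CERAF. The function name, the default parameter values and the clipping of trial vectors to the bounds are assumptions made for illustration.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimise f over a box given as a list of (low, high) pairs.

    Assumption: DE/rand/1/bin with illustrative default parameters.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    # Differential operator: base vector plus scaled difference.
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # assumption: clip to the bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = f(trial)
            # Greedy selection: the trial replaces the target if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = min(range(pop_size), key=lambda i: cost[i])
    return pop[best], cost[best]
```

Because the mutation step is built from differences between current population members, its typical size shrinks as the population contracts around an optimum, which is one reading of the self-adaptation the abstract refers to.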


Back analysis of microplane model parameters using soft computing methods

arXiv.org Artificial Intelligence

A new procedure based on layered feed-forward neural networks for identifying the parameters of the microplane material model is proposed in the present paper. The novelties are the use of the Latin Hypercube Sampling method to generate training sets, the systematic use of stochastic sensitivity analysis, and genetic algorithm-based training of the neural network. Advantages and disadvantages of this approach, together with possible extensions, are thoroughly discussed and analyzed.


A competitive comparison of different types of evolutionary algorithms

arXiv.org Artificial Intelligence

This paper presents a comparison of several stochastic optimization algorithms developed by the authors in their previous works for the solution of some problems arising in Civil Engineering. The optimization methods introduced are: the integer augmented simulated annealing (IASA), the real-coded augmented simulated annealing (RASA), the differential evolution (DE) in its original form developed by R. Storn and K. Price, and the simplified real-coded differential genetic algorithm (SADE). Each of these methods was developed for a specific optimization problem, namely the Chebychev trial polynomial problem, the so-called type 0 function, and two engineering problems, the reinforced concrete beam layout and the periodic unit cell problem, respectively. Detailed and extensive numerical tests were performed to examine the stability and efficiency of the proposed algorithms. The results of our experiments suggest that the performance and robustness of the RASA, IASA and SADE methods are comparable, while the DE algorithm performs slightly worse. This fact, together with a small number of internal parameters, promotes the SADE method as the most robust for practical use.


Sparse partial least squares for on-line variable selection in multivariate data streams

arXiv.org Machine Learning

In this paper we propose a computationally efficient algorithm for on-line variable selection in multivariate regression problems involving high dimensional data streams. The algorithm recursively extracts all the latent factors of a partial least squares solution and selects the most important variables for each factor. This is achieved by means of only one sparse singular value decomposition which can be efficiently updated on-line and in an adaptive fashion. Simulation results based on artificial data streams demonstrate that the algorithm is able to select important variables in dynamic settings where the correlation structure among the observed streams is governed by a few hidden components and the importance of each variable changes over time. We also report on an application of our algorithm to a multivariate version of the "enhanced index tracking" problem using financial data streams. The application consists of performing on-line asset allocation with the objective of outperforming two benchmark indices simultaneously.


A Model for Managing Collections of Patterns

arXiv.org Artificial Intelligence

Data mining algorithms are now able to deal efficiently with huge amounts of data. Various kinds of patterns may be discovered and may have a great impact on the general development of knowledge. In many domains, end users may want to have their data mined by data mining tools in order to extract patterns that could impact their business. Nevertheless, those users are often overwhelmed by the large quantity of patterns extracted in such a situation. Moreover, privacy or commercial issues may prevent users from mining the data themselves. Thus, users may not be able to perform many experiments integrating various constraints in order to focus on the specific patterns they would like to extract. Post-processing of patterns may be an answer to that drawback. In this paper we therefore present a framework that could allow end users to manage collections of patterns. We propose to use an efficient data structure on which algebraic operators may be used to retrieve or access patterns in pattern bases.


Infinite Viterbi alignments in the two state hidden Markov models

arXiv.org Machine Learning

Since the early days of digital communication, Hidden Markov Models (HMMs) have been routinely used in speech recognition, processing of natural languages, images, and in bioinformatics. An HMM $(X_i,Y_i)_{i\ge 1}$ assumes observations $X_1,X_2,...$ to be conditionally independent given an "explanatory" Markov process $Y_1,Y_2,...$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMMs is the Viterbi algorithm, which finds a {\em maximum a posteriori} estimate $q_{1:n}=(q_1,q_2,...,q_n)$ of $Y_{1:n}$ given the observed data $x_{1:n}$. Maximum {\em a posteriori} paths are also called Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments of HMMs with two hidden states when $n$ tends to infinity. It has indeed been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions. This work proves the existence of infinite Viterbi alignments for virtually any HMM with two hidden states.
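
As a concrete reminder of what a finite Viterbi alignment $q_{1:n}$ is, here is a minimal sketch of the standard Viterbi algorithm for an HMM with finitely many hidden states and discrete observations. It says nothing about the limiting behaviour studied in the paper. The function name and the list-based parameterisation are assumptions made for illustration.

```python
import math

def viterbi(obs, init, trans, emit):
    """Return one maximum a posteriori state path q_1..q_n.

    obs   : list of observation indices x_1..x_n
    init  : init[s]     = P(Y_1 = s)
    trans : trans[r][s] = P(Y_{i+1} = s | Y_i = r)
    emit  : emit[s][x]  = P(X_i = x | Y_i = s)
    """
    n_states = len(init)

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # delta[s]: best log-probability of any path that ends in state s.
    delta = [log(init[s]) + log(emit[s][obs[0]]) for s in range(n_states)]
    back = []  # back[i][s]: best predecessor of state s at step i + 1

    for x in obs[1:]:
        prev = delta
        pointers, delta = [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            delta.append(prev[best] + log(trans[best][s]) + log(emit[s][x]))
        back.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Ties are broken here in favour of the lowest state index. A fixed tie-breaking rule of this kind is a choice of the sketch, and the path for $x_{1:n}$ need not be a prefix of the path for $x_{1:n+1}$, which is why the existence of a limiting alignment is a non-trivial question.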


Embedding Data within Knowledge Spaces

arXiv.org Artificial Intelligence

The promise of e-Science will only be realized when data is discoverable, accessible, and comprehensible within distributed teams, across disciplines, and over the long-term--without reliance on out-of-band (non-digital) means. We have developed the open-source Tupelo semantic content management framework and are employing it to manage a wide range of e-Science entities (including data, documents, workflows, people, and projects) and a broad range of metadata (including provenance, social networks, geospatial relationships, temporal relations, and domain descriptions). Tupelo couples the use of global identifiers and resource description framework (RDF) statements with an aggregatable content repository model to provide a unified space for securely managing distributed heterogeneous content and relationships.