Constrained Mixture Models for Asset Returns Modelling

arXiv.org Machine Learning

The estimation of asset return distributions is crucial for determining optimal trading strategies. One convenient estimation approach selects a distribution model and estimates its parameters. The advantage of this approach is the ease with which probability distributions can be calibrated and applied in post-processing. The disadvantage of assuming a particular parametric distribution is that inferences and decisions depend critically on the choice of distribution. For example, asset returns frequently feature large "outlying" values, making distributions with light tails inapplicable. Semi-parametric methods attempt to capture the advantages but not the disadvantages of a parametric specification of a returns distribution by using a more flexible functional form. Most prominent among the semi-parametric distributions are mixtures of distributions. They provide a flexible specification and, under certain conditions, can approximate distributions of any form.
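The point about light tails can be illustrated with a minimal sketch. It compares the tail probability of a two-component zero-mean Gaussian mixture with that of a single Gaussian of the same variance. The weights, standard deviations, and threshold are made-up illustrative values, not estimates from the paper, and the paper's constrained mixture model is not reproduced here.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Density of a finite Gaussian mixture: weighted sum of component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def normal_tail(x, std):
    """P(|X| > x) for a zero-mean normal with standard deviation std."""
    return math.erfc(x / (std * math.sqrt(2.0)))


def mixture_tail(x, weights, stds):
    """P(|X| > x) for a mixture of zero-mean normals."""
    return sum(w * normal_tail(x, s) for w, s in zip(weights, stds))


# Hypothetical daily-return regimes: a calm one (90%) and a turbulent one (10%).
weights = [0.9, 0.1]
stds = [0.01, 0.04]

# A zero-mean mixture has variance sum_k w_k * s_k^2.
mix_std = math.sqrt(sum(w * s * s for w, s in zip(weights, stds)))

threshold = 0.06  # a 6% absolute daily move
p_mix = mixture_tail(threshold, weights, stds)
p_norm = normal_tail(threshold, mix_std)

print(f"mixture tail:  {p_mix:.2e}")
print(f"gaussian tail: {p_norm:.2e}")
```

With these assumed parameters the mixture assigns far more probability to large moves than the variance-matched Gaussian, which is the behaviour that makes mixtures attractive for returns data.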


SPPAM - Statistical PreProcessing AlgorithM

arXiv.org Artificial Intelligence

Most machine learning tools work with a single table where each row is an instance and each column is an attribute. Each cell of the table contains an attribute value for an instance. This representation prevents one important form of learning: classification based on groups of correlated records, such as multiple exams of a single patient, internet customer preferences, weather forecasts, or predictions of sea conditions for a given day. To some extent, relational learning methods, such as inductive logic programming, can capture this correlation through the use of intensional predicates added to the background knowledge. In this work, we propose SPPAM, an algorithm that aggregates past observations into a single record. We show that applying SPPAM to the original correlated data, before the learning task, can produce classifiers that are better than the ones trained using all records.
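The idea of collapsing correlated records into one row can be sketched as follows. This is not the SPPAM algorithm itself, whose aggregation rules the abstract does not specify. It is a generic illustration that summarises each group with its mean, minimum, and maximum. The function name `aggregate_records` and the field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_records(records, group_key, numeric_fields):
    """Collapse all records sharing group_key into one summary record.

    Assumption: mean/min/max are used as summary statistics purely for
    illustration; the original method may aggregate differently.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    aggregated = []
    for key, recs in groups.items():
        row = {group_key: key, "n_records": len(recs)}
        for field in numeric_fields:
            values = [r[field] for r in recs]
            row[f"{field}_mean"] = mean(values)
            row[f"{field}_min"] = min(values)
            row[f"{field}_max"] = max(values)
        aggregated.append(row)
    return aggregated


# Made-up example: several exams per patient become one row per patient.
exams = [
    {"patient": "p1", "glucose": 90},
    {"patient": "p1", "glucose": 110},
    {"patient": "p2", "glucose": 140},
]
summary = aggregate_records(exams, "patient", ["glucose"])
for row in summary:
    print(row)
```

The resulting one-row-per-group table can then be passed to any learner that expects the usual single-table representation.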


Self reference in word definitions

arXiv.org Artificial Intelligence

Dictionaries are inherently circular in nature. A given word is linked to a set of alternative words (the definition) which in turn point to further descendants. Iterating through definitions in this way, one typically finds that definitions loop back upon themselves. The graph formed by such definitional relations is our object of study. By eliminating those links which are not in loops, we arrive at a core subgraph of highly connected nodes. We observe that definitional loops are conveniently classified by length, with longer loops usually emerging from semantic misinterpretation. By breaking the long loops in the graph of the dictionary, we arrive at a set of disconnected clusters. We find that the words in these clusters constitute semantic units, and moreover tend to have been introduced into the English language at similar times, suggesting a possible mechanism for language evolution.
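The pruning step, keeping only links that lie on a loop, can be sketched on a toy graph. The sketch assumes a dictionary is stored as a mapping from each headword to the words used in its definition. The sample entries are invented, and the quadratic reachability check is chosen for brevity rather than for use on a full dictionary.

```python
def reachable(graph, start):
    """Return every node reachable from start in one or more steps."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def loop_links(graph):
    """Keep only links (u, v) that lie on some directed cycle.

    A link u -> v is on a cycle exactly when u is reachable from v.
    """
    reach = {u: reachable(graph, u) for u in graph}
    return {
        (u, v)
        for u, targets in graph.items()
        for v in targets
        if u in reach.get(v, set())
    }


# Toy dictionary: headword -> words appearing in its definition.
definitions = {
    "big": ["large"],
    "large": ["big", "size"],
    "size": ["extent"],
    "extent": [],
    "tiny": ["big"],
}
core = loop_links(definitions)
print(sorted(core))
```

Here only the mutual definition of "big" and "large" survives. Links that lead into or out of the loop without returning are dropped.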


Language, Emotions, and Cultures: Emotional Sapir-Whorf Hypothesis

arXiv.org Artificial Intelligence

An emotional version of the Sapir-Whorf hypothesis suggests that differences in language emotionalities influence differences among cultures no less than conceptual differences. Conceptual contents of languages and cultures are to a significant extent determined by words and their semantic differences; these could be borrowed among languages and exchanged among cultures. Emotional differences, as suggested in the paper, are related to grammar and mostly cannot be borrowed. Conceptual and emotional mechanisms of languages are considered here along with their functions in the mind and cultural evolution. A fundamental contradiction in the human mind is considered: language evolution requires reduced emotionality, but "too low" emotionality makes language "irrelevant to life," disconnected from sensory-motor experience. Neural mechanisms of these processes are suggested as well as their mathematical models: the knowledge instinct, the language instinct, the dual model connecting language and cognition, dynamic logic, neural modeling fields. Mathematical results are related to cognitive science, linguistics, and psychology. Experimental evidence and theoretical arguments are discussed. Approximate equations for evolution of human minds and cultures are obtained. Their solutions identify three types of cultures: "conceptual-pragmatic" cultures, in which emotionality of language is reduced and differentiation overtakes synthesis, resulting in fast evolution at the price of uncertainty of values, self-doubts, and internal crises; "traditional-emotional" cultures, where differentiation lags behind synthesis, resulting in cultural stability at the price of stagnation; and "multi-cultural" societies combining fast cultural evolution and stability. Unsolved problems and future theoretical and experimental directions are discussed.


GRASP and path-relinking for Coalition Structure Generation

arXiv.org Artificial Intelligence

In Artificial Intelligence, Coalition Structure Generation (CSG) refers to cooperative problems that require finding an optimal partition of a set of entities into exhaustive and disjoint coalitions, where optimal means maximising social welfare. The solution of the CSG problem finds applications in many fields such as Machine Learning (covering machines, clustering), Data Mining (decision tree, discretization), Graph Theory, Natural Language Processing (aggregation), Semantic Web (service composition), and Bioinformatics. The problem of finding the optimal coalition structure is NP-complete. In this paper we present a greedy randomized adaptive search procedure (GRASP) with path-relinking to efficiently search the space of coalition structures. Experiments and comparisons to other algorithms demonstrate the validity of the proposed method in solving this hard combinatorial problem.
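To make the objective concrete, here is a minimal brute-force sketch of the CSG problem for a handful of agents. It is not GRASP and not the authors' method. It enumerates every partition and picks the one with the highest total value. The characteristic function `toy_value` is a made-up example.

```python
def partitions(items):
    """Yield every partition of items into non-empty, disjoint coalitions."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in partitions(rest):
        # Place `first` into each existing coalition in turn...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ...or let it form a new coalition of its own.
        yield [[first]] + partial


def structure_value(structure, value):
    """Social welfare of a coalition structure: sum of its coalition values."""
    return sum(value(frozenset(c)) for c in structure)


def best_structure(agents, value):
    """Exhaustively search all coalition structures (tiny inputs only)."""
    return max(partitions(agents), key=lambda cs: structure_value(cs, value))


# Hypothetical characteristic function: agents a and b work well together;
# every other coalition is worth its size.
SYNERGY = {frozenset("ab"): 6}


def toy_value(coalition):
    return SYNERGY.get(coalition, len(coalition))


agents = ["a", "b", "c"]
best = best_structure(agents, toy_value)
print(best, structure_value(best, toy_value))
```

The number of partitions grows as the Bell numbers (5 for three agents, 15 for four, 52 for five), so exhaustive search is only viable for very small sets, which is what motivates heuristic search.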


Design of Automatically Adaptable Web Wrappers

arXiv.org Artificial Intelligence

Nowadays, the huge amount of information distributed through the Web motivates the study of techniques for extracting relevant data efficiently and reliably. Both academia and enterprises have developed several approaches to Web data extraction, for example using techniques of artificial intelligence or machine learning. Some commonly adopted procedures, namely wrappers, ensure a high degree of precision in the information extracted from Web pages and, at the same time, have to prove robust in order not to compromise the quality and reliability of the data themselves. In this paper we focus on some experimental aspects related to the robustness of the data extraction process and the possibility of automatically adapting wrappers. We discuss the implementation of algorithms for finding similarities between two different versions of a Web page, in order to handle modifications, avoid the failure of data extraction tasks, and ensure the reliability of the information extracted. Our purpose is to evaluate the performance, advantages, and drawbacks of our novel system of automatic wrapper adaptation.


Automatic Wrapper Adaptation by Tree Edit Distance Matching

arXiv.org Artificial Intelligence

Information distributed through the Web keeps growing faster day by day, and for this reason several techniques for extracting Web data have been suggested in recent years. Often, extraction tasks are performed through so-called wrappers, procedures that extract information from Web pages, e.g. by implementing logic-based techniques. Many fields of application today require a strong degree of robustness from wrappers, in order not to compromise assets of information or the reliability of the data extracted. Unfortunately, wrappers may fail to extract data from a Web page if its structure changes, sometimes even slightly. In case of failure, new techniques are therefore required that run automatically to adapt the wrapper to the new structure of the page. In this work we present a novel approach to automatic wrapper adaptation based on measuring the similarity of trees through improved tree edit distance matching techniques.
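Measuring similarity between two versions of a page's tree can be sketched with Simple Tree Matching, a classic top-down matching algorithm used here as a stand-in. The authors' improved technique is not described in the abstract and is not reproduced. Trees are assumed to be `(label, children)` tuples, the sample pages are invented, and the normalisation in `similarity` is an illustrative choice.

```python
def simple_tree_matching(a, b):
    """Number of node pairs in a maximum top-down matching of two ordered trees."""
    label_a, kids_a = a
    label_b, kids_b = b
    if label_a != label_b:
        return 0  # roots differ, so nothing below them can be matched
    m, n = len(kids_a), len(kids_b)
    # table[i][j]: best matching between the first i children of a
    # and the first j children of b, preserving sibling order.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1]
                + simple_tree_matching(kids_a[i - 1], kids_b[j - 1]),
            )
    return table[m][n] + 1  # +1 for the matched roots


def tree_size(tree):
    """Total number of nodes in the tree."""
    return 1 + sum(tree_size(child) for child in tree[1])


def similarity(a, b):
    """Matched nodes divided by the larger tree's size (assumed normalisation)."""
    return simple_tree_matching(a, b) / max(tree_size(a), tree_size(b))


# Invented example: the new page version gains a link and a table.
old_page = ("html", [("body", [("div", [("span", [])]), ("p", [])])])
new_page = (
    "html",
    [("body", [("div", [("span", []), ("a", [])]), ("table", []), ("p", [])])],
)

print(simple_tree_matching(old_page, new_page))
print(round(similarity(old_page, new_page), 3))
```

A wrapper that located its target by a path in `old_page` could use the matched node pairs to find the corresponding element in `new_page` when the similarity stays above some chosen threshold.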


Estimating $\beta$-mixing coefficients

arXiv.org Machine Learning

The literature on statistical learning for time series assumes the asymptotic independence or ``mixing'' of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing rates from data. We give an estimator for the $\beta$-mixing rate based on a single stationary sample path and show it is $L_1$-risk consistent.


The Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning

arXiv.org Machine Learning

We derive an upper bound on the local Rademacher complexity of $\ell_p$-norm multiple kernel learning, which yields a tighter excess risk bound than global approaches. Previous local approaches analyzed the case $p=1$ only, while our analysis covers all cases $1\leq p\leq\infty$, assuming the different feature mappings corresponding to the different kernels to be uncorrelated. We also derive a lower bound showing that the upper bound is tight, and derive consequences regarding excess loss, namely fast convergence rates of the order $O(n^{-\frac{\alpha}{1+\alpha}})$, where $\alpha$ is the minimum eigenvalue decay rate of the individual kernels.


An Agent Based Architecture (Using Planning) for Dynamic and Semantic Web Services Composition in an EBXML Context

arXiv.org Artificial Intelligence

The process-based semantic composition of Web Services is gaining considerable momentum as an approach for the effective integration of distributed, heterogeneous, and autonomous applications. To compose Web Services semantically, we need an ontology. There are several ways of inserting semantics into Web Services. One of them consists of using description languages like OWL-S. In this paper, we introduce our work, which proposes a new model and uses semantic matching technology for the semantic and dynamic composition of ebXML business processes.