Goto

Collaborating Authors

 Country


Learning to Predict Combinatorial Structures

arXiv.org Artificial Intelligence

The major challenge in designing a discriminative learning algorithm for predicting structured data is to address the computational issues arising from the exponential size of the output space. Existing algorithms make different assumptions to ensure efficient, polynomial time estimation of model parameters. For several combinatorial structures, including cycles, partially ordered sets, permutations and other graph classes, these assumptions do not hold. In this thesis, we address the problem of designing learning algorithms for predicting combinatorial structures by introducing two new assumptions: (i) The first assumption is that a particular counting problem can be solved efficiently. The consequence is a generalisation of the classical ridge regression for structured prediction. (ii) The second assumption is that a particular sampling problem can be solved efficiently. The consequence is a new technique for designing and analysing probabilistic structured prediction models. These results can be applied to solve several complex learning problems including but not limited to multi-label classification, multi-category hierarchical classification, and label ranking.


Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

arXiv.org Artificial Intelligence

Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss correspond actually to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. We explore finally an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises.


Fast Set Bounds Propagation Using a BDD-SAT Hybrid

Journal of Artificial Intelligence Research

Binary Decision Diagram (BDD) based set bounds propagation is a powerful approach to solving set-constraint satisfaction problems. However, prior BDD based techniques in- cur the significant overhead of constructing and manipulating graphs during search. We present a set-constraint solver which combines BDD-based set-bounds propagators with the learning abilities of a modern SAT solver. Together with a number of improvements beyond the basic algorithm, this solver is highly competitive with existing propagation based set constraint solvers.


Automatic Music Composition using Answer Set Programming

arXiv.org Artificial Intelligence

Music composition used to be a pen and paper activity. These these days music is often composed with the aid of computer software, even to the point where the computer compose parts of the score autonomously. The composition of most styles of music is governed by rules. We show that by approaching the automation, analysis and verification of composition as a knowledge representation task and formalising these rules in a suitable logical language, powerful and expressive intelligent composition tools can be easily built. This application paper describes the use of answer set programming to construct an automated system, named ANTON, that can compose melodic, harmonic and rhythmic music, diagnose errors in human compositions and serve as a computer-aided composition tool. The combination of harmonic, rhythmic and melodic composition in a single framework makes ANTON unique in the growing area of algorithmic composition. With near real-time composition, ANTON reaches the point where it can not only be used as a component in an interactive composition tool but also has the potential for live performances and concerts or automatically generated background music in a variety of applications. With the use of a fully declarative language and an "off-the-shelf" reasoning engine, ANTON provides the human composer a tool which is significantly simpler, more compact and more versatile than other existing systems. This paper has been accepted for publication in Theory and Practice of Logic Programming (TPLP).


Detecting Danger: The Dendritic Cell Algorithm

arXiv.org Artificial Intelligence

The Dendritic Cell Algorithm (DCA) is inspired by the function of the dendritic cells of the human immune system. In nature, dendritic cells are the intrusion detection agents of the human body, policing the tissue and organs for potential invaders in the form of pathogens. In this research, and abstract model of DC behaviour is developed and subsequently used to form an algorithm, the DCA. The abstraction process was facilitated through close collaboration with laboratory- based immunologists, who performed bespoke experiments, the results of which are used as an integral part of this algorithm. The DCA is a population based algorithm, with each agent in the system represented as an 'artificial DC'. Each DC has the ability to combine multiple data streams and can add context to data suspected as anomalous. In this chapter the abstraction process and details of the resultant algorithm are given. The algorithm is applied to numerous intrusion detection problems in computer security including the detection of port scans and botnets, where it has produced impressive results with relatively low rates of false positives.


Artificial Immune Systems (2010)

arXiv.org Artificial Intelligence

The human immune system has numerous properties that make it ripe for exploitation in the computational domain, such as robustness and fault tolerance, and many different algorithms, collectively termed Artificial Immune Systems (AIS), have been inspired by it. Two generations of AIS are currently in use, with the first generation relying on simplified immune models and the second generation utilising interdisciplinary collaboration to develop a deeper understanding of the immune system and hence produce more complex models. Both generations of algorithms have been successfully applied to a variety of problems, including anomaly detection, pattern recognition, optimisation and robotics. In this chapter an overview of AIS is presented, its evolution is discussed, and it is shown that the diversification of the field is linked to the diversity of the immune system itself, leading to a number of algorithms as opposed to one archetypal system. Two case studies are also presented to help provide insight into the mechanisms of AIS; these are the idiotypic network approach and the Dendritic Cell Algorithm.


Heavy-Tailed Processes for Selective Shrinkage

arXiv.org Machine Learning

Heavy-tailed distributions are frequently used to enhance the robustness of regression and classification methods to outliers in output space. Often, however, we are confronted with "outliers" in input space, which are isolated observations in sparsely populated regions. We show that heavy-tailed stochastic processes (which we construct from Gaussian processes via a copula), can be used to improve robustness of regression and classification estimators to such outliers by selectively shrinking them more strongly in sparse regions than in dense regions. We carry out a theoretical analysis to show that selective shrinkage occurs, provided the marginals of the heavy-tailed process have sufficiently heavy tails. The analysis is complemented by experiments on biological data which indicate significant improvements of estimates in sparse regions while producing competitive results in dense regions.


SPOT: An R Package For Automatic and Interactive Tuning of Optimization Algorithms by Sequential Parameter Optimization

arXiv.org Artificial Intelligence

The sequential parameter optimization (SPOT) package for R is a toolbox for tuning and understanding simulation and optimization algorithms. Model-based investigations are common approaches in simulation and optimization. Sequential parameter optimization has been developed, because there is a strong need for sound statistical analysis of simulation and optimization algorithms. SPOT includes methods for tuning based on classical regression and analysis of variance techniques; tree-based models such as CART and random forest; Gaussian process models (Kriging), and combinations of different meta-modeling approaches. This article exemplifies how SPOT can be used for automatic and interactive tuning.


Understanding Semantic Web and Ontologies: Theory and Applications

arXiv.org Artificial Intelligence

One of the most interesting inventions, in recent decades, is that of Web Services [36]. These are computer program "applications": self-describing, selfcontained applications whose function is to automatically share information over the Internet with other applications. Some weaknesses such as browsing information without taking its meaning into account have recently appeared in Web Services. This creates a need for a new Web with more relevance to the user. Semantic Web is actually an extension of the current one in that it represents information more meaningfully for humans and computers alike. It enables the description of contents and services in machine-readable form, and enables annotating, discovering, publishing, advertising and composing services to be automated. It was developed based on Ontology, which is considered as the backbone of the Semantic Web. In other words, the current Web is transformed from being machine-readable to machineunderstandable. One function of the Web is to build a source of reference for information on several subjects, while the Semantic Web is designed to build a web of meaning.


The State of the Art: Ontology Web-Based Languages: XML Based

arXiv.org Artificial Intelligence

Many formal languages have been proposed to express or represent Ontologies, including RDF, RDFS, DAML OIL and OWL. Most of these languages are based on XML syntax, but with various terminologies and expressiveness. Therefore, choosing a language for building an Ontology is the main step. The main point of choosing language to represent Ontology is based mainly on what the Ontology will represent or be used for. That language should have a range of quality support features such as ease of use, expressive power, compatibility, sharing and versioning, internationalisation. This is because different kinds of knowledge-based applications need different language features. The main objective of these languages is to add semantics to the existing information on the web. The aims of this paper is to provide a good knowledge of existing language and understanding of these languages and how could be used.