Support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. (Wikipedia)
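As a rough illustration of the classification use mentioned above, the following minimal sketch trains a linear SVM by full-batch subgradient descent on the regularised hinge loss. The function names (`train_linear_svm`, `predict`), the hyper-parameter values, and the toy data are assumptions made for illustration; they are not taken from the quoted definition.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent.

    Labels must be +1 or -1. All hyper-parameters are illustrative.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regulariser.
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Only points inside the margin contribute a hinge subgradient.
            if margin < 1.0:
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (hypothetical).
train_points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
train_labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(train_points, train_labels)
```

Practical SVM implementations usually solve the dual problem and support kernels; the subgradient loop here is only meant to make the margin-based objective concrete.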
This paper introduces an enhanced meta-heuristic (ML-ACO) that combines machine learning (ML) and ant colony optimization (ACO) to solve combinatorial optimization problems. To illustrate the underlying mechanism of our enhanced algorithm, we start by describing a test problem -- the orienteering problem -- used to demonstrate the efficacy of ML-ACO. In this problem, the objective is to find a route that visits a subset of vertices in a graph within a time budget to maximize the collected score. In the first phase of our ML-ACO algorithm, an ML model is trained using a set of small problem instances where the optimal solution is known. Specifically, classification models are used to classify an edge as being part of the optimal route, or not, using problem-specific features and statistical measures. We have tested several classification models including graph neural networks, logistic regression and support vector machines. The trained model is then used to predict the probability that an edge in the graph of a test problem instance belongs to the corresponding optimal route. In the second phase, we incorporate the predicted probabilities into the ACO component of our algorithm. Here, the probability values bias sampling toward edges predicted to be of high quality when constructing feasible routes. We empirically show that ML-ACO generates results that are significantly better than those of the standard ACO algorithm, especially when the computational budget is limited. Furthermore, we show our algorithm is robust in the sense that (a) its overall performance is not sensitive to any particular classification model, and (b) it generalizes well to large and real-world problem instances. Our approach of integrating ML with a meta-heuristic is generic and can be applied to a wide range of combinatorial optimization problems.
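The second phase described above can be sketched as a single sampling step in which an ant picks its next vertex. This is a minimal sketch under stated assumptions: the abstract does not say exactly how the predicted probabilities enter the ACO sampling rule, so the multiplicative weighting, the exponents `alpha` and `beta`, and the name `choose_next_vertex` are all hypothetical. Feasibility checks against the time budget are omitted.

```python
import random


def choose_next_vertex(current, candidates, pheromone, predicted_prob,
                       rng, alpha=1.0, beta=1.0):
    """Sample the next vertex with weight pheromone^alpha * probability^beta.

    `pheromone` and `predicted_prob` map an edge (u, v) to a non-negative
    number. The multiplicative combination is an assumption made for
    illustration, not the rule confirmed by the paper.
    """
    weights = [
        (pheromone[(current, v)] ** alpha) * (predicted_prob[(current, v)] ** beta)
        for v in candidates
    ]
    if sum(weights) == 0:
        # Fall back to uniform sampling if every candidate has zero weight.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical three-vertex example: the model favours edge (0, 1).
pheromone = {(0, 1): 1.0, (0, 2): 1.0}
predicted_prob = {(0, 1): 0.9, (0, 2): 0.1}
rng = random.Random(0)
picks = [choose_next_vertex(0, [1, 2], pheromone, predicted_prob, rng)
         for _ in range(2000)]
```

With equal pheromone on both edges, the sampling frequencies follow the predicted probabilities, which is the biasing effect the abstract describes.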
Computers have been used to analyze and create music since they were first introduced in the 1950s and 1960s. Beginning in the late 1990s, the rise of the Internet and large-scale platforms for music recommendation and retrieval have made music an increasingly prevalent domain of machine learning and artificial intelligence research. While the field is still nascent, several different approaches have been employed to tackle what may broadly be referred to as "musical intelligence." This article provides a definition of musical intelligence, introduces a taxonomy of its constituent components, and surveys the wide range of AI methods that can be, and have been, brought to bear in its pursuit, with a particular emphasis on machine learning methods.
According to the no-free-lunch theorem, there is no single meta-heuristic algorithm that can optimally solve all optimization problems. This motivates many researchers to continuously develop new optimization algorithms. In this paper, a novel nature-inspired meta-heuristic optimization algorithm called virus spread optimization (VSO) is proposed. VSO loosely mimics the spread of viruses among hosts, and can be effectively applied to solving many challenging continuous optimization problems. We devise a new representation scheme and viral operations that are radically different from previously proposed virus-based optimization algorithms. First, the viral RNA of each host in VSO denotes a potential solution, to which different viral operations are applied to diversify the search strategies and thereby enhance the solution quality. In addition, an imported infection mechanism, inheriting the searched optima from another colony, is introduced to help avoid premature convergence of any potential solution when solving complex problems. VSO has an excellent capability to conduct adaptive neighborhood searches around the discovered optima for achieving better solutions. Furthermore, with a flexible infection mechanism, VSO can quickly escape from local optima. To clearly demonstrate both its effectiveness and efficiency, VSO is critically evaluated on a series of well-known benchmark functions. Moreover, the applicability of VSO is validated on two real-world examples: financial portfolio optimization and hyper-parameter optimization of support vector machines for classification problems. The results show that VSO attains superior performance in terms of solution fitness, convergence rate, scalability, reliability, and flexibility when compared with conventional as well as state-of-the-art meta-heuristic optimization algorithms.
This has led to the development of a plethora of domain-dependent and context-specific methods for dealing with the interpretation of machine learning (ML) models and the formation of explanations for humans. Unfortunately, this trend is far from being over, with an abundance of knowledge in the field which is scattered and needs organisation. The goal of this article is to systematically review research works in the field of XAI and to try to define some boundaries in the field. From several hundreds of research articles focused on the concept of explainability, about 350 have been considered for review by using the following search methodology. In a first phase, Google Scholar was queried to find papers related to "explainable artificial intelligence", "explainable machine learning" and "interpretable machine learning". Subsequently, the bibliographic section of these articles was thoroughly examined to retrieve further relevant scientific studies. The first noticeable thing, as shown in figure 2 (a), is the distribution of the publication dates of selected research articles: sporadic in the 70s and 80s, receiving preliminary attention in the 90s, showing rising interest in the 2000s and becoming a recognised body of knowledge after 2010. The first research concerned the development of an explanation-based system and its integration in a computer program designed to help doctors make diagnoses. Some of the more recent papers focus on work devoted to the clustering of methods for explainability, motivating the need for organising the XAI literature [4, 5, 6].
The overarching goal of Explainable AI is to develop systems that not only exhibit intelligent behaviours, but also are able to explain their rationale and reveal insights. In explainable machine learning, methods that produce a high level of prediction accuracy as well as transparent explanations are valuable. In this work, we present an explainable classification method. Our method works by first constructing a symbolic Knowledge Base from the training data, and then performing probabilistic inferences on this Knowledge Base with linear programming. Our approach achieves a level of learning performance comparable to that of traditional classifiers such as random forests, support vector machines and neural networks. It identifies decisive features that are responsible for a classification as explanations and produces results similar to the ones found by SHAP, a state-of-the-art Shapley-value-based method. Our algorithms perform well on a range of synthetic and non-synthetic data sets.
Learning a sequence of tasks is a long-standing challenge in machine learning. This setting applies to learning systems that observe examples of a range of tasks at different points in time. A learning system should become more knowledgeable as more related tasks are learned. Although the problem of learning sequentially was first acknowledged decades ago, research in this area has been rather limited. Research in transfer learning, multitask learning, metalearning and deep learning has studied some challenges of these kinds of systems. Recent research in lifelong machine learning and continual learning has revived interest in this problem. We propose Proficiente, a full framework for long-term learning systems. Proficiente relies on knowledge transferred between hypotheses learned with Support Vector Machines. The first component of the framework is focused on transferring forward selectively from a set of existing hypotheses or functions representing knowledge acquired during previous tasks to a new target task. A second component of Proficiente is focused on transferring backward, a novel ability of long-term learning systems that aims to exploit knowledge derived from recent tasks to encourage refinement of existing knowledge. We propose a method that transfers selectively from a task learned recently to existing hypotheses representing previous tasks. The method encourages retention of existing knowledge whilst refining it. We analyse the theoretical properties of the proposed framework. Proficiente is accompanied by an agnostic metric that can be used to determine if a long-term learning system is becoming more knowledgeable. We evaluate Proficiente on both synthetic and real-world datasets, and demonstrate scenarios where knowledgeable supervised learning systems can be achieved by means of transfer.
Local learning methods are a popular class of machine learning algorithms. The basic idea for the entire class is to choose some non-local model family, to train many of them on small sections of neighboring data, and then to "stitch" the resulting models together in some way. Due to the limits of constraining a training dataset to a small neighborhood, research on locally-learned models has largely been restricted to simple model families. Also, since simple model families have no complex structure by design, this has limited the use of the individual local models to predictive tasks. We hypothesize that, using a sufficiently complex local model family, various properties of the individual local models, such as their learned parameters, can be used as features for further learning. This dissertation improves upon the current state of research and works toward establishing this hypothesis by investigating algorithms for localization of more complex model families and by studying their applications beyond predictions as a feature extraction mechanism. We summarize this generic technique of using local models as a feature extraction step with the term "local model feature transformations." In this document, we extend the local modeling paradigm to Gaussian processes, orthogonal quadric models and word embedding models, and extend the existing theory for localized linear classifiers. We then demonstrate applications of local model feature transformations to epileptic event classification from EEG readings, activity monitoring via chest accelerometry, 3D surface reconstruction, 3D point cloud segmentation, handwritten digit classification and event detection from Twitter feeds.
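The idea of using learned parameters of local models as features can be sketched as follows. This minimal example deliberately uses a very simple model family (a least-squares line fitted on each sliding window of a one-dimensional signal) purely to make the mechanism concrete; the dissertation itself targets more complex families such as Gaussian processes. The function names, the window size, and the toy signal are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def local_model_features(xs, ys, window=5):
    """Fit one local model per sliding window and return its parameters.

    Each feature vector is the (slope, intercept) pair learned on one
    neighborhood of the data; these can feed a downstream learner.
    """
    features = []
    for start in range(len(xs) - window + 1):
        neighborhood = slice(start, start + window)
        features.append(fit_line(xs[neighborhood], ys[neighborhood]))
    return features


# Hypothetical signal that rises and then falls.
xs = list(range(10))
ys = [0, 1, 2, 3, 4, 4, 3, 2, 1, 0]
features = local_model_features(xs, ys, window=5)
```

In this toy signal the local slopes change sign between the first and last windows, so a downstream classifier could separate rising from falling segments using the local parameters alone.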
Research in the field of supervised learning algorithms implicitly assumes that training data is labeled by domain experts or at least semi-professional labelers accessible through crowdsourcing services like Amazon Mechanical Turk. With the advent of the Internet, data has become abundant and a large number of machine learning based systems have started being trained with user-generated data, using categorical data as true labels. However, little work has been done in the area of supervised learning with user-defined labels where users are not necessarily experts and might be motivated to provide incorrect labels in order to improve their own utility from the system. In this article, we propose two types of classes in user-defined labels, subjective and objective, and show that the objective classes are as reliable as if they were provided by domain experts, whereas the subjective classes are subject to bias and manipulation by the user. We define this as the subjective class issue and provide a framework for detecting subjective labels in a dataset without querying an oracle. Using this framework, data mining practitioners can detect a subjective class at an early stage of their projects, and avoid wasting time and resources by tackling a subjective class problem with traditional machine learning techniques.
Automatic machine learning (AML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects of the machine learning pipeline like model selection, hyperparameter tuning, and feature selection, relatively few works have focused on automatic data augmentation. Automatic data augmentation involves finding new features relevant to the user's predictive task with minimal "human-in-the-loop" involvement. We present an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented dataset such that training a predictive model on this augmented dataset results in improved performance. Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join. We perform an extensive empirical evaluation of different system components and benchmark our feature selection algorithm on real-world datasets.
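The two components described above (join, then prune) can be sketched on a toy example. This is only an illustration under stated assumptions: the abstract does not specify the join strategy or the feature selection algorithm, so the key-based left join and the correlation filter used here are stand-in techniques, and all names (`left_join`, `pearson`, `select_features`, the column names, the threshold) are hypothetical. Every input row is assumed to have a match in the repository table.

```python
def left_join(base_rows, repo_rows, key):
    """Attach repository columns to each base row that shares the same key."""
    index = {row[key]: row for row in repo_rows}
    joined = []
    for row in base_rows:
        merged = dict(row)
        for name, value in index.get(row[key], {}).items():
            if name != key:
                merged.setdefault(name, value)  # never overwrite input columns
        joined.append(merged)
    return joined


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_features(rows, target, candidates, threshold=0.5):
    """Keep candidate columns whose |correlation| with the target is high.

    A simple filter used as a placeholder for the paper's algorithm.
    """
    ys = [row[target] for row in rows]
    return [name for name in candidates
            if abs(pearson([row[name] for row in rows], ys)) >= threshold]


# Hypothetical input dataset and repository table.
base = [{"id": i, "y": float(i)} for i in range(1, 6)]
repo = [{"id": i, "signal": 2.0 * i, "noise": 3.0 if i % 2 else 1.0}
        for i in range(1, 6)]
augmented = left_join(base, repo, key="id")
kept = select_features(augmented, target="y", candidates=["signal", "noise"])
```

The join adds both repository columns, and the filter then drops the one that carries no linear relationship with the target.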
In this paper, we propose a novel method for transforming data into a low-dimensional space optimized for one-class classification. The proposed method iteratively transforms data into a new subspace optimized for ellipsoidal encapsulation of target class data. We provide both linear and non-linear formulations for the proposed method. The method takes into account the covariance of the data in the subspace; hence, it yields a more generalized solution as compared to Subspace Support Vector Data Description for a hypersphere. We propose different regularization terms expressing the class variance in the projected space. We compare the results with classic and recently proposed one-class classification methods and achieve better results in the majority of cases. The proposed method is also observed to converge much faster than the recently proposed Subspace Support Vector Data Description.
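To make the contrast between hyperspherical and ellipsoidal encapsulation concrete, the sketch below compares a squared Euclidean distance with a squared Mahalanobis distance computed from the target class covariance in two dimensions. This is not the proposed method (no subspace learning or iterative optimization is shown); it is a stand-in illustration of why accounting for covariance changes which points look like outliers. All names, the ridge term, and the toy data are assumptions.

```python
def mean_and_covariance(points):
    """Return the 2D mean and the covariance entries (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def euclidean_sq(point, mean):
    """Squared distance used by a hypersphere-style description."""
    return (point[0] - mean[0]) ** 2 + (point[1] - mean[1]) ** 2


def mahalanobis_sq(point, mean, cov, ridge=1e-6):
    """Squared distance used by an ellipsoid aligned with the covariance."""
    sxx, sxy, syy = cov
    sxx, syy = sxx + ridge, syy + ridge  # small ridge keeps the matrix invertible
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Closed-form inverse of the 2x2 covariance matrix applied to (dx, dy).
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical target class, elongated along the first axis.
target = [(-4, 0), (-2, 0.5), (0, 0), (2, -0.5), (4, 0), (0, 0.5), (0, -0.5)]
mean, cov = mean_and_covariance(target)
along_axis, off_axis = (3, 0), (0, 3)
```

Both test points are equally far from the mean in the Euclidean sense, yet the Mahalanobis distance ranks the off-axis point as far more atypical, which is the kind of distinction an ellipsoidal boundary can express and a spherical one cannot.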