 Support Vector Machines


Autism Spectrum Disorder Classification using Graph Kernels on Multidimensional Time Series

arXiv.org Machine Learning

We present an approach to model time series data from resting state fMRI for autism spectrum disorder (ASD) severity classification. We propose to adopt kernel machines and employ graph kernels that define a kernel dot product between two graphs. This enables us to take advantage of spatio-temporal information to capture the dynamics of the brain network, as opposed to aggregating it along the spatial or temporal dimension. In addition to the conventional similarity graphs, we explore the use of the L1 graph obtained by sparse coding, and the persistent homology of time delay embeddings, in the proposed pipeline for ASD classification. In our experiments on two datasets from the ABIDE collection, we demonstrate a consistent and significant advantage in using graph kernels over traditional linear or nonlinear kernels for a variety of time series features.


Predicting the Higgs-Boson Signal

@machinelearnbot

The Higgs boson is a landmark discovery that will help us understand the basic nature of the universe. It was first discovered by the ATLAS experiment at the Large Hadron Collider, CERN, in 2012. The Higgs boson decays into two tau particles, giving rise to a small signal buried in background noise. The goal of the Higgs Boson Machine Learning Challenge was to classify the events detected by ATLAS into "tau tau decay of a Higgs boson" versus "background." The first step was to analyze the data and look for missingness. We found that the missing values follow an interesting pattern that depends on the column "PRI_jet_column", the number of jets, which takes integer values of 0, 1, 2, or 3, with larger values capped at 3. Jets are the experimental signatures of quarks and gluons produced in high-energy processes such as head-on proton-proton collisions. For "PRI_jet_column" 0, there were 10 columns having NULL values (-999); these are the columns that describe jets, which do not exist when the jet count is 0. For example, "DER_mass_jet_jet" is the invariant mass (20) of the two jets (undefined if PRI jet num ≤ 1). So it does not make sense to take the attributes of the jet(s) into account, since they don't exist. For "PRI_jet_column" 1, there were 7 columns having NULL values; they describe quantities that are defined only when there are two jets, so we deleted these 7 columns. For "PRI_jet_column" 2 or 3, we did not delete any columns.
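The column-dropping step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `split_by_jet_count`, the tiny data frame, and all of its values are hypothetical, and the jet-count column is passed as a parameter so that whatever name the real data uses can be substituted.

```python
import pandas as pd

MISSING = -999.0  # sentinel the challenge data uses for undefined values


def split_by_jet_count(df, jet_col="PRI_jet_column"):
    """Split rows by jet count and drop columns that are entirely missing.

    Returns a dict mapping each jet count to its own sub-frame.
    """
    subsets = {}
    for jets, group in df.groupby(jet_col):
        # A column is dropped only if every row in this group is the sentinel.
        all_missing = [c for c in group.columns if (group[c] == MISSING).all()]
        subsets[int(jets)] = group.drop(columns=all_missing)
    return subsets


# Hypothetical toy data: three feature columns, made-up values.
df = pd.DataFrame({
    "PRI_jet_column": [0, 0, 1, 2],
    "DER_mass_MMC": [120.0, 95.0, 110.0, 130.0],
    "PRI_jet_leading_pt": [MISSING, MISSING, 45.0, 60.0],
    "DER_mass_jet_jet": [MISSING, MISSING, MISSING, 250.0],
})

subsets = split_by_jet_count(df)
for jets, sub in subsets.items():
    print(jets, list(sub.columns))
```

Splitting first and dropping afterwards means each subset keeps only the features that are physically defined for its jet count, so a separate model can be trained per subset without treating -999 as a real measurement.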


How to choose machine learning algorithms

#artificialintelligence

The answer to the question "What machine learning algorithm should I use?" is always "It depends." It depends on the size, quality, and nature of the data. It depends what you want to do with the answer. It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have. Even the most experienced data scientists can't tell which algorithm will perform best before trying them. The Microsoft Azure Machine Learning Algorithm Cheat Sheet helps you choose the right machine learning algorithm for your predictive analytics solutions from the Microsoft Azure Machine Learning library of algorithms.


Will AI replace judges and lawyers?

#artificialintelligence

Recent advances in Natural Language Processing and Machine Learning provide us with the tools to build predictive models that can be used to unveil patterns driving judicial decisions. This can be useful, for both lawyers and judges, as an assisting tool to rapidly identify cases and extract patterns which lead to certain decisions. This paper presents the first systematic study on predicting the outcome of cases tried by the European Court of Human Rights based solely on textual content. We formulate a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights. Textual information is represented using contiguous word sequences, i.e.


Image Recognition and Object Detection : Part 1

#artificialintelligence

Before a classification algorithm can do its magic, we need to train it by showing it thousands of examples of cats and backgrounds. Different learning algorithms learn differently, but the general principle is that learning algorithms treat feature vectors as points in a higher-dimensional space and try to find planes or surfaces that partition that space so that all examples belonging to the same class are on one side of the plane or surface. To simplify things, let us look at one learning algorithm, the Support Vector Machine (SVM), in some detail. The SVM is one of the most popular supervised binary classification algorithms. Although the ideas used in SVM have been around since 1963, the current version was proposed in 1995 by Cortes and Vapnik. In the previous step, we learned that the HOG descriptor of an image is a feature vector of length 3780. We can think of this vector as a point in a 3780-dimensional space. Visualizing higher-dimensional space is impossible, so let us simplify things a bit and imagine the feature vector was just two dimensional. In our simplified world, we now have 2D points representing the two classes (e.g.
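To make the two-dimensional picture concrete, here is a minimal sketch of a linear SVM separating two classes of 2D points. It assumes a soft-margin hinge loss minimised by plain subgradient descent, which is a simplification chosen for readability and not necessarily the solver used in the article; the function names, hyperparameters, and the six toy points are all hypothetical.

```python
import numpy as np


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a separating line w.x + b = 0 by subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # points inside the margin or misclassified
        # Only active points contribute to the hinge-loss subgradient.
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Return +1 or -1 depending on which side of the line each point falls."""
    return np.where(X @ w + b >= 0, 1, -1)


# Hypothetical toy data: two well-separated clusters of 2D points.
X = np.array([[-2.0, -2.0], [-1.5, -2.5], [-2.5, -1.5],
              [2.0, 2.0], [1.5, 2.5], [2.5, 1.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("predictions:", predict(X, w, b))
```

In the real setting each row of `X` would be a 3780-dimensional HOG descriptor instead of a 2D point, and the learned line becomes a hyperplane in that space; the training loop itself does not change.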


Algebraic multigrid support vector machines

arXiv.org Machine Learning

The support vector machine is a flexible optimization-based technique widely used for classification problems. In practice, training becomes computationally expensive on large-scale data sets because of the complexity and number of iterations of parameter-fitting methods, the underlying optimization solvers, and the nonlinearity of kernels. We introduce a fast multilevel framework for solving support vector machine models that is inspired by algebraic multigrid. A significant improvement in running time has been achieved without any loss in quality. The proposed technique is highly beneficial on imbalanced sets. We demonstrate computational results on publicly available and industrial data sets.


An Efficient Training Algorithm for Kernel Survival Support Vector Machines

arXiv.org Machine Learning

Survival analysis is a fundamental tool in medical research to identify predictors of adverse events and develop systems for clinical decision support. In order to leverage large amounts of patient data, efficient optimisation routines are paramount. We propose an efficient training algorithm for the kernel survival support vector machine (SSVM). We directly optimise the primal objective function and employ truncated Newton optimisation and order statistic trees to significantly lower computational costs compared to previous training algorithms, which require $O(n^4)$ space and $O(p n^6)$ time for datasets with $n$ samples and $p$ features. Our results demonstrate that our proposed optimisation scheme allows analysing data of a much larger scale with no loss in prediction performance. Experiments on synthetic and 5 real-world datasets show that our technique outperforms existing kernel SSVM formulations if the amount of right censoring is high ($\geq85\%$), and performs comparably otherwise.


One-Class SVM with Privileged Information and its Application to Malware Detection

arXiv.org Machine Learning

A number of important applied problems in engineering, finance and medicine can be formulated as a problem of anomaly detection based on one-class classification. A classical approach to this problem is to describe a normal state using a one-class support vector machine. Then, to detect anomalies, we quantify the distance from a new observation to the constructed description of the normal class. In this paper we present a new approach to one-class classification. We formulate a new problem statement and a corresponding algorithm that allow taking into account privileged information during the training phase. We evaluate the performance of the proposed approach using synthetic datasets, as well as the publicly available Microsoft Malware Classification Challenge dataset. Anomaly detection refers to the problem of finding patterns in data that do not conform to an expected behaviour.


Machine-Learning Discovery And Design Of Membrane-Active Peptides For Biomedicine

#artificialintelligence

There are approximately 1,100 known antimicrobial peptides (AMP) with diverse sequences that can permeate microbial membranes. To help discover the "blueprint" for natural AMP sequences, researchers from the University of Illinois at Urbana-Champaign and the University of California, Los Angeles, have developed a new machine learning approach to discover and design alpha-helical membrane-active peptides based on their physicochemical properties. "In this work, we have trained a machine learning classifier, known as a support vector machine, to recognize membrane activity and experimentally calibrated the recognition metric by peptide synthesis and characterization," explained Andrew Ferguson, an assistant professor of materials science and engineering at Illinois. "We use machine learning to not only discover new membrane-active peptides, but to also identify membrane activity in known peptides with previously defined functions, leading us to discover membrane activity in diverse and unexpected peptide families." "Since getting cargo into a cell is important for many applications, we anticipate that this tool can have broad biomedical implications including in immunotherapy and in broad-spectrum membrane-active antimicrobial peptides to combat the rising incidence of drug resistance, design of cationic cell-penetrating peptides for nucleic acid transfection into cells, and in targeting and permeating anticancer therapeutics into tumors," added Ferguson, who was the senior computational investigator for the project. In this collaborative work, the Illinois researchers developed the computational innovations, with the experimental testing of the predictions accomplished at UCLA. The results, which highlight the difference between the efficacy of an antimicrobial and its recognizability as such, are surprising. "AMPs do not share a common core structure, but tend to be short, cationic, and amphiphilic," Ferguson said.
"By training our machine learning classifier over a training set comprising peptides with known antimicrobial activity (hits) and decoy peptides with no activity (misses), the classifier learned the physical and chemical properties of a peptide that make for good membrane activity.


A Multi-Modal Graph-Based Semi-Supervised Pipeline for Predicting Cancer Survival

arXiv.org Machine Learning

Cancer survival prediction is an active area of research that can help prevent unnecessary therapies and improve patients' quality of life. Gene expression profiling is widely used in cancer studies to discover informative biomarkers that aid the prediction of different clinical endpoints. We use multiple modalities of data derived from RNA deep-sequencing (RNA-seq) to predict the survival of cancer patients. Despite the wealth of information available in expression profiles of cancer tumors, fulfilling this objective remains a big challenge, for the most part due to the paucity of data samples compared to the high dimension of the expression profiles. As such, analysis of transcriptomic data modalities calls for state-of-the-art big-data analytics techniques that can maximally use all the available data to discover the relevant information hidden within a significant amount of noise. In this paper, we propose a pipeline that predicts cancer patients' survival by exploiting the structure of the input (manifold learning) and by leveraging the unlabeled samples using Laplacian support vector machines, a graph-based semi-supervised learning (GSSL) paradigm. We show that under certain circumstances, no single modality per se will result in the best accuracy, and that by fusing different models together via a stacked generalization strategy, we may boost the accuracy synergistically. We apply our approach to two cancer datasets and present promising results. We maintain that a similar pipeline can be used for predictive tasks where labeled samples are expensive to acquire.