 Support Vector Machines


Machine Learning in JavaScript: A Compilation of Resources

@machinelearnbot

Encog's machine learning framework in JavaScript: Encog is a machine learning framework available for Java, .Net, and C++. Encog supports different learning algorithms such as Bayesian Networks, Hidden Markov Models and Support Vector Machines. However, its main strength lies in its neural network algorithms. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Multithreading is used to allow optimal training performance on multicore machines.


Development of a hybrid learning system based on SVM, ANFIS and domain knowledge: DKFIS

arXiv.org Machine Learning

This paper presents the development of a hybrid learning system based on Support Vector Machines (SVM), Adaptive Neuro-Fuzzy Inference System (ANFIS) and domain knowledge to solve a prediction problem. The proposed two-stage Domain Knowledge based Fuzzy Information System (DKFIS) improves the prediction accuracy attained by ANFIS alone. The proposed framework has been implemented on a noisy and incomplete dataset acquired from a hydrocarbon field located in the western part of India. Here, oil saturation has been predicted from four different well logs, i.e. gamma ray, resistivity, density, and clay volume. In the first stage, the input vector is classified using SVM into two classes (Class 0 and Class 1), depending on whether the oil saturation level is zero or near zero, or non-zero. The classification results have been further fine-tuned by applying expert knowledge based on the relationship between the predictor variables (the well logs) and the target variable (oil saturation). In the second stage, an ANFIS is designed to predict non-zero (Class 1) oil saturation values from the predictor logs. The predicted output has been further refined based on expert knowledge. It is apparent from the experimental results that expert intervention with qualitative judgment at each stage has brought the predictions into feasible and realistic ranges. The performance analysis of the prediction in terms of four performance metrics, namely correlation coefficient (CC), root mean square error (RMSE), absolute error mean (AEM), and scatter index (SI), has established DKFIS as a useful tool for reservoir characterization.
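The four metrics named in the abstract can be sketched as follows. The abstract does not define them, so the formulas here are assumptions based on common conventions: CC as the Pearson correlation, AEM as the mean of absolute errors, and SI as RMSE divided by the mean of the observed values. The function names and the sample values are hypothetical.

```python
import math

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def absolute_error_mean(y_true, y_pred):
    """Mean of the absolute errors (assumed meaning of AEM)."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def scatter_index(y_true, y_pred):
    """RMSE normalized by the mean observation (assumed meaning of SI)."""
    return rmse(y_true, y_pred) / (sum(y_true) / len(y_true))

# Hypothetical oil saturation values, for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

print(correlation_coefficient(observed, predicted))
print(rmse(observed, predicted))
print(absolute_error_mean(observed, predicted))
print(scatter_index(observed, predicted))
```

Higher CC and lower RMSE, AEM and SI indicate a better fit under these definitions.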


A novel multiclassSVM based framework to classify lithology from well logs: a real-world application

arXiv.org Machine Learning

Support vector machines (SVMs) have been recognized as a potential tool for supervised classification analyses in different domains of research. In essence, SVM is a binary classifier. Therefore, in the case of a multiclass problem, the problem is divided into a series of binary problems which are solved by binary classifiers, and finally the classification results are combined following either the one-against-one or the one-against-all strategy. In this paper, an attempt has been made to classify lithology with a multiclass SVM based framework, using well logs as predictor variables. Here, the lithology is classified into four classes (sand, shaly sand, sandy shale and shale) based on the relative values of sand and shale fractions as suggested by an expert geologist. The available dataset, consisting of well logs (gamma ray, neutron porosity, density, and P-sonic) and class information from four closely spaced wells from an onshore hydrocarbon field, is divided into training and testing sets. We have used the one-against-all strategy to combine the results of multiple binary classifiers. The reported results establish the superiority of multiclass SVM over other classifiers in terms of classification accuracy. The selection of the kernel function and associated parameters has also been investigated here. It can be envisaged from the results achieved in this study that the proposed framework based on multiclass SVM can further be used to solve classification problems. In future research, seismic attributes can be introduced in the framework to classify the lithology throughout a study area from seismic inputs.
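The two combination strategies mentioned above can be sketched in a few lines. This is only the combining step, not an SVM: it assumes the binary classifiers have already been trained and have produced either one decision value per class (one-against-all) or one winning label per pair of classes (one-against-one). The function names and all numbers are hypothetical.

```python
from collections import Counter

def one_against_all(scores):
    """Pick the class whose 'this class vs. the rest' classifier
    returned the largest decision value."""
    return max(scores, key=scores.get)

def one_against_one(pairwise_winners):
    """Pick the class that wins the most pairwise contests."""
    return Counter(pairwise_winners).most_common(1)[0][0]

# Hypothetical decision values from four 'class vs. rest' classifiers.
scores = {"sand": -0.4, "shaly sand": 0.9, "sandy shale": 0.2, "shale": -1.3}

# Hypothetical winners of the six pairwise classifiers for four classes.
winners = ["shaly sand", "sand", "sand",
           "shaly sand", "shaly sand", "sandy shale"]

print(one_against_all(scores))
print(one_against_one(winners))
```

With K classes, one-against-all needs K binary classifiers, while one-against-one needs K(K-1)/2, which is six for the four lithology classes here.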


A Novel Framework based on SVDD to Classify Water Saturation from Seismic Attributes

arXiv.org Machine Learning

Water saturation is an important property in the reservoir engineering domain. Thus, satisfactory classification of water saturation from seismic attributes is beneficial for reservoir characterization. However, the diverse and non-linear nature of subsurface attributes makes the classification task difficult. In this context, this paper proposes a novel classification framework based on generalized Support Vector Data Description (SVDD) to classify water saturation into two classes (Class high and Class low) from three seismic attributes: seismic impedance, amplitude envelope, and seismic sweetness. G-metric means and program execution time are used to quantify the performance of the proposed framework along with established supervised classifiers. The documented results imply that the proposed framework is superior to existing classifiers. The present study is envisioned to contribute to further reservoir modeling.
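The abstract does not define the G-metric mean. A minimal sketch, assuming it refers to the usual geometric mean of sensitivity and specificity for a two-class problem, could look like this. The function name, the labels and the sample predictions are hypothetical.

```python
import math

def g_mean(y_true, y_pred, positive="high"):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical labels: four 'high' and four 'low' water saturation samples.
actual = ["high", "high", "high", "high", "low", "low", "low", "low"]
predicted = ["high", "high", "high", "low", "low", "low", "high", "high"]

print(g_mean(actual, predicted))
```

Unlike plain accuracy, this score drops to zero when either class is never predicted correctly, which makes it a common choice for imbalanced classes.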


Learning with Hierarchical Gaussian Kernels

arXiv.org Machine Learning

Although kernel methods such as support vector machines are among the state-of-the-art methods when it comes to fully automated learning, see e.g. the recent independent comparison [7], recent years have shown that on complex datasets such as image, speech and video data, they clearly fall short compared to deep neural networks. One possible explanation for this superior behavior is certainly their deep architecture that makes it possible to represent highly complex functions with relatively few parameters. In particular, it is possible to amplify or suppress certain dimensions or features of the input data, or to combine features to new, more abstract features. Compared to this, standard kernels such as the popular Gaussian kernels simply treat every feature equally. In addition, most users of kernel machines probably stick to the very few standard kernels, often simply because there is in most cases no principled way for finding problem specific kernels.
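To illustrate what "treating every feature equally" means, the sketch below compares a standard Gaussian kernel with a variant that scales each feature by its own weight. The weighted variant is a generic illustration and is not the hierarchical kernel proposed in the paper. The parameterization exp(-||x - y||^2 / gamma^2) is one common convention and is assumed here, as are the function names.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """Standard Gaussian kernel: every feature contributes equally."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / gamma ** 2)

def weighted_gaussian_kernel(x, y, weights, gamma=1.0):
    """Gaussian kernel on rescaled features: a large weight amplifies
    a feature, a weight of zero suppresses it entirely."""
    sq_dist = sum((w * (a - b)) ** 2 for w, a, b in zip(weights, x, y))
    return math.exp(-sq_dist / gamma ** 2)

x = (0.0, 0.0)
y = (0.0, 5.0)

# The points differ only in the second feature.
print(gaussian_kernel(x, y))                       # close to 0
print(weighted_gaussian_kernel(x, y, (1.0, 0.0)))  # second feature ignored
```

With all weights equal to 1 the two kernels coincide, so the weights are the only place where feature-specific behavior enters.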


Implementing your own k-nearest neighbour algorithm using Python

#artificialintelligence

In machine learning, you may often wish to build predictors that classify things into categories based on some set of associated values. For example, it is possible to provide a diagnosis to a patient based on data from previous patients. Many algorithms have been developed for automated classification, and common ones include random forests, support vector machines, Naïve Bayes classifiers, and many types of neural networks. To get a feel for how classification works, we take a simple example of a classification algorithm, k-Nearest Neighbours (kNN), and build it from scratch in Python 2. If you are starting with Python, you can keep things simple by using a mostly imperative style of coding rather than a declarative/functional one with lambda functions and list comprehensions. Here, we will provide an introduction to the latter approach.
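A minimal sketch of kNN in the functional style described above, using a lambda and comprehensions. It is not the article's code: it is written for Python 3 rather than Python 2, assumes Euclidean distance and majority voting, and uses a hypothetical toy dataset of two measurements per patient.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points. `train` is a list of (features, label) pairs."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical data from previous patients.
train = [
    ((1.0, 1.0), "healthy"),
    ((1.2, 0.8), "healthy"),
    ((0.9, 1.1), "healthy"),
    ((4.0, 4.2), "ill"),
    ((4.1, 3.9), "ill"),
    ((3.8, 4.0), "ill"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 4.0)))
```

An odd k avoids tied votes in a two-class problem. Sorting the whole training set is fine for a toy example but scales poorly to large datasets.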


Top 10 Amazon Books in Artificial Intelligence & Machine Learning - 2016 Edition

#artificialintelligence

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more.


Mastering Machine Learning with scikit-learn

#artificialintelligence

If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential. This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models.


Auditing Black-box Models for Indirect Influence

arXiv.org Machine Learning

Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior, and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models, or asserting that certain problematic attributes (like race or gender) are not unduly influencing decisions. In this paper, we present a technique for auditing black-box models, which lets us study the extent to which existing models take advantage of particular features in the dataset, without knowing how the models work. Our work focuses on the problem of indirect influence: how some features might indirectly influence outcomes via other, related features. As a result, we can find attribute influences even in cases where, upon further direct examination of the model, the attribute is not referred to by the model at all. Our approach does not require the black-box model to be retrained. This is important if (for example) the model is only accessible via an API, and contrasts our work with other methods that investigate feature influence like feature selection. We present experimental evidence for the effectiveness of our procedure using a variety of publicly available datasets and models. We also validate our procedure using techniques from interpretable learning and feature selection, as well as against other black-box auditing procedures.


Stock Price Prediction With Big Data and Machine Learning - Eugene Zhulenev

#artificialintelligence

This post is based on the paper "Modeling high-frequency limit order book dynamics with support vector machines". Roughly speaking, I'm implementing the ideas introduced in this paper in Scala with Spark and Spark MLlib. The authors use sampling; I'm going to use the full order log from NYSE (sample data is available from the NYSE FTP), just because I can easily do it with Spark. Instead of using SVM, I'm going to use the Decision Tree algorithm for classification, because in Spark MLlib it supports multiclass classification out of the box. If you want to get a deep understanding of the problem and the proposed solution, you need to read the paper.