Goto

Collaborating Authors

 Support Vector Machines


Cryptocurrency Price Prediction and Trading Strategies Using Support Vector Machines

arXiv.org Machine Learning

Few assets in financial history have been as notoriously volatile as cryptocurrencies. While the long term outlook for this asset class remains unclear, we are successful in making short term price predictions for several major crypto assets. Using historical data from July 2015 to November 2019, we develop a large number of technical indicators to capture patterns in the cryptocurrency market. We then test various classification methods to forecast short-term future price movements based on these indicators. On both PPV and NPV metrics, our classifiers do well in identifying up and down market moves over the next 1 hour. Beyond evaluating classification accuracy, we also develop a strategy for translating 1-hour-ahead class predictions into trading decisions, along with a backtester that simulates trading in a realistic environment. We find that support vector machines yield the most profitable trading strategies, which outperform the market on average for Bitcoin, Ethereum and Litecoin over the past 22 months, since January 2018.


An Efficient Machine Learning-based Elderly Fall Detection Algorithm

arXiv.org Machine Learning

Falling is a commonly occurring mishap with elderly people, which may cause serious injuries. Thus, rapid fall detection is very important in order to mitigate the severe effects of fall among the elderly people. Many fall monitoring systems based on the accelerometer have been proposed for the fall detection. However, many of them mistakenly identify the daily life activities as fall or fall as daily life activity. To this aim, an efficient machine learning-based fall detection algorithm has been proposed in this paper. The proposed algorithm detects fall with efficient sensitivity, specificity, and accuracy as compared to the state-of-the-art techniques. A publicly available dataset with a very simple and computationally efficient set of features is used to accurately detect the fall incident. The proposed algorithm reports and accuracy of 99.98% with the Support Vector Machine(SVM) classifier.


Defending Against Adversarial Machine Learning

arXiv.org Artificial Intelligence

An Adversarial System to attack and an Authorship Attribution System (AAS) to defend itself against the attacks are analyzed. Defending a system against attacks from an adversarial machine learner can be done by randomly switching between models for the system, by detecting and reacting to changes in the distribution of normal inputs, or by using other methods. Adversarial machine learning is used to identify a system that is being used to map system inputs to outputs. Three types of machine learners are using for the model that is being attacked. The machine learners that are used to model the system being attacked are a Radial Basis Function Support Vector Machine, a Linear Support Vector Machine, and a Feedforward Neural Network. The feature masks are evolved using accuracy as the fitness measure. The system defends itself against adversarial machine learning attacks by identifying inputs that do not match the probability distribution of normal inputs. The system also defends itself against adversarial attacks by randomly switching between the feature masks being used to map system inputs to outputs.


A Self-Adaptive Synthetic Over-Sampling Technique for Imbalanced Classification

arXiv.org Artificial Intelligence

Traditionally, in supervised machine learning, (a significant) part of the available data (usually 50% to 80%) is used for training and the rest for validation. In many problems, however, the data is highly imbalanced in regard to different classes or does not have good coverage of the feasible data space which, in turn, creates problems in validation and usage phase. In this paper, we propose a technique for synthesising feasible and likely data to help balance the classes as well as to boost the performance in terms of confusion matrix as well as overall. The idea, in a nutshell, is to synthesise data samples in close vicinity to the actual data samples specifically for the less represented (minority) classes. This has also implications to the so-called fairness of machine learning. In this paper, we propose a specific method for synthesising data in a way to balance the classes and boost the performance, especially of the minority classes. It is generic and can be applied to different base algorithms, e.g. support vector machine, k-nearest neighbour, deep networks, rule-based classifiers, decision trees, etc. The results demonstrated that: i) a significantly more balanced (and fair) classification results can be achieved; ii) that the overall performance as well as the performance per class measured by confusion matrix can be boosted. In addition, this approach can be very valuable for the cases when the number of actual available labelled data is small which itself is one of the problems of the contemporary machine learning.


Sparse $\ell_1$ and $\ell_2$ Center Classifiers

arXiv.org Machine Learning

The nearest-centroid classifier is a simple linear-time classifier based on computing the centroids of the data classes in the training phase, and then assigning a new datum to the class corresponding to its nearest centroid. Thanks to its very low computational cost, the nearest-centroid classifier is still widely used in machine learning, despite the development of many other more sophisticated classification methods. In this paper, we propose two sparse variants of the nearest-centroid classifier, based respectively on $\ell_1$ and $\ell_2$ distance criteria. The proposed sparse classifiers perform simultaneous classification and feature selection, by detecting the features that are most relevant for the classification purpose. We show that training of the proposed sparse models, with both distance criteria, can be performed exactly (i.e., the globally optimal set of features is selected) and at a quasi-linear computational cost. The experimental results show that the proposed methods are competitive in accuracy with state-of-the-art feature selection techniques, while having a significantly lower computational cost.


Fast Polynomial Kernel Classification for Massive Data

arXiv.org Machine Learning

In the era of big data, it is highly desired to develop efficient machine learning algorithms to tackle massive data challenges such as storage bottleneck, algorithmic scalability, and interpretability. In this paper, we develop a novel efficient classification algorithm, called fast polynomial kernel classification (FPC), to conquer the scalability and storage challenges. Our main tools are a suitable selected feature mapping based on polynomial kernels and an alternating direction method of multipliers (ADMM) algorithm for a related non-smooth convex optimization problem. Fast learning rates as well as feasibility verifications including the convergence of ADMM and the selection of center points are established to justify theoretical behaviors of FPC. Our theoretical assertions are verified by a series of simulations and real data applications. The numerical results demonstrate that FPC significantly reduces the computational burden and storage memory of the existing learning schemes such as support vector machines and boosting, without sacrificing their generalization abilities much.


An easy guide to choose the right Machine Learning algorithm for your task

#artificialintelligence

Well, there is no straightforward and sure-shot answer to this question. The answer depends on many factors like the problem statement and the kind of output you want, type and size of the data, the available computational time, number of features and observations in the data, to name a few. It is usually recommended to gather a good amount of data to get reliable predictions. However, many a time the availability of data is a constraint. So, if the training data is smaller or if the dataset has a fewer number of observations and a higher number of features like genetics or textual data, choose algorithms with high bias/low variance like Linear regression, Naïve Bayes, Linear SVM.


Use of Artificial Intelligence to Analyse Risk in Legal Documents for a Better Decision Support

arXiv.org Artificial Intelligence

Assessing risk for voluminous legal documents such as request for proposal; contracts is tedious and error prone. We have developed "risk-o-meter", a framework, based on machine learning and natural language processing to review and assess risks of any legal document. Our framework uses Paragraph Vector, an unsupervised model to generate vector representation of text. This enables the framework to learn contextual relations of legal terms and generate sensible context aware embedding. The framework then feeds the vector space into a supervised classification algorithm to predict whether a paragraph belongs to a per-defined risk category or not. The framework thus extracts risk prone paragraphs. This technique efficiently overcomes the limitations of keyword-based search. We have achieved an accuracy of 91% for the risk category having the largest training dataset. This framework will help organizations optimize effort to identify risk from large document base with minimal human intervention and thus will help to have risk mitigated sustainable growth. Its machine learning capability makes it scalable to uncover relevant information from any type of document apart from legal documents, provided the library is per-populated and rich.


Random Machines: A bagged-weighted support vector model with free kernel choice

arXiv.org Machine Learning

Improvement of statistical learning models in order to increase efficiency in solving classification or regression problems is still a goal pursued by the scientific community. In this way, the support vector machine model is one of the most successful and powerful algorithms for those tasks. However, its performance depends directly from the choice of the kernel function and their hyperparameters. The traditional choice of them, actually, can be computationally expensive to do the kernel choice and the tuning processes. In this article, it is proposed a novel framework to deal with the kernel function selection called Random Machines. The results improved accuracy and reduced computational time. The data study was performed in simulated data and over 27 real benchmarking datasets.


Imputing missing values with unsupervised random trees

arXiv.org Machine Learning

When designing statistical models from tabular data for supervised learning tasks such as regression or classification, oftentimes it happens that some of th e observations available for fitting such models are missing values in one or more variables, usually d ue to reasons such as poor data collection practices, loss of information, participants dropping out of a survey, or similar. Many methods such as [2] or [4] overcome this issue by using heuristics to handle missing information - decision tree methods in particular, due to their splitting nature that takes one variable at a time, are particularly well suited for implicit han dling of missing data without a-priori imputation ([16]), but other methods such as gene ralized linear models or support vector machines cannot handle missing values in the same wa y, and when using them on a dataset with missing entries, these entries have to either be dr opped or imputed. Typical strategies for imputing the missing entries include: replacing them with the column mean or median, determining the most similar observations (nearest neighbors) according to the non-missing variables and taking a simple or weighted average of the m issing variable(s) from them ([11]), producing a latent representation of the data by some low-rank matrix factorization that minimizes errors on the non-missing entries and from which the m issing entries are then reconstructed ([10]), and iterative imputation that starts with so me basic imputation for all values and then cycles through each variable by constructing a mod el to predict the missing values from the non-missing observations, replacing the earlier impu tation with the model prediction and repeating until convergence ([3], [18]).