Goto

Collaborating Authors

 Support Vector Machines


Travel time prediction for congested freeways with a dynamic linear model

arXiv.org Machine Learning

Accurate prediction of travel time is an essential feature to support Intelligent Transportation Systems (ITS). The non-linearity of traffic states, however, makes this prediction a challenging task. Here we propose to use dynamic linear models (DLMs) to approximate the non-linear traffic states. Unlike a static linear regression model, the DLMs assume that their parameters are changing across time. We design a DLM with model parameters defined at each time unit to describe the spatio-temporal characteristics of time-series traffic data. Based on our DLM and its model parameters analytically trained using historical data, we suggest an optimal linear predictor in the minimum mean square error (MMSE) sense. We compare our prediction accuracy of travel time for freeways in California (I210-E and I5-S) under highly congested traffic conditions with those of other methods: the instantaneous travel time, k-nearest neighbor, support vector regression, and artificial neural network. We show significant improvements in the accuracy, especially for short-term prediction.


Programming Fairness in Algorithms

#artificialintelligence

Being good is easy, what is difficult is being just. We need to defend the interests of those whom we've never met and never will. Note: This article is intended for a general audience to try and elucidate the complicated nature of unfairness in machine learning algorithms. As such, I have tried to explain concepts in an accessible way with minimal use of mathematics, in the hope that everyone can get something out of reading this. Supervised machine learning algorithms are inherently discriminatory. They are discriminatory in the sense that they use information embedded in the features of data to separate instances into distinct categories -- indeed, this is their designated purpose in life. This is reflected in the name for these algorithms which are often referred to as discriminative algorithms (splitting data into categories), in contrast to generative algorithms (generating data from a given category). When we use supervised machine learning, this "discrimination" is used as an aid to help us categorize our data into distinct categories within the data distribution, as illustrated below. Whilst this occurs when we apply discriminative algorithms -- such as support vector machines, forms of parametric regression (e.g. For example, using last week's weather data to try and predict the weather tomorrow has no moral valence attached to it.


Support Vector Machines (SVM) and its Python implementation

#artificialintelligence

The support vector machines algorithm is a supervised machine learning algorithm that can be used for both classification and regression. In this article, we will be discussing certain parameters concerning the support vector machines and try to understand this algorithm in detail. For understanding, let us consider the SVM used for classification. The following figure shows the geometrical representation of the SVM classification. After taking a look at the above diagram you might notice that the SVM classifies the data a bit differently as compared to the other algorithms.


An Intelligent CNN-VAE Text Representation Technology Based on Text Semantics for Comprehensive Big Data

arXiv.org Machine Learning

In the era of big data, a large number of text data generated by the Internet has given birth to a variety of text representation methods. In natural language processing (NLP), text representation transforms text into vectors that can be processed by computer without losing the original semantic information. However, these methods are difficult to effectively extract the semantic features among words and distinguish polysemy in language. Therefore, a text feature representation model based on convolutional neural network (CNN) and variational autoencoder (VAE) is proposed to extract the text features and apply the obtained text feature representation on the text classification tasks. CNN is used to extract the features of text vector to get the semantics among words and VAE is introduced to make the text feature space more consistent with Gaussian distribution. In addition, the output of the improved word2vec model is employed as the input of the proposed model to distinguish different meanings of the same word in different contexts. The experimental results show that the proposed model outperforms in k-nearest neighbor (KNN), random forest (RF) and support vector machine (SVM) classification algorithms.


How is Machine Learning Useful for Macroeconomic Forecasting?

arXiv.org Machine Learning

We move beyond "Is Machine Learning Useful for Macroeconomic Forecasting?" by adding the "how". The current forecasting literature has focused on matching specific variables and horizons with a particularly successful algorithm. In contrast, we study the usefulness of the underlying features driving ML gains over standard macroeconometric methods. We distinguish four so-called features (nonlinearities, regularization, cross-validation and alternative loss function) and study their behavior in both the data-rich and data-poor environments. To do so, we design experiments that allow to identify the "treatment" effects of interest. We conclude that (i) nonlinearity is the true game changer for macroeconomic prediction, (ii) the standard factor model remains the best regularization, (iii) K-fold cross-validation is the best practice and (iv) the $L_2$ is preferred to the $\bar \epsilon$-insensitive in-sample loss. The forecasting gains of nonlinear techniques are associated with high macroeconomic uncertainty, financial stress and housing bubble bursts. This suggests that Machine Learning is useful for macroeconomic forecasting by mostly capturing important nonlinearities that arise in the context of uncertainty and financial frictions.


Stochastic Adaptive Line Search for Differentially Private Optimization

arXiv.org Machine Learning

The performance of private gradient-based optimization algorithms is highly dependent on the choice of step size (or learning rate) which often requires non-trivial amount of tuning. In this paper, we introduce a stochastic variant of classic backtracking line search algorithm that satisfies R\'enyi differential privacy. Specifically, the proposed algorithm adaptively chooses the step size satsisfying the the Armijo condition (with high probability) using noisy gradients and function estimates. Furthermore, to improve the probability with which the chosen step size satisfies the condition, it adjusts per-iteration privacy budget during runtime according to the reliability of noisy gradient. A naive implementation of the backtracking search algorithm may end up using unacceptably large privacy budget as the ability of adaptive step size selection comes at the cost of extra function evaluations. The proposed algorithm avoids this problem by using the sparse vector technique combined with the recent privacy amplification lemma. We also introduce a privacy budget adaptation strategy in which the algorithm adaptively increases the budget when it detects that directions pointed by consecutive gradients are drastically different. Extensive experiments on both convex and non-convex problems show that the adaptively chosen step sizes allow the proposed algorithm to efficiently use the privacy budget and show competitive performance against existing private optimizers.


How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18 search algorithms

arXiv.org Machine Learning

SVM with an RBF kernel is usually one of the best classification algorithms for most data sets, but it is important to tune the two hyperparameters $C$ and $\gamma$ to the data itself. In general, the selection of the hyperparameters is a non-convex optimization problem and thus many algorithms have been proposed to solve it, among them: grid search, random search, Bayesian optimization, simulated annealing, particle swarm optimization, Nelder Mead, and others. There have also been proposals to decouple the selection of $\gamma$ and $C$. We empirically compare 18 of these proposed search algorithms (with different parameterizations for a total of 47 combinations) on 115 real-life binary data sets. We find (among other things) that trees of Parzen estimators and particle swarm optimization select better hyperparameters with only a slight increase in computation time with respect to a grid search with the same number of evaluations. We also find that spending too much computational effort searching the hyperparameters will not likely result in better performance for future data and that there are no significant differences among the different procedures to select the best set of hyperparameters when more than one is found by the search algorithms.


[D] Speeding Up SVM by 120X and more!

#artificialintelligence

Support Vector Machines can be a lot slow to run on large datasets. With more data, the speedup increases proportionally which is great for use. Thundersvm also runs with support vector regression and a bunch more stuff. You can check out there github repo here. I have written an article on how to install and use thundersvm.


Context-Dependent Implicit Authentication for Wearable Device User

arXiv.org Machine Learning

As market wearables are becoming popular with a range of services, including making financial transactions, accessing cars, etc. that they provide based on various private information of a user, security of this information is becoming very important. However, users are often flooded with PINs and passwords in this internet of things (IoT) world. Additionally, hard-biometric, such as facial or finger recognition, based authentications are not adaptable for market wearables due to their limited sensing and computation capabilities. Therefore, it is a time demand to develop a burden-free implicit authentication mechanism for wearables using the less-informative soft-biometric data that are easily obtainable from the market wearables. In this work, we present a context-dependent soft-biometric-based wearable authentication system utilizing the heart rate, gait, and breathing audio signals. From our detailed analysis, we find that a binary support vector machine (SVM) with radial basis function (RBF) kernel can achieve an average accuracy of $0.94 \pm 0.07$, $F_1$ score of $0.93 \pm 0.08$, an equal error rate (EER) of about $0.06$ at a lower confidence threshold of 0.52, which shows the promise of this work.


SVMs in One Picture

#artificialintelligence

SVMs (Support Vector Machines) are a way to classify data by finding the optimal plane or hyperplane that separates the data. In 2D, the separation is a plane; In higher dimensions, it's a hyperplane. For simplicity, the following picture shows how SVM works for a two-dimensional set.