 Support Vector Machines


Data Scientist (Remote)

#artificialintelligence

Finite Markov Decision Processes, Support Vector Machines, Q-Learning, Stochastic Finite State Machines, MCTS, or other hybrid Deep Reinforcement Learning processes. W2 Benefits. Not only do you get to join our team of awesome playful ninjas, we also have great benefits: six weeks of paid time off per year (PTO Holidays).


Constrained Classification and Policy Learning

arXiv.org Machine Learning

Modern machine learning approaches to classification, including AdaBoost, support vector machines, and deep neural networks, utilize surrogate loss techniques to circumvent the computational complexity of minimizing empirical classification risk. These techniques are also useful for causal policy learning problems, since estimation of individualized treatment rules can be cast as a weighted (cost-sensitive) classification problem. Consistency of the surrogate loss approaches studied in Zhang (2004) and Bartlett et al. (2006) crucially relies on the assumption of correct specification, meaning that the specified set of classifiers is rich enough to contain a first-best classifier. This assumption is, however, less credible when the set of classifiers is constrained by interpretability or fairness, leaving the applicability of surrogate loss based algorithms unknown in such second-best scenarios. This paper studies consistency of surrogate loss procedures under a constrained set of classifiers without assuming correct specification. We show that in the setting where the constraint restricts the classifier's prediction set only, hinge losses (i.e., $\ell_1$-support vector machines) are the only surrogate losses that preserve consistency in second-best scenarios. If the constraint additionally restricts the functional form of the classifier, consistency of a surrogate loss approach is not guaranteed even with hinge loss. We therefore characterize conditions for the constrained set of classifiers that can guarantee consistency of hinge risk minimizing classifiers. Exploiting our theoretical results, we develop robust and computationally attractive hinge loss based procedures for a monotone classification problem.
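
As a rough illustration of what "surrogate loss" means in the abstract above, the sketch below compares the 0-1 classification loss with the hinge loss on a tiny sample. The labels, scores, and function names are made up for the example and do not come from the paper; the point is only that the hinge loss is a convex upper bound on the 0-1 loss, which is why minimizing it is computationally easier.

```python
# Illustrative comparison of the 0-1 classification loss and its hinge surrogate.
# Labels are in {-1, +1}; `score` is a hypothetical real-valued classifier output.

def zero_one_loss(label: int, score: float) -> float:
    """1.0 when the sign of the score disagrees with the label (ties count as errors)."""
    return 1.0 if label * score <= 0 else 0.0


def hinge_loss(label: int, score: float) -> float:
    """Convex surrogate: zero only when the margin label * score is at least 1."""
    return max(0.0, 1.0 - label * score)


def empirical_risk(loss, labels, scores):
    """Average loss over a sample."""
    return sum(loss(y, s) for y, s in zip(labels, scores)) / len(labels)


labels = [+1, -1, +1, -1]          # made-up sample labels
scores = [2.0, -0.5, -0.3, 0.4]    # made-up classifier outputs

risk_01 = empirical_risk(zero_one_loss, labels, scores)
risk_hinge = empirical_risk(hinge_loss, labels, scores)
print(risk_01, risk_hinge)
```

On every example the hinge loss is at least as large as the 0-1 loss, so the hinge risk bounds the classification risk from above. Whether minimizing the former also minimizes the latter over a constrained classifier set is the consistency question the paper studies.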


Multi-Class Classification of Blood Cells -- End to End Computer Vision based diagnosis case study

arXiv.org Machine Learning

The diagnosis of blood-based diseases often involves identifying and characterizing patient blood samples. Automated methods to detect and classify blood cell subtypes have important medical applications. Automated medical image processing and analysis offers a powerful tool for medical diagnosis. In this work we tackle the problem of white blood cell classification based on the morphological characteristics of their outer contour and color. We explore a set of preprocessing and segmentation algorithms (color-based segmentation, morphological processing, contouring), feature extraction methods (corner detection algorithms and Histogram of Gradients (HOG)), and dimensionality reduction algorithms (Principal Component Analysis (PCA)), which feed various Unsupervised (k-nearest neighbors) and Supervised (Support Vector Machine, Decision Trees, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naïve Bayes) algorithms that classify white blood cells into four categories: Eosinophil, Lymphocyte, Monocyte, and Neutrophil. We go a step further and explore various deep convolutional neural network architectures (SqueezeNet, MobileNetV1, MobileNetV2, InceptionNet, etc.) both with and without preprocessing/segmentation. We explore many algorithms in order to identify a robust algorithm with the least time complexity and low resource requirements. The outcome of this work can guide the selection of algorithms, according to requirements, for automated blood cell classification.


Prediction of repurposed drugs for Coronaviruses using artificial intelligence and machine learning - PubMed

#artificialintelligence

The world is facing the COVID-19 pandemic caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Likewise, other viruses of the Coronaviridae family were responsible for causing epidemics earlier. There is a lack of approved antiviral drugs to tackle these viruses. Therefore, we have developed robust computational methods to predict repurposed drugs using machine learning techniques, namely Support Vector Machine, Random Forest, k-Nearest Neighbour, Artificial Neural Network, and Deep Learning. We used the experimentally validated drugs/chemicals with anticorona activity (IC50/EC50) from the 'DrugRepV' repository.


Robust Regression Revisited: Acceleration and Improved Estimation Rates

arXiv.org Machine Learning

We study fast algorithms for statistical regression problems under the strong contamination model, where the goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples. Prior works in this line of research were based on the robust gradient descent framework of Prasad et al., a first-order method using biased gradient queries, or the Sever framework of Diakonikolas et al., an iterative outlier-removal method calling a stationary point finder. We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees compared to the state-of-the-art. For the general case of smooth GLMs (e.g. logistic regression), we show that the robust gradient descent framework of Prasad et al. can be accelerated, and show our algorithm extends to optimizing the Moreau envelopes of Lipschitz GLMs (e.g. support vector machines), answering several open questions in the literature. For the well-studied case of robust linear regression, we present an alternative approach obtaining improved estimation rates over prior nearly-linear time algorithms. Interestingly, our method starts with an identifiability proof introduced in the context of the sum-of-squares algorithm of Bakshi and Prasad, which achieved optimal error rates while requiring large polynomial runtime and sample complexity. We reinterpret their proof within the Sever framework and obtain a dramatically faster and more sample-efficient algorithm under fewer distributional assumptions.


Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

arXiv.org Machine Learning

The growing literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification settings; however, most success stories of modern machine learning have been recorded in multiclass settings. Motivated by this discrepancy, we study benign overfitting in multiclass linear classification. Specifically, we consider the following popular training algorithms on separable data: (i) empirical risk minimization (ERM) with cross-entropy loss, which converges to the multiclass support vector machine (SVM) solution; (ii) ERM with least-squares loss, which converges to the min-norm interpolating (MNI) solution; and, (iii) the one-vs-all SVM classifier. First, we provide a simple sufficient condition under which all three algorithms lead to classifiers that interpolate the training data and have equal accuracy. When the data is generated from Gaussian mixtures or a multinomial logistic model, this condition holds under high enough effective overparameterization. Second, we derive novel error bounds on the accuracy of the MNI classifier, thereby showing that all three training algorithms lead to benign overfitting under sufficient overparameterization. Ultimately, our analysis shows that good generalization is possible for SVM solutions beyond the realm in which typical margin-based bounds apply.


Support Vector Machine: Introduction - Analytics Vidhya

#artificialintelligence

In this article, we will be discussing Support Vector Machines. Before we proceed, I hope you already have some prior knowledge of Linear Regression and Logistic Regression. If you want to learn Logistic Regression, you can click here. You can also check its implementation here. By the end of this article, you will know the basics of the Support Vector Machine.


Support Vector Machines in Python

#artificialintelligence

Please consider watching this video if any section of this article is unclear. Instructions for setting up your programming environment can be found at the start of Episode 4.3. We can now use the support vector machine to classify apples and oranges given the fruit's weight and size. For example, let's say we recorded a fruit with a weight of 70 grams and a size of 4.6 cm. We obtain a prediction that this fruit is an orange. Looking at the scatterplot above, we note that the recording of 70 grams and 4.6 cm lies below the hyperplane, hence an orange is predicted.
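
The "below the hyperplane" reasoning can be sketched in a few lines. This is only an illustration: the coefficients below are invented for the example, not the parameters the article's model learned, and it assumes size is plotted on the vertical axis. With two features the hyperplane is just a line, and the prediction is the sign of the score w . x + b.

```python
# Hypothetical separating line in the (weight, size) plane. These numbers are
# illustrative assumptions, not the parameters learned in the article.
W_WEIGHT = -0.05   # coefficient for weight in grams
W_SIZE = 1.0       # coefficient for size in centimetres
BIAS = -1.5


def decision_value(weight_g: float, size_cm: float) -> float:
    """Signed score w . x + b; its sign says which side of the line x is on."""
    return W_WEIGHT * weight_g + W_SIZE * size_cm + BIAS


def predict(weight_g: float, size_cm: float) -> str:
    """Points below the line (negative score) are labelled orange, the rest apple."""
    return "orange" if decision_value(weight_g, size_cm) < 0 else "apple"


print(predict(70, 4.6))
```

With these made-up coefficients the line sits at size = 0.05 * weight + 1.5, which is 5.0 cm for a 70 gram fruit, so a 4.6 cm fruit falls below it and is labelled an orange. A trained SVM produces its own w and b, but the prediction step has this same form.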


EMG Signal Classification Using Reflection Coefficients and Extreme Value Machine

arXiv.org Artificial Intelligence

Electromyography (EMG) is a promising approach to human gesture recognition if an efficient classifier with high accuracy is available. In this paper, we propose to utilize the Extreme Value Machine (EVM) as a high-performance algorithm for the classification of EMG signals. We employ reflection coefficients obtained from an Autoregressive (AR) model to train a set of classifiers. Our experimental results indicate that EVM has better accuracy than the conventional classifiers used in the literature, which are based on K-Nearest Neighbors (KNN) and Support Vector Machine (SVM).
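
The abstract does not say how the reflection coefficients are computed. One standard route, assumed here purely for illustration, is the Levinson-Durbin recursion applied to the sample autocorrelation of a signal window. The function names, the sign convention, and the toy window below are all hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of a 1-D sequence."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion on autocorrelations r[0..order].

    Returns k_1..k_order for the AR model
    x[t] + a_1 x[t-1] + ... + a_p x[t-p] = e[t] (sign convention assumed here).
    """
    a = [1.0]       # current prediction-error filter coefficients
    error = r[0]    # current prediction-error power (must be non-zero)
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / error
        # Order update: a_new[i] = a[i] + k * a[m - i], and a_new[m] = k.
        a = [1.0] + [a[i] + k * a[m - i] for i in range(1, m)] + [k]
        error *= (1.0 - k * k)
        ks.append(k)
    return ks


# Toy stand-in for one EMG window (not real data).
window = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = reflection_coefficients(autocorrelation(window, 2), 2)
print(features)
```

The resulting list would serve as the feature vector for one window, which is then passed to whichever classifier is being compared (EVM, KNN, or SVM).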


Support Vector Machine with Kernel

#artificialintelligence

Step by step: an example using the banking.csv data set. In machine learning there are two types of algorithms. The SVM is a very powerful and flexible supervised machine learning algorithm. Today, with the growth of machine learning applications, support vector machines are used in a large number of applications. The major advantage of SVMs in these applications is that they can handle both "classification" and "regression" on both "linear" and "non-linear" data.
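
What lets an SVM handle "non-linear" data is the kernel function, which replaces the plain dot product between two samples. The snippet does not say which kernel the article uses, so the sketch below simply contrasts a linear kernel with the commonly used RBF (Gaussian) kernel. The function names and the `gamma` default are assumptions made for illustration.

```python
import math


def linear_kernel(x, z):
    """Plain dot product: yields a linear decision boundary."""
    return sum(xi * zi for xi, zi in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - z||^2): similarity decays with distance."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


a, b = [1.0, 2.0], [3.0, 4.0]   # two made-up samples
print(linear_kernel(a, b))
print(rbf_kernel(a, a), rbf_kernel(a, b))
```

The RBF value is 1 for identical points and shrinks toward 0 as the points move apart. An SVM that uses it can therefore draw curved decision boundaries in the original feature space.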