 Support Vector Machines


Data Science Explains Why Every Hit Pop Song Sounds the Same

#artificialintelligence

There's a Nirvana song that you may not have heard that, ironically, describes why you have heard another Nirvana song, "Smells Like Teen Spirit," which dominated the airwaves in the early '90s and still endures today. It's called "Verse Chorus Verse" and it follows the song structure it's named for, which most pop songs, including "Teen Spirit" and recent smashes like "Old Town Road," rely on. The only weird thing, though, is that the song is about frontman Kurt Cobain's chronic stomach pain and the medications he illegally took. That title is a play on a common dig at pop songs--all of them sound the same. Now, two student researchers at the University of San Francisco have leveraged Spotify data to figure out if that's really true.


R Neural Network

#artificialintelligence

In the previous four posts I used multiple linear regression, decision trees, random forest, gradient boosting, and support vector machine (SVM) to predict MPG for 2019 vehicles. The SVM produced the best model. In this post I am going to use the neuralnet package to fit a neural network to the cars_19 dataset. The raw data is located on the EPA government site. Similar to the other models, the variables/features I am using are: Engine displacement (size), number of cylinders, transmission type, number of gears, air inspired method, regenerative braking type, battery capacity Ah, drivetrain, fuel type, cylinder deactivate, and variable valve.


Machine Learning Algorithms Utilizing Functional Respiratory Imaging May Predict COPD Exacerbations

#artificialintelligence

A total of 11 baseline FRI parameters could significantly distinguish (p < 0.05) the development of AECOPD from a stable period. In contrast, no baseline clinical or pulmonary function test parameters allowed significant classification. Furthermore, using Support Vector Machines, an accuracy of 80.65% and positive predictive value of 82.35% could be obtained by combining baseline FRI features such as total specific image-based airway volume and total specific image-based airway resistance, measured at functional residual capacity. Patients who developed an AECOPD showed significantly smaller airway volumes and (hence) significantly higher airway resistances at baseline.
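For readers unfamiliar with the two figures quoted above, the following is a minimal sketch of how accuracy and positive predictive value are computed from a confusion matrix. The function names and the counts are hypothetical and chosen for illustration; they are not the study's data.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def positive_predictive_value(tp, fp):
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


# Hypothetical confusion-matrix counts (NOT taken from the study):
# tp = exacerbations correctly predicted, fp = stable patients flagged,
# tn = stable patients correctly predicted, fn = exacerbations missed.
tp, fp, tn, fn = 8, 2, 7, 3

print(f"accuracy = {accuracy(tp, fp, tn, fn):.2%}")
print(f"PPV      = {positive_predictive_value(tp, fp):.2%}")
```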


Prediction of Overall Survival of Brain Tumor Patients

arXiv.org Machine Learning

Automated brain tumor segmentation plays an important role in the diagnosis and prognosis of the patient. The main focus of this paper is to segment tumors from the BraTS 2018 benchmark dataset and use age, shape and volumetric features to predict overall survival of patients. The random forest classifier achieves overall survival accuracy of 59% on the test dataset and 67% on the dataset with resection status as gross total resection. The proposed approach uses fewer features but achieves better accuracy than state-of-the-art methods. The medical fraternity considers brain tumors among the most fatal types of cancer [1]. Brain tumors are divided into two categories based on origin and malignancy. The former is further classified as primary and secondary.


Student Performance Prediction with Optimum Multilabel Ensemble Model

arXiv.org Machine Learning

One of the important measures of quality of education is the performance of students in academic settings. Nowadays, educational institutions store abundant data about students, which can help to discover insights into how students are learning and how to improve their performance ahead of time using data mining techniques. In this paper, we developed a student performance prediction model that predicts the performance of high school students for the next semester for five courses. We modeled our prediction system as a multi-label classification task and used support vector machine (SVM), Random Forest (RF), K-nearest Neighbors (KNN), and Multi-layer perceptron (MLP) as base-classifiers to train our model. We further improved the performance of the prediction model using state-of-the-art partitioning schemes to divide the label space into smaller spaces and used the Label Powerset (LP) transformation method to transform each labelset into a multi-class classification task. The proposed model achieved better performance in terms of different evaluation metrics when compared to other multi-label learning methods such as binary relevance and classifier chains.
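The Label Powerset transformation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names and the three-course toy label matrix are assumptions, and the label-space partitioning step is omitted.

```python
def label_powerset_fit(label_rows):
    """Assign one class id to each distinct label combination seen in training."""
    classes = {}
    for row in label_rows:
        key = tuple(row)
        if key not in classes:
            classes[key] = len(classes)
    return classes


def label_powerset_transform(label_rows, classes):
    """Turn each multi-label row into a single multi-class target."""
    return [classes[tuple(row)] for row in label_rows]


def label_powerset_inverse(class_ids, classes):
    """Map predicted class ids back to their label combinations."""
    id_to_labels = {cid: list(key) for key, cid in classes.items()}
    return [id_to_labels[cid] for cid in class_ids]


# Hypothetical labels: one row per student, one column per course (1 = pass).
Y = [[1, 0, 1], [0, 0, 1], [1, 0, 1], [1, 1, 1]]
classes = label_powerset_fit(Y)
y_multiclass = label_powerset_transform(Y, classes)
print(y_multiclass)  # one class id per student
```

Any ordinary multi-class classifier can then be trained on `y_multiclass`, and its predictions mapped back with `label_powerset_inverse`.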


Equalizing Recourse across Groups

arXiv.org Artificial Intelligence

The rise in machine learning-assisted decision-making has led to concerns about the fairness of the decisions and techniques to mitigate problems of discrimination. If a negative decision is made about an individual (denying a loan, rejecting an application for housing, and so on) justice dictates that we be able to ask how we might change circumstances to get a favorable decision the next time. Moreover, the ability to change circumstances (a better education, improved credentials) should not be limited to only those with access to expensive resources. In other words, recourse for negative decisions should be considered a desirable value that can be equalized across (demographically defined) groups. This paper describes how to build models that make accurate predictions while still ensuring that the penalties for a negative outcome do not disadvantage different groups disproportionately. We measure recourse as the distance of an individual from the decision boundary of a classifier. We then introduce a regularized objective to minimize the difference in recourse across groups. We explore linear settings and further extend recourse to non-linear settings as well as model-agnostic settings where the exact distance from boundary cannot be calculated. Our results show that we can successfully decrease the unfairness in recourse while maintaining classifier performance.
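For the linear setting, the recourse measure described above can be illustrated with a short sketch. It assumes a linear classifier with hypothetical weights `w` and bias `b`, two groups labeled 0 and 1, and at least one negatively classified point per group; the paper's exact definition and its regularized training objective may differ and are not reproduced here.

```python
import math


def score(w, b, x):
    """Signed linear score; negative means a negative decision."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def recourse(w, b, x):
    """Euclidean distance from x to the decision boundary w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score(w, b, x)) / norm


def mean_recourse_gap(w, b, points, groups):
    """Absolute difference in mean recourse between groups 0 and 1,
    computed over negatively classified points only."""
    per_group = {0: [], 1: []}
    for x, g in zip(points, groups):
        if score(w, b, x) < 0:
            per_group[g].append(recourse(w, b, x))
    means = [sum(v) / len(v) for v in (per_group[0], per_group[1])]
    return abs(means[0] - means[1])


# Hypothetical classifier and data, for illustration only.
w, b = [3.0, 4.0], -5.0
points = [[0.0, 0.0], [1.0, 0.0], [-1.0, -1.0], [2.0, 2.0]]
groups = [0, 0, 1, 1]
gap = mean_recourse_gap(w, b, points, groups)
print(f"recourse gap between groups: {gap:.2f}")
```

A gap of this kind is the quantity a fairness regularizer would try to shrink during training.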


Are Bitcoins price predictable? Evidence from machine learning techniques using technical indicators

arXiv.org Machine Learning

The uncertainties in future Bitcoin price make it difficult to accurately predict the price of Bitcoin. Accurately predicting the price of Bitcoin is therefore important for the decision-making process of investors and market players in the cryptocurrency market. Using historical data from 01/01/2012 to 16/08/2019, machine learning techniques (Generalized linear model via penalized maximum likelihood, random forest, support vector regression with linear kernel, and stacking ensemble) were used to forecast the price of Bitcoin. The prediction models employed key and high dimensional technical indicators as the predictors. The performance of these techniques was evaluated using mean absolute percentage error (MAPE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R-squared). The performance metrics revealed that the stacking ensemble model with two base learners (random forest and generalized linear model via penalized maximum likelihood) and support vector regression with linear kernel as meta-learner was the optimal model for forecasting Bitcoin price. The MAPE, RMSE, MAE, and R-squared values for the stacking ensemble model were 0.0191%, 15.5331 USD, 124.5508 USD, and 0.9967 respectively. These values show a high degree of reliability in predicting the price of Bitcoin using the stacking ensemble model. Accurately predicting the future price of Bitcoin will yield significant returns for investors and market players in the cryptocurrency market.
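The four evaluation metrics named above follow standard definitions, sketched below. The function name and the price series are hypothetical and serve only to illustrate the formulas; they are not the study's data or code.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAPE (in percent), RMSE, MAE and R-squared for two equal-length series."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE assumes no actual value is zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return {"MAPE": mape, "RMSE": rmse, "MAE": mae, "R2": r2}


# Hypothetical prices in USD, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 300.0, 420.0]
metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions RMSE is never smaller than MAE on the same data, since squaring weights large errors more heavily.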


Data Science News This Week (2019-08-31)

#artificialintelligence

While machine learning today is dominated by deep neural network research, in the 1990s neural approaches were not recognized as reliable for real-world applications. Back then, researchers put their efforts into kernel methods and support vector machines (SVM). This spreadsheet contains the ultimate list of open datasets for machine learning. Organized by industry and use case, this database contains a diverse range of 300 datasets to train machine learning models. In an exclusive interview, David Wood, Futurist, Chair of London Futurists, and Peter Jackson, Software Engineer member of London Futurists, share with us how Artificial Intelligence is going to impact the future of engineers' jobs and how to prepare for it.


Humans Don't Realize How Biased They Are Until AI Reproduces the Same Bias, Says UNESCO AI Chair

#artificialintelligence

While machine learning today is dominated by deep neural network research, in the 1990s neural approaches were not recognized as reliable for real-world applications. Back then, researchers put their efforts into kernel methods and support vector machines (SVM). One of the most notable and respected contributors to kernel methods and SVM is John Shawe-Taylor, a professor at University College London (UK) and Director of the Centre for Computational Statistics and Machine Learning (CSML). His main research area is Statistical Learning Theory, but his contributions range from neural networks to machine learning and graph theory. Shawe-Taylor has published over 300 papers with over 42000 citations.

