Goto

Collaborating Authors

 Support Vector Machines


Shot-frugal and Robust quantum kernel classifiers

arXiv.org Artificial Intelligence

Quantum kernel methods are a candidate for quantum speed-ups in supervised machine learning. The number of quantum measurements N required for a reasonable kernel estimate is a critical resource, both from complexity considerations and because of the constraints of near-term quantum hardware. We emphasize that for classification tasks, the aim is reliable classification and not precise kernel evaluation, and demonstrate that the former is far more resource efficient. Furthermore, it is shown that the accuracy of classification is not a suitable performance metric in the presence of noise and we motivate a new metric that characterizes the reliability of classification. We then obtain a bound for N which ensures, with high probability, that classification errors over a dataset are bounded by the margin errors of an idealized quantum kernel classifier. Using chance constraint programming and the subgaussian bounds of quantum kernel distributions, we derive several Shot-frugal and Robust (ShofaR) programs starting from the primal formulation of the Support Vector Machine. This significantly reduces the number of quantum measurements needed and is robust to noise by construction. Our strategy is applicable to uncertainty in quantum kernels arising from any source of unbiased noise.


Quantifying intra-tumoral genetic heterogeneity of glioblastoma toward precision medicine using MRI and a data-inclusive machine learning algorithm

arXiv.org Artificial Intelligence

Glioblastoma (GBM) is one of the most aggressive and lethal human cancers. Intra-tumoral genetic heterogeneity poses a significant challenge for treatment. Biopsy is invasive, which motivates the development of non-invasive, MRI-based machine learning (ML) models to quantify intra-tumoral genetic heterogeneity for each patient. This capability holds great promise for enabling better therapeutic selection to improve patient outcomes. We proposed a novel Weakly Supervised Ordinal Support Vector Machine (WSO-SVM) to predict regional genetic alteration status within each GBM tumor using MRI. WSO-SVM was applied to a unique dataset of 318 image-localized biopsies with spatially matched multiparametric MRI from 74 GBM patients. The model was trained to predict the regional genetic alteration of three GBM driver genes (EGFR, PDGFRA, and PTEN) based on features extracted from the corresponding region of five MRI contrast images. For comparison, a variety of existing ML algorithms were also applied. The classification accuracy of each gene was compared between the different algorithms. The SHapley Additive exPlanations (SHAP) method was further applied to compute contribution scores of different contrast images. Finally, the trained WSO-SVM was used to generate prediction maps within the tumoral area of each patient to help visualize the intra-tumoral genetic heterogeneity. This study demonstrated the feasibility of using MRI and WSO-SVM to enable non-invasive prediction of intra-tumoral regional genetic alteration for each GBM patient, which can inform future adaptive therapies for individualized oncology.


Voting-based Multimodal Automatic Deception Detection

arXiv.org Artificial Intelligence

Automatic Deception Detection has been a hot research topic for a long time, using machine learning and deep learning to automatically detect deception, brings new light to this old field. In this paper, we proposed a voting-based method for automatic deception detection from videos using audio, visual and lexical features. Experiments were done on two datasets, the Real-life trial dataset by Michigan University and the Miami University deception detection dataset. Video samples were split into frames of images, audio, and manuscripts. Our Voting-based Multimodal proposed solution consists of three models. The first model is CNN for detecting deception from images, the second model is Support Vector Machine (SVM) on Mel spectrograms for detecting deception from audio and the third model is Word2Vec on Support Vector Machine (SVM) for detecting deception from manuscripts. Our proposed solution outperforms state of the art. Best results achieved on images, audio and text were 97%, 96%, 92% respectively on Real-Life Trial Dataset, and 97%, 82%, 73% on video, audio and text respectively on Miami University Deception Detection.


Nowcasting Madagascar's real GDP using machine learning algorithms

arXiv.org Artificial Intelligence

We investigate the predictive power of different machine learning algorithms to nowcast Madagascar's gross domestic product (GDP). We trained popular regression models, including linear regularized regression (Ridge, Lasso, Elastic-net), dimensionality reduction model (principal component regression), k-nearest neighbors algorithm (k-NN regression), support vector regression (linear SVR), and tree-based ensemble models (Random forest and XGBoost regressions), on 10 Malagasy quarterly macroeconomic leading indicators over the period 2007Q1--2022Q4, and we used simple econometric models as a benchmark. We measured the nowcast accuracy of each model by calculating the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Our findings reveal that the Ensemble Model, formed by aggregating individual predictions, consistently outperforms traditional econometric models. We conclude that machine learning models can deliver more accurate and timely nowcasts of Malagasy economic performance and provide policymakers with additional guidance for data-driven decision making.


Data Classification With Multiprocessing

arXiv.org Artificial Intelligence

Classification is one of the most important tasks in Machine Learning (ML) and with recent advancements in artificial intelligence (AI) it is important to find efficient ways to implement it. Generally, the choice of classification algorithm depends on the data it is dealing with, and accuracy of the algorithm depends on the hyperparameters it is tuned with. One way is to check the accuracy of the algorithms by executing it with different hyperparameters serially and then selecting the parameters that give the highest accuracy to predict the final output. This paper proposes another way where the algorithm is parallelly trained with different hyperparameters to reduce the execution time. In the end, results from all the trained variations of the algorithms are ensembled to exploit the parallelism and improve the accuracy of prediction. Python multiprocessing is used to test this hypothesis with different classification algorithms such as K-Nearest Neighbors (KNN), Support Vector Machines (SVM), random forest and decision tree and reviews factors affecting parallelism. Ensembled output considers the predictions from all processes and final class is the one predicted by maximum number of processes. Doing this increases the reliability of predictions. We conclude that ensembling improves accuracy and multiprocessing reduces execution time for selected algorithms.


Multi-Objective Latent Space Optimization of Generative Molecular Design Models

arXiv.org Artificial Intelligence

Molecular design based on generative models, such as variational autoencoders (VAEs), has become increasingly popular in recent years due to its efficiency for exploring high-dimensional molecular space to identify molecules with desired properties. While the efficacy of the initial model strongly depends on the training data, the sampling efficiency of the model for suggesting novel molecules with enhanced properties can be further enhanced via latent space optimization. In this paper, we propose a multi-objective latent space optimization (LSO) method that can significantly enhance the performance of generative molecular design (GMD). The proposed method adopts an iterative weighted retraining approach, where the respective weights of the molecules in the training data are determined by their Pareto efficiency. We demonstrate that our multi-objective GMD LSO method can significantly improve the performance of GMD for jointly optimizing multiple molecular properties.


On support vector machines under a multiple-cost scenario

arXiv.org Machine Learning

Support Vector Machine (SVM) is a powerful tool in binary classification, known to attain excellent misclassification rates. On the other hand, many realworld classification problems, such as those found in medical diagnosis, churn or fraud prediction, involve misclassification costs which may be different in the different classes. However, it may be hard for the user to provide precise values for such misclassification costs, whereas it may be much easier to identify acceptable misclassification rates values. In this paper we propose a novel SVM model in which misclassification costs are considered by incorporating performance constraints in the problem formulation. Specifically, our aim is to seek the hyperplane with maximal margin yielding misclassification rates below given threshold values. Such maximal margin hyperplane is obtained by solving a quadratic convex problem with linear constraints and integer variables. The reported numerical experience shows that our model gives the user control on the misclassification rates in one class (possibly at the expense of an increase in misclassification rates for the other class) and is feasible in terms of running times.


Learning Human-like Representations to Enable Learning Human Values

arXiv.org Artificial Intelligence

How can we build AI systems that are aligned with human values and objectives in order to avoid causing harm or violating societal standards for acceptable behavior? Making AI systems learn human-like representations of the world has many known benefits, including improving generalization, robustness to domain shifts, and few-shot learning performance, among others. We propose that this kind of representational alignment between machine learning (ML) models and humans is also a necessary condition for value alignment, where ML systems conform to human values and societal norms. We focus on ethics as one aspect of value alignment and train multiple ML agents (support vector regression and kernel regression) in a multi-armed bandit setting, where rewards are sampled from a distribution that reflects the morality of the chosen action. We then study the relationship between each agent's degree of representational alignment with humans and their performance when learning to take the most ethical actions.


Sign Language Conversation Interpretation Using Wearable Sensors and Machine Learning

arXiv.org Artificial Intelligence

The count of people suffering from various levels of hearing loss reached 1.57 billion in 2019. This huge number tends to suffer on many personal and professional levels and strictly needs to be included with the rest of society healthily. This paper presents a proof of concept of an automatic sign language recognition system based on data obtained using a wearable device of 3 flex sensors. The system is designed to interpret a selected set of American Sign Language (ASL) dynamic words by collecting data in sequences of the performed signs and using machine learning methods. The built models achieved high-quality performances, such as Random Forest with 99% accuracy, Support Vector Machine (SVM) with 99%, and two K-Nearest Neighbor (KNN) models with 98%. This indicates many possible paths toward the development of a full-scale system.


The Validity of a Machine Learning-Based Video Game in the Objective Screening of Attention Deficit Hyperactivity Disorder in Children Aged 5 to 12 Years

arXiv.org Artificial Intelligence

This research was conducted with financial support from the Javaneh Program of the Ministry of Science, Research, and Technology of the Islamic Republic of Iran, and the Cognitive Sciences and Technologies Council of the Islamic Republic of Iran. Correspondence concerning this article should be addressed to Hadi Moradi, Department of Robotics and Artificial Intelligence, University of Tehran, Tehran, Iran. Abstract Objective: Early identification of ADHD is necessary to provide the opportunity for timely treatment. However, screening the symptoms of ADHD on a large scale is not easy. This study aimed to validate a video game (FishFinder) for the screening of ADHD using objective measurement of the core symptoms of this disorder. Method: The FishFinder measures attention and impulsivity through in-game performance and evaluates the child's hyperactivity using smartphone motion sensors. This game was tested on 26 children with ADHD and 26 healthy children aged 5 to 12 years. A Support Vector Machine was employed to detect children with ADHD. Conclusions: The FishFinder demonstrated a strong ability to identify ADHD in children. So, this game can be used as an affordable, accessible, and enjoyable method for the objective screening of ADHD. The Validity of a Machine Learning-Based Video Game in the Objective Screening of Attention Deficit Hyperactivity Disorder in Children Aged 5 to 12 Years Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common childhood disorders with a prevalence of about 7.2% (Thomas et al., 2015).