Support Vector Machines
A model-free approach to fingertip slip and disturbance detection for grasp stability inference
Kitouni, Dounia, Khoramshahi, Mahdi, Perdereau, Veronique
Robotic capacities in object manipulation are incomparable to those of humans. Besides years of learning, humans rely heavily on the richness of information from physical interaction with the environment. In particular, tactile sensing is crucial in providing such rich feedback. Despite its potential contributions to robotic manipulation, tactile sensing remains underexploited, mainly due to the complexity of the time series provided by tactile sensors. In this work, we propose a method for assessing grasp stability using tactile sensing. More specifically, we propose a methodology to extract task-relevant features and design efficient classifiers to detect object slippage with respect to individual fingertips. We compare two classification models: support vector machine and logistic regression. We use highly sensitive Uskin tactile sensors mounted on an Allegro hand to test and validate our method. Our results demonstrate that the proposed method is effective for online slippage detection.
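As a rough illustration of the comparison described above, the sketch below trains two linear classifiers on a toy dataset, one with hinge loss (a linear SVM) and one with logistic loss. The two features, the data values, and the names `train_linear` and `predict` are hypothetical; the abstract does not specify the actual features or training setup.

```python
import math

def train_linear(X, y, loss="hinge", lr=0.1, epochs=200, lam=0.01):
    """Train w, b by per-sample (sub)gradient descent; labels must be in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if loss == "hinge":
                # Subgradient of max(0, 1 - margin) with respect to the score
                g = -yi if margin < 1 else 0.0
            else:
                # Gradient of log(1 + exp(-margin)) with respect to the score
                g = -yi / (1.0 + math.exp(margin))
            # L2 regularization (lam) is applied to the weights only
            w = [wj - lr * (g * xj + lam * wj) for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical features per time window: (shear-force change, vibration energy)
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.9, 0.8), (0.8, 1.0), (1.0, 0.9)]
y = [-1, -1, -1, 1, 1, 1]  # -1 = stable, +1 = slip

for name in ("hinge", "logistic"):
    w, b = train_linear(X, y, loss=name)
    acc = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
    print(name, "training accuracy:", acc)
```

Both losses yield a linear decision boundary; they differ in how strongly points far from the boundary influence the fit, which is one reason the two models can behave differently on noisy tactile features.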
Quantum-Enhanced Support Vector Machine for Large-Scale Stellar Classification with GPU Acceleration
Chen, Kuan-Cheng, Xu, Xiaotian, Makhanov, Henry, Chung, Hui-Hsuan, Liu, Chen-Yu
In this study, we introduce an innovative Quantum-enhanced Support Vector Machine (QSVM) approach for stellar classification, leveraging the power of quantum computing and GPU acceleration. Our QSVM algorithm significantly surpasses traditional methods such as K-Nearest Neighbors (KNN) and Logistic Regression (LR), particularly in handling complex binary and multi-class scenarios within the Harvard stellar classification system. The integration of quantum principles notably enhances classification accuracy, while GPU acceleration using the cuQuantum SDK ensures computational efficiency and scalability for large datasets in quantum simulators. This synergy not only accelerates processing but also improves the accuracy of classifying diverse stellar types, setting a new benchmark in astronomical data analysis. Our findings underscore the transformative potential of quantum machine learning in astronomical research, marking a significant leap forward in both precision and processing speed for stellar classification. This advancement has broader implications for astrophysical and related scientific fields.
Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks
Rose, Evan, Suya, Fnu, Evans, David
Machine learning is susceptible to poisoning attacks, in which an attacker controls a small fraction of the training data and chooses that data with the goal of inducing some behavior unintended by the model developer in the trained model. We consider a realistic setting in which the adversary with the ability to insert a limited number of data points attempts to control the model's behavior on a specific subpopulation. Inspired by previous observations on disparate effectiveness of random label-flipping attacks on different subpopulations, we investigate the properties that can impact the effectiveness of state-of-the-art poisoning attacks against different subpopulations. For a family of 2-dimensional synthetic datasets, we empirically find that dataset separability plays a dominant role in subpopulation vulnerability for less separable datasets. However, well-separated datasets exhibit more dependence on individual subpopulation properties. We further discover that a crucial subpopulation property is captured by the difference in loss on the clean dataset between the clean model and a target model that misclassifies the subpopulation, and a subpopulation is much easier to attack if the loss difference is small. This property also generalizes to high-dimensional benchmark datasets. For the Adult benchmark dataset, we show that we can find semantically-meaningful subpopulation properties that are related to the susceptibilities of a selected group of subpopulations. The results in this paper are accompanied by a fully interactive web-based visualization of subpopulation poisoning attacks found at https://uvasrg.github.io/visualizing-poisoning
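The loss-difference property mentioned above can be made concrete with a small sketch. It computes the total hinge loss of two linear models on a clean dataset and takes the difference. The dataset, both models, and the choice of hinge loss are hypothetical assumptions for illustration, not the paper's setup.

```python
def hinge_loss(w, b, X, y):
    """Total hinge loss of a linear model on a dataset with labels in {-1, +1}."""
    return sum(
        max(0.0, 1.0 - yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b))
        for xi, yi in zip(X, y)
    )

# Toy clean dataset (hypothetical); the last two points form the subpopulation.
X = [(-2.0, 0.0), (-1.5, 1.0), (2.0, 0.0), (1.5, -1.0), (0.5, 2.0), (0.75, 2.5)]
y = [-1, -1, 1, 1, 1, 1]

clean_model = ((1.0, 0.5), 0.0)    # hypothetical (w, b) fit to the clean data
target_model = ((1.0, -0.5), 0.0)  # hypothetical (w, b) that misclassifies the subpopulation

# Loss on the clean data: target model minus clean model
loss_diff = hinge_loss(*target_model, X, y) - hinge_loss(*clean_model, X, y)
print("loss difference:", loss_diff)
```

According to the abstract, a subpopulation is much easier to attack when this difference is small, since the target model is then nearly as good a fit to the clean data as the clean model.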
A benchmark of categorical encoders for binary classification
Matteucci, Federico, Arzamasov, Vadim, Boehm, Klemens
Categorical encoders transform categorical features into numerical representations that are indispensable for a wide range of machine learning models. Existing encoder benchmark studies lack generalizability because of their limited choice of (1) encoders, (2) experimental factors, and (3) datasets. Additionally, inconsistencies arise from the adoption of varying aggregation strategies. This paper is the most comprehensive benchmark of categorical encoders to date, including an extensive evaluation of 32 configurations of encoders from diverse families, with 36 combinations of experimental factors, and on 50 datasets. The study shows the profound influence of dataset selection, experimental factors, and aggregation strategies on the benchmark's conclusions -- aspects disregarded in previous encoder benchmarks.
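To make the notion of a categorical encoder concrete, the sketch below implements two common encoder types, one-hot and mean-target encoding, in plain Python. The function names and toy data are hypothetical, and the abstract does not say which encoders or configurations the benchmark covers.

```python
from collections import defaultdict

def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector (columns sorted by category name)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

def mean_target_encode(values, targets):
    """Replace each category with the mean binary target observed for it."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]

colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical feature
labels = [1, 0, 1, 0, 1, 0]                              # hypothetical binary target

encoded, columns = one_hot_encode(colors)
print(columns)      # ['blue', 'green', 'red']
print(encoded[0])   # [0, 0, 1]
print(mean_target_encode(colors, labels))
```

Note that the mean-target encoder here uses the labels of the same rows it encodes, which leaks the target; in practice the statistics would be computed on training folds only.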
Interpretability in Machine Learning: on the Interplay with Explainability, Predictive Performances and Models
Leblanc, Benjamin, Germain, Pascal
In some areas such as the medical field, ML-assisted predictions or decisions can drastically impact human life. For example, breast cancer [131] can be devastating if not diagnosed in time (or at all). The use of black-box predictors in these crucial cases has proven problematic more than once; a classic example is the use of the COMPAS system by the US judicial system for predicting criminal recidivism [133]. Other cases where fairness has been jeopardized by the use of black boxes are numerous: job and loan applications biased toward men [40]; mortgage approval biased toward white applicants [122]; higher credit card limits for men [172]; etc. With time, it became clear that interpretability is crucial when it comes to understanding how a predictor behaves and thus preventing unfortunate events; as pointed out by Goodman and Flaxman [70]: "If we do not know how ML [predictors] work, we cannot check or regulate them to ensure that they do not encode discrimination against minorities [...], we will not be able to learn from instances in which it is mistaken."
Several fitness functions and entanglement gates in quantum kernel generation
Quantum machine learning (QML) represents a promising frontier in quantum technologies. In this pursuit of quantum advantage, the quantum kernel method for support vector machines has emerged as a powerful approach. Entanglement, a fundamental concept in quantum mechanics, assumes a central role in quantum computing. In this paper, we investigate the optimal number of entanglement gates in the quantum kernel feature maps by a multi-objective genetic algorithm. We use distinct genetic-algorithm fitness functions for non-local (entangling) gates and local gates to gain insight into the benefits of employing entanglement gates. Our experiments reveal that the optimal configuration of quantum circuits for the quantum kernel method incorporates a proportional number of non-local gates for entanglement. The result complements the prior literature on quantum kernel generation, where non-local gates were largely suppressed. Furthermore, we demonstrate that the separability indexes of data can be leveraged to estimate the number of non-local gates required for the quantum support vector machine's feature maps. This insight can be helpful in selecting appropriate parameters, such as the entanglement parameter, in various quantum programming packages like https://qiskit.org/ based on data analysis. Our findings offer valuable guidance for enhancing the efficiency and accuracy of quantum machine learning algorithms.
DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features
Belov, Vladimir, Erwin-Grabner, Tracy, Zeng, Ling-Li, Ching, Christopher R. K., Aleman, Andre, Amod, Alyssa R., Basgoze, Zeynep, Benedetti, Francesco, Besteher, Bianca, Brosch, Katharina, Bülow, Robin, Colle, Romain, Connolly, Colm G., Corruble, Emmanuelle, Couvy-Duchesne, Baptiste, Cullen, Kathryn, Dannlowski, Udo, Davey, Christopher G., Dols, Annemiek, Ernsting, Jan, Evans, Jennifer W., Fisch, Lukas, Fuentes-Claramonte, Paola, Gonul, Ali Saffet, Gotlib, Ian H., Grabe, Hans J., Groenewold, Nynke A., Grotegerd, Dominik, Hahn, Tim, Hamilton, J. Paul, Han, Laura K. M., Harrison, Ben J, Ho, Tiffany C., Jahanshad, Neda, Jamieson, Alec J., Karuk, Andriana, Kircher, Tilo, Klimes-Dougan, Bonnie, Koopowitz, Sheri-Michelle, Lancaster, Thomas, Leenings, Ramona, Li, Meng, Linden, David E. J., MacMaster, Frank P., Mehler, David M. A., Meinert, Susanne, Melloni, Elisa, Mueller, Bryon A., Mwangi, Benson, Nenadić, Igor, Ojha, Amar, Okamoto, Yasumasa, Oudega, Mardien L., Penninx, Brenda W. J. H., Poletti, Sara, Pomarol-Clotet, Edith, Portella, Maria J., Pozzi, Elena, Radua, Joaquim, Rodríguez-Cano, Elena, Sacchet, Matthew D., Salvador, Raymond, Schrantee, Anouk, Sim, Kang, Soares, Jair C., Solanes, Aleix, Stein, Dan J., Stein, Frederike, Stolicyn, Aleks, Thomopoulos, Sophia I., Toenders, Yara J., Uyar-Demir, Aslihan, Vieta, Eduard, Vives-Gilabert, Yolanda, Völzke, Henry, Walter, Martin, Whalley, Heather C., Whittle, Sarah, Winter, Nils, Wittfeld, Katharina, Wright, Margaret J., Wu, Mon-Ju, Yang, Tony T., Zarate, Carlos, Veltman, Dick J., Schmaal, Lianne, Thompson, Paul M., Goya-Maldonado, Roberto
Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate whether morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification performance of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%) when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating a site effect. In conclusion, the integration of vertex-wise morphometric features and the use of a non-linear classifier did not make MDD and HC distinguishable. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.
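Balanced accuracy, the metric reported above, is the mean of the per-class recalls, so chance level is 50% for two classes regardless of class imbalance. The sketch below shows the computation on made-up labels; the function name and data are hypothetical and unrelated to the study's sample.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced labels: 1 = MDD, 0 = HC
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # always predict the majority class

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("plain accuracy:", plain_accuracy)
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

A classifier that always predicts the majority class reaches 60% plain accuracy on these labels but only 50% balanced accuracy, which is why balanced accuracy is the more informative figure when groups differ in size.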
Learning Realistic Joint Space Boundaries for Range of Motion Analysis of Healthy and Impaired Human Arms
Keyvanian, Shafagh, Johnson, Michelle J., Figueroa, Nadia
A realistic human kinematic model that satisfies anatomical constraints is essential for human-robot interaction, biomechanics and robot-assisted rehabilitation. Modeling realistic joint constraints, however, is challenging, as human arm motion is constrained by joint limits, inter- and intra-joint dependencies, self-collisions, individual capabilities and muscular or neurological constraints, all of which are difficult to represent. Hence, physicians and researchers have relied on simple box constraints, ignoring important anatomical factors. In this paper, we propose a data-driven method to learn realistic anatomically constrained upper-limb range of motion (RoM) boundaries from motion capture data. This is achieved by fitting a one-class support vector machine to a dataset of upper-limb joint space exploration motions with an efficient hyper-parameter tuning scheme. Our approach outperforms similar works focused on valid RoM learning. Further, we propose an impairment index (II) metric that offers a quantitative assessment of capability/impairment when comparing healthy and impaired arms. We validate the metric on healthy subjects physically constrained to emulate hemiplegia and the different disability levels seen in stroke patients.
The Dark Side of the Language: Pre-trained Transformers in the DarkNet
Ranaldi, Leonardo, Nourbakhsh, Aria, Patrizi, Arianna, Ruzzetti, Elena Sofia, Onorati, Dario, Fallucchi, Francesca, Zanzotto, Fabio Massimo
Pre-trained Transformers are challenging human performance in many NLP tasks. The massive datasets used for pre-training seem to be the key to their success on existing tasks. In this paper, we explore how a range of pre-trained Natural Language Understanding models perform on definitely unseen sentences provided by classification tasks over a DarkNet corpus. Surprisingly, results show that syntactic and lexical neural networks perform on par with pre-trained Transformers even after fine-tuning. Only after what we call extreme domain adaptation, that is, retraining with the masked language model task on the entire novel corpus, do pre-trained Transformers reach their standard high results. This suggests that huge pre-training corpora may give Transformers unexpected help since they are exposed to many of the possible sentences.
Classification Methods Based on Machine Learning for the Analysis of Fetal Health Data
Regmi, Binod, Shah, Chiranjibi
The persistent battle to decrease childhood mortality serves as a commonly employed benchmark for gauging advancements in the field of medicine. Globally, under-5 mortality stands at approximately 5 million deaths, a significant portion of which are avoidable. Given the significance of this problem, machine learning-based techniques have emerged as a prominent tool for assessing fetal health. In this work, we analyze the classification performance of various machine learning models for fetal health analysis, including support vector machine (SVM), random forest (RF), and attentive interpretable tabular learning (TabNet). Moreover, dimensionality reduction techniques, such as principal component analysis (PCA) and linear discriminant analysis (LDA), have been implemented to obtain better classification performance with fewer features. A TabNet model on a fetal health dataset provides a classification accuracy of 94.36%. In general, this technology empowers doctors and healthcare experts to achieve precise fetal health classification and identify the most influential features in the process.