 Support Vector Machines


Generalization Bounds for Stochastic Gradient Descent via Localized \varepsilon-Covers

Neural Information Processing Systems

In this paper, we propose a new covering technique localized for the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in exponential dimension dependency. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with P pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by O(\sqrt{(\log n\log(nP))/n}), where n is the number of data samples. In particular, this rate is independent of dimension and requires neither early stopping nor a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and K-means clustering for both hard and soft label setups, improving the previously known state-of-the-art rates.
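To give a feel for the stated rate, the following minimal sketch evaluates \sqrt{(\log n\log(nP))/n} for a few sample sizes. The function name `sgd_bound_rate` is hypothetical, and the constants hidden by the O(.) notation are omitted, so the numbers only illustrate how the rate shrinks with n and grows slowly with P; they are not the bound itself.

```python
import math


def sgd_bound_rate(n: int, num_pieces: int) -> float:
    """Evaluate sqrt(log(n) * log(n * P) / n), ignoring the O(.) constants.

    n is the number of data samples and num_pieces is P, the number of
    pieces of the piecewise strongly convex and smooth function.
    """
    if n < 2 or num_pieces < 1:
        raise ValueError("need n >= 2 and num_pieces >= 1")
    return math.sqrt(math.log(n) * math.log(n * num_pieces) / n)


# The expression has no dimension argument: it depends only on n and P.
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, sgd_bound_rate(n, num_pieces=10))
```

Note that the dimension of the parameter space never enters the computation, which is the point the abstract makes about dimension independence.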


Joker: Joint Optimization Framework for Lightweight Kernel Machines

arXiv.org Artificial Intelligence

Kernel methods are powerful tools for nonlinear learning with well-established theory. Scalability has been their long-standing challenge. Despite existing successes, large-scale kernel methods have two limitations: (i) the memory overhead is too high for users to afford; (ii) existing efforts mainly focus on kernel ridge regression (KRR), while other models lack study. In this paper, we propose Joker, a joint optimization framework for diverse kernel models, including KRR, logistic regression, and support vector machines. We design a dual block coordinate descent method with trust region (DBCD-TR) and adopt kernel approximation with randomized features, leading to low memory costs and high efficiency in large-scale learning. Experiments show that Joker saves up to 90% memory while achieving training time and performance comparable to, or even better than, the state-of-the-art methods.
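The abstract does not say which randomized features Joker uses. As one common instance of the idea, the sketch below assumes random Fourier features for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); the names `make_rff_map` and `rbf_kernel` are hypothetical. It shows why such features reduce memory: n samples are stored as n-by-D feature vectors rather than as an n-by-n kernel matrix.

```python
import math
import random


def make_rff_map(dim, num_features, gamma, seed=0):
    """Build a random Fourier feature map for the RBF kernel (an assumption,
    not necessarily the approximation used by Joker)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma * I), the spectral density
    # of exp(-gamma * ||x - y||^2); offsets are uniform on [0, 2 * pi).
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


feature_map = make_rff_map(dim=3, num_features=2000, gamma=0.5)
x, y = [0.3, -0.2, 0.5], [0.1, 0.4, 0.0]
approx = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
print("approximate:", approx, "exact:", rbf_kernel(x, y, 0.5))
```

The inner product of two feature vectors approximates the kernel value, with an error that shrinks roughly as one over the square root of the number of features.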


Uncertainty Quantification in SVM prediction

arXiv.org Machine Learning

This paper explores Uncertainty Quantification (UQ) in SVM predictions, particularly for regression and forecasting tasks. Unlike neural network solutions, SVM solutions are typically more stable, sparse, optimal, and interpretable. However, only a few studies address UQ in SVM prediction. First, we provide a comprehensive summary of existing Prediction Interval (PI) estimation and probabilistic forecasting methods developed in the SVM framework and evaluate them against the key properties expected from an ideal PI model. We find that none of the existing SVM PI models achieves a sparse solution. To introduce sparsity into the SVM model, we propose the Sparse Support Vector Quantile Regression (SSVQR) model, which constructs PIs and probabilistic forecasts by solving a pair of linear programs. Further, we develop a feature selection algorithm for PI estimation using SSVQR that effectively eliminates a significant number of features while improving PI quality in the case of high-dimensional datasets. Finally, we extend the SVM models to the Conformal Regression setting to obtain more stable prediction sets with finite test set guarantees. Extensive experiments on artificial and real-world benchmark datasets compare the characteristics of both existing and proposed SVM-based PI estimation methods and also highlight the advantages of feature selection in PI estimation. Furthermore, we compare both the existing and proposed SVM-based PI estimation models with modern deep learning models for probabilistic forecasting tasks on benchmark datasets. In our experiments, SVM models show comparable or superior performance to modern complex deep learning models for probabilistic forecasting tasks.
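The abstract does not spell out its conformal construction. As a hedged illustration of the general idea, the sketch below implements standard split conformal regression with absolute-residual scores; it assumes a point predictor (for example an SVR model) has already been fitted elsewhere and that its residuals on a held-out calibration set are available. The function names are hypothetical.

```python
import math


def conformal_quantile(calibration_residuals, alpha):
    """Return the split-conformal radius for miscoverage level alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual,
    which is the standard finite-sample correction. If that rank exceeds
    n, the interval is unbounded.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf
    return scores[rank - 1]


def prediction_interval(point_prediction, radius):
    """Symmetric interval around the point prediction of the fitted model."""
    return (point_prediction - radius, point_prediction + radius)


# Hypothetical calibration residuals (observed value minus prediction).
residuals = [0.1, -0.4, 0.3, -0.2, 0.5, 0.05, -0.15, 0.25, -0.35, 0.45]
radius = conformal_quantile(residuals, alpha=0.2)
print(prediction_interval(2.0, radius))
```

The symmetric interval is the simplest choice; a quantile-regression model such as the one proposed in the abstract would typically supply asymmetric lower and upper bounds instead.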


High-Dimensional Analysis of Bootstrap Ensemble Classifiers

arXiv.org Machine Learning

Bootstrap methods have long been a cornerstone of ensemble learning in machine learning. This paper presents a theoretical analysis of bootstrap techniques applied to the Least Square Support Vector Machine (LSSVM) ensemble in the context of large and growing sample sizes and feature dimensionalities. Leveraging tools from Random Matrix Theory, we investigate the performance of this classifier that aggregates decision functions from multiple weak classifiers, each trained on different subsets of the data. We provide insights into the use of bootstrap methods in high-dimensional settings, enhancing our understanding of their impact. Based on these findings, we propose strategies to select the number of subsets and the regularization parameter that maximize the performance of the LSSVM. Empirical experiments on synthetic and real-world datasets validate our theoretical results.


QSVM-QNN: Quantum Support Vector Machine Based Quantum Neural Network Learning Algorithm for Brain-Computer Interfacing Systems

arXiv.org Artificial Intelligence

A brain-computer interface (BCI) system enables direct communication between the brain and external devices, offering significant potential for assistive technologies and advanced human-computer interaction. Despite progress, BCI systems face persistent challenges, including signal variability, classification inefficiency, and difficulty adapting to individual users in real time. In this study, we propose a novel hybrid quantum learning model, termed QSVM-QNN, which integrates a Quantum Support Vector Machine (QSVM) with a Quantum Neural Network (QNN), to improve classification accuracy and robustness in EEG-based BCI tasks. Unlike existing models, QSVM-QNN combines the decision boundary capabilities of QSVM with the expressive learning power of QNN, leading to superior generalization performance. The proposed model is evaluated on two benchmark EEG datasets, achieving high accuracies of 0.990 and 0.950, outperforming both classical and standalone quantum models. To demonstrate real-world viability, we further validated the robustness of QNN, QSVM, and QSVM-QNN against six realistic quantum noise models, including bit flip and phase damping. These experiments reveal that QSVM-QNN maintains stable performance under noisy conditions, establishing its applicability for deployment in practical, noisy quantum environments. Beyond BCI, the proposed hybrid quantum architecture is generalizable to other biomedical and time-series classification tasks, offering a scalable and noise-resilient solution for next-generation neurotechnological systems.


Decoupling Collision Avoidance in and for Optimal Control using Least-Squares Support Vector Machines

arXiv.org Artificial Intelligence

This paper details an approach to linearise differentiable but non-convex collision avoidance constraints tailored to convex shapes. It revisits introducing differential collision avoidance constraints for convex objects into an optimal control problem (OCP) using the separating hyperplane theorem. By framing this theorem as a classification problem, the hyperplanes are eliminated as optimisation variables from the OCP. This effectively transforms non-convex constraints into linear constraints. A bi-level algorithm computes the hyperplanes between the iterations of an optimisation solver and subsequently embeds them as parameters into the OCP. Experiments demonstrate the approach's favourable scalability towards cluttered environments and its applicability to various motion planning approaches. It decreases trajectory computation times by between 50% and 90% compared to a state-of-the-art approach that directly includes the hyperplanes as variables in the optimal control problem. Deploying autonomous robots in practical and real-life settings, e.g., a warehouse, industrial production cell, homes, etc., is a complex problem with many interesting challenges that remain.
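The step of treating the separating hyperplane theorem as a classification problem can be sketched as follows. The paper uses least-squares SVMs; for brevity this example swaps in a plain perceptron, which also returns a separating hyperplane whenever the two vertex sets are linearly separable. The function name and the two example squares are hypothetical.

```python
def separating_hyperplane(points_a, points_b, max_epochs=1000):
    """Find (w, b) with w.x + b > 0 on points_a and w.x + b < 0 on points_b.

    Perceptron updates are used here as a simple stand-in for the
    least-squares SVM classifier named in the abstract.
    """
    dim = len(points_a[0])
    w = [0.0] * dim
    b = 0.0
    labelled = [(p, 1.0) for p in points_a] + [(p, -1.0) for p in points_b]
    for _ in range(max_epochs):
        updated = False
        for x, y in labelled:
            margin = y * (sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            if margin <= 0.0:
                # Misclassified (or on the plane): move the plane towards x.
                w = [w_i + y * x_i for w_i, x_i in zip(w, x)]
                b += y
                updated = True
        if not updated:
            return w, b
    raise RuntimeError("no separating hyperplane found; the sets may overlap")


# Vertices of two disjoint convex shapes (unit squares, chosen for illustration).
robot = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
obstacle = [(3.0, 0.0), (4.0, 0.0), (4.1, 1.0), (3.0, 1.0)]
w, b = separating_hyperplane(robot, obstacle)
print("w =", w, "b =", b)
```

Because both shapes are convex, separating their vertices separates the shapes themselves. Once (w, b) is fixed, the conditions w.x + b > 0 and w.x + b < 0 are linear in the vertex positions, which is how the hyperplane can enter an OCP as a parameter rather than a variable.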


Efficient Machine Unlearning by Model Splitting and Core Sample Selection

arXiv.org Machine Learning

Machine unlearning is essential for meeting legal obligations such as the right to be forgotten, which requires the removal of specific data from machine learning models upon request. While several approaches to unlearning have been proposed, existing solutions often struggle with efficiency and, more critically, with the verification of unlearning, particularly in the case of weak unlearning guarantees, where verification remains an open challenge. We introduce a generalized variant of the standard unlearning metric that enables more efficient and precise unlearning strategies. We also present an unlearning-aware training procedure that, in many cases, allows for exact unlearning. We term our approach MaxRR. When exact unlearning is not feasible, MaxRR still supports efficient unlearning with properties closely matching those achieved through full retraining.


GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models

arXiv.org Artificial Intelligence

Background and Objective. Research in cross-modal medical image translation has been very productive over the past few years, using the promising performance of GAN-based architectures to tackle the scarce availability of large curated multimodality datasets. However, only a few of these studies assessed the task-based performance of these synthetic data, especially for the training of deep models. Method. We design and compare different GAN-based frameworks for generating synthetic brain [18F]fluorodeoxyglucose (FDG) PET images from T1-weighted MRI data. We first perform standard qualitative and quantitative visual quality evaluation. Then, we explore the further impact of using these fake PET data in the training of a deep unsupervised anomaly detection (UAD) model designed to detect subtle epilepsy lesions in T1 MRI and FDG PET images. We introduce novel diagnostic task-oriented quality metrics of the synthetic FDG PET data tailored to our unsupervised detection task, then use these fake data to train a use case UAD model combining deep representation learning based on siamese autoencoders with an OC-SVM density support estimation model. This model is trained on normal subjects only and allows the detection of any variation from the pattern of the normal population. We compare the detection performance of models trained on 35 real T1 MR images of normal subjects paired with either 35 true PET images or 35 synthetic PET images generated by the best-performing generative models. Performance analysis is conducted on 17 exams of epilepsy patients undergoing surgery. Results. The best-performing GAN-based models generate realistic fake PET images of control subjects, with SSIM and PSNR values around 0.9 and 23.8, respectively, that are in distribution (ID) with regard to the true control dataset. The best UAD model trained on these synthetic normative PET data reaches 74% sensitivity. Conclusion. Our results confirm that GAN-based models are the best suited for MR T1 to FDG PET translation, outperforming transformer or diffusion models. We also demonstrate the diagnostic value of these synthetic data for the training of UAD models and evaluation on clinical exams of epilepsy patients. Our code and the normative image dataset are available.


Predictive Digital Twins for Thermal Management Using Machine Learning and Reduced-Order Models

arXiv.org Artificial Intelligence

Digital twins enable real-time simulation and prediction in engineering systems. This paper presents a novel framework for predictive digital twins of a headlamp heatsink, integrating physics-based reduced-order models (ROMs) from computational fluid dynamics (CFD) with supervised machine learning. A component-based ROM library, derived via proper orthogonal decomposition (POD), captures thermal dynamics efficiently. Machine learning models, including Decision Trees, k-Nearest Neighbors, Support Vector Regression (SVR), and Neural Networks, predict optimal ROM configurations, enabling rapid digital twin updates. The Neural Network achieves a mean absolute error (MAE) of 54.240, outperforming other models. Quantitative comparisons of predicted and original values demonstrate high accuracy. This scalable, interpretable framework advances thermal management in automotive systems, supporting robust design and predictive maintenance.


Estimating Quality in Therapeutic Conversations: A Multi-Dimensional Natural Language Processing Framework

arXiv.org Artificial Intelligence

Engagement between client and therapist is a critical determinant of therapeutic success. We propose a multi-dimensional natural language processing (NLP) framework that objectively classifies engagement quality in counseling sessions based on textual transcripts. Using 253 motivational interviewing transcripts (150 high-quality, 103 low-quality), we extracted 42 features across four domains: conversational dynamics, semantic similarity as topic alignment, sentiment classification, and question detection. Classifiers, including Random Forest (RF), CatBoost, and Support Vector Machines (SVM), were hyperparameter-tuned and trained using stratified 5-fold cross-validation and evaluated on a holdout test set. On balanced (non-augmented) data, RF achieved the highest classification accuracy (76.7%), and SVM achieved the highest AUC (85.4%). After SMOTE-Tomek augmentation, performance improved significantly: RF achieved up to 88.9% accuracy, 90.0% F1-score, and 94.6% AUC, while SVM reached 81.1% accuracy, 83.1% F1-score, and 93.6% AUC. The augmented data results reflect the potential of the framework in future larger-scale applications. Feature contribution analysis revealed that conversational dynamics and semantic similarity between clients and therapists were among the top contributors, led by words uttered by the client (mean and standard deviation). The framework was robust across the original and augmented datasets and demonstrated consistent improvements in F1 scores and recall. While currently text-based, the framework supports future multimodal extensions (e.g., vocal tone, facial affect) for more holistic assessments. This work introduces a scalable, data-driven method for evaluating the engagement quality of therapy sessions, offering clinicians real-time feedback to enhance the quality of both virtual and in-person therapeutic interactions.
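The stratified 5-fold split mentioned above can be sketched in a few lines. This is a minimal illustration with a hypothetical function name, not the authors' pipeline: it is applied to all 253 transcripts purely for demonstration, since the abstract does not state how large the holdout test set was.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving class proportions.

    Indices of each class are shuffled and then dealt to the folds in
    round-robin order, so every fold receives a near-equal share of
    every class.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Class counts taken from the abstract: 150 high-quality, 103 low-quality.
labels = ["high"] * 150 + ["low"] * 103
for fold in stratified_folds(labels):
    n_high = sum(1 for i in fold if labels[i] == "high")
    print(len(fold), "samples,", n_high, "high-quality")
```

Each fold serves once as the validation split while the remaining four are used for training; keeping the class ratio fixed across folds matters here because the two classes are not the same size.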