AITopics | Support Vector Machines

Collaborating Authors

Support Vector Machines

Support vector machines (SVMs, also support vector networks[1]) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Interpretable Kernels

Groenen, Patrick J. F., Greenacre, Michael

arXiv.org Machine LearningAug-25-2025

The use of kernels for nonlinear prediction is widespread in machine learning. They have been popularized in support vector machines and used in kernel ridge regression, amongst others. Kernel methods share three aspects. First, instead of the original matrix of predictor variables or features, each observation is mapped into an enlarged feature space. Second, a ridge penalty term is used to shrink the coefficients on the features in the enlarged feature space. Third, the solution is not obtained in this enlarged feature space, but through solving a dual problem in the observation space. A major drawback in the present use of kernels is that the interpretation in terms of the original features is lost. In this paper, we argue that in the case of a wide matrix of features, where there are more features than observations, the kernel solution can be re-expressed in terms of a linear combination of the original matrix of features and a ridge penalty that involves a special metric. Consequently, the exact same predicted values can be obtained as a weighted linear combination of the features in the usual manner and thus can be interpreted. In the case where the number of features is less than the number of observations, we discuss a least-squares approximation of the kernel matrix that still allows the interpretation in terms of a linear combination. It is shown that these results hold for any function of a linear combination that minimizes the coefficients and has a ridge penalty on these coefficients, such as in kernel logistic regression and kernel Poisson regression. This work makes a contribution to interpretable artificial intelligence.

artificial intelligence, kernel, machine learning, (19 more...)

arXiv.org Machine Learning

2508.15932

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Netherlands > South Holland > Rotterdam (0.04)

Genre:

Research Report > New Finding (0.50)
Research Report > Experimental Study (0.50)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback

A State-Space Approach to Nonstationary Discriminant Analysis

Xie, Shuilian, Imani, Mahdi, Dougherty, Edward R., Braga-Neto, Ulisses M.

arXiv.org Artificial IntelligenceAug-25-2025

Classical discriminant analysis assumes identically distributed training data, yet in many applications observations are collected over time and the class-conditional distributions drift. This population drift renders stationary classifiers unreliable. We propose a principled, model-based framework that embeds discriminant analysis within state-space models to obtain nonstationary linear discriminant analysis (NSLDA) and nonstationary quadratic discriminant analysis (NSQDA). For linear-Gaussian dynamics, we adapt Kalman smoothing to handle multiple samples per time step and develop two practical extensions: (i) an expectation-maximization (EM) approach that jointly estimates unknown system parameters, and (ii) a Gaussian mixture model (GMM)-Kalman method that simultaneously recovers unobserved time labels and parameters, a scenario common in practice. To address nonlinear or non-Gaussian drift, we employ particle smoothing to estimate time-varying class centroids, yielding fully nonstationary discriminant rules. Extensive simulations demonstrate consistent improvements over stationary linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) baselines, with robustness to noise, missing data, and class imbalance. This paper establishes a unified and data-efficient foundation for discriminant analysis under temporal distribution shift.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.16073

Country: Europe (0.28)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Mining Mental Health Signals: A Comparative Study of Four Machine Learning Methods for Depression Detection from Social Media Posts in Sorani Kurdish

Mohammed, Idrees, Hassani, Hossein

arXiv.org Artificial IntelligenceAug-25-2025

Depression is a common mental health condition that can lead to hopelessness, loss of interest, self-harm, and even suicide. Early detection is challenging due to individuals not self-reporting or seeking timely clinical help. With the rise of social media, users increasingly express emotions online, offering new opportunities for detection through text analysis. While prior research has focused on languages such as English, no studies exist for Sorani Kurdish. This work presents a machine learning and Natural Language Processing (NLP) approach to detect depression in Sorani tweets. A set of depression-related keywords was developed with expert input to collect 960 public tweets from X (Twitter platform). The dataset was annotated into three classes: Shows depression, Not-show depression, and Suspicious by academics and final year medical students at the University of Kurdistan Hewlêr. Four supervised models, including Support Vector Machines, Multinomial Naive Bayes, Logistic Regression, and Random Forest, were trained and evaluated, with Random Forest achieving the highest performance accuracy and F1-score of 80%. This study establishes a baseline for automated depression detection in Kurdish language contexts.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2508.15829

Country: Asia > Middle East > Iraq > Kurdistan Region (0.35)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Add feedback

Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music

Kim, Alexander, Botha, Charlotte

arXiv.org Artificial IntelligenceAug-22-2025

For singers of all experience levels, one of the most fun and daunting challenges in learning, technical repertoire is navigating placement and vocal register in and around the passagio (passage between chest voice and head voice registers). Contemporary Pop and Musical Theater solos increasingly demand strong command through and above the first passagio, and the use of various timbre and textures to achieve a desired quality. Thus, it can be difficult to identify what vocal register within the vocal range a singer is using even for advanced vocalists. This paper presents two methods for classifying vocal registers in an audio signal of male pop music through the end-to-end analysis of textural features of mel-spectrogram images. Additionally, we will discuss the practical integration of these models for vocal analysis tools, and introduce a concurrently developed software called AVRA which stands for Automatic Vocal Register Analysis. Our proposed methods achieved consistent classification of vocal register through both Support Vector Machine (SVM) and Convolutional Neural Network (CNN) models, which shows promise for robust classification possibilities across a greater range of voice types and genre.

artificial intelligence, machine learning, vocal register, (14 more...)

arXiv.org Artificial Intelligence

2505.11378

Genre: Research Report (0.51)

Industry:

Media > Music (0.72)
Leisure & Entertainment (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)

Add feedback

cb1ba6a42814bf83974ed45ffdb72efa-Paper-Conference.pdf

Neural Information Processing SystemsAug-21-2025, 17:37:53 GMT

data quality, diffusion model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia (0.14)
North America > United States (0.14)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.45)

Add feedback

Multiclass Learning from Contradictions

Sauptik Dhar, Vladimir Cherkassky, Mohak Shah

Neural Information Processing SystemsAug-20-2025, 09:57:26 GMT

We introduce the notion of learning from contradictions, a.k.a

formulation, mu-svm, universum sample, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada (0.04)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.50)

Add feedback

Classifying Clinical Outcome of Epilepsy Patients with Ictal Chirp Embeddings

Bahador, Nooshin, Lankarany, Milad

arXiv.org Artificial IntelligenceAug-20-2025

This study presents a pipeline leveraging t-Distributed Stochastic Neighbor Embedding (t-SNE) for interpretable visualizations of chirp features across diverse outcome scenarios. The dataset, comprising chirp-based temporal, spectral, and frequency metrics. Using t-SNE, local neighborhood relationships were preserved while addressing the crowding problem through Student t-distribution-based similarity optimization. Three classification tasks were formulated on the 2D t-SNE embeddings: (1) distinguishing clinical success from failure/no-resection, (2) separating high-difficulty from low-difficulty cases, and (3) identifying optimal cases, defined as successful outcomes with minimal clinical difficulty. Four classifiers, namely, Random Forests, Support Vector Machines, Logistic Regression, and k-Nearest Neighbors, were trained and evaluated using stratified 5-fold cross-validation. Across tasks, the Random Forest and k-NN classifiers demonstrated superior performance, achieving up to 88.8% accuracy in optimal case detection (successful outcomes with minimal clinical difficulty). Additionally, feature influence sensitivity maps were generated using SHAP explanations applied to model predicting t-SNE coordinates, revealing spatially localized feature importance within the embedding space. These maps highlighted how specific chirp attributes drive regional clustering and class separation, offering insights into the latent structure of the data. The integrated framework showcases the potential of interpretable embeddings and local feature attribution for clinical stratification and decision support.

artificial intelligence, logistic regression, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.13476

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Epilepsy (1.00)
Health & Medicine > Therapeutic Area > Genetic Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback

Dynamic Design of Machine Learning Pipelines via Metalearning

Alcobaça, Edesio, de Carvalho, André C. P. L. F.

arXiv.org Artificial IntelligenceAug-20-2025

Automated Machine Learning (AutoML) has become an essential tool for democratizing machine learning (ML) by automating key aspects of model selection, hyperparameter tuning, and feature engineering [1, 2]. However, the efficiency of AutoML frameworks remains a significant challenge, as the search for optimal configurations is often computationally expensive [3-5]. Traditional search strategies, such as Random Search (RS) and Bayesian Optimization (BO), indiscriminately explore large search spaces, resulting in high resource consumption [3, 6, 7]. To address this challenge, we propose a metalearning approach that dynamically designs search spaces for an AutoML solution, reducing computational costs while maintaining competitive predictive performance. The proposed method leverages historical metaknowledge to identify and prioritize promising regions of the search space, enabling more efficient optimization. By predicting the performance of preprocessor-classifier combinations, a meta-model, induced using metalearning, can provide a warm-start advantage, accelerating the AutoML search process. This study evaluates the effectiveness of the proposed approach through an extensive set of experiments, analyzing both computational efficiency and predictive performance. According to the experimental results, the dynamically generated search spaces significantly reduce runtime, while maintaining high-quality solutions. In particular, the RS-mtl-95 configuration achieved an 89% reduction in runtime without compromising predictive performance.

artificial intelligence, machine learning, search space, (17 more...)

arXiv.org Artificial Intelligence

2508.13436

Country:

Europe (1.00)
North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization

Chen, Shengzhuang, Ouyang, Xu, Pearce, Michael Arthur Leopold, Hartvigsen, Thomas, Schwarz, Jonathan Richard

arXiv.org Machine LearningAug-19-2025

Determining the optimal data mixture for large language model training remains a challenging problem with an outsized impact on performance. In practice, language model developers continue to rely on heuristic exploration since no learning-based approach has emerged as a reliable solution. In this work, we propose to view the selection of training data mixtures as a black-box hyperparameter optimization problem, for which Bayesian Optimization is a well-established class of appropriate algorithms. Firstly, we cast data mixture learning as a sequential decision-making problem, in which we aim to find a suitable trade-off between the computational cost of training exploratory (proxy-) models and final mixture performance. Secondly, we systematically explore the properties of transferring mixtures learned at a small scale to larger-scale experiments, providing insights and highlighting opportunities for research at a modest scale. By proposing Multi-fidelity Bayesian Optimization as a suitable method in this common scenario, we introduce a natural framework to balance experiment cost with model fit, avoiding the risks of overfitting to smaller scales while minimizing the number of experiments at high cost. We present results for pre-training and instruction finetuning across models ranging from 1 million to 7 billion parameters, varying from simple architectures to state-of-the-art models and benchmarks spanning dozens of datasets. We demonstrate consistently strong results relative to a wide range of baselines, resulting inspeed-ups of over 500% in determining the best data mixture on our largest experiments. In addition, we broaden access to research by sharing ADMIRE IFT Runs, a dataset of 460 full training & evaluation runs worth over 13,000 GPU hours, greatly reducing the cost of conducting research in this area.

data mixture, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2508.11551

Country:

North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
(3 more...)

Add feedback

PCA- and SVM-Grad-CAM for Convolutional Neural Networks: Closed-form Jacobian Expression

Omae, Yuto

arXiv.org Artificial IntelligenceAug-19-2025

Convolutional Neural Networks (CNNs) are an effective approach for classification tasks, particularly when the training dataset is large. Although CNNs have long been considered a black-box classification method, they can be used as a white-box method through visualization techniques such as Grad-CAM. When training samples are limited, incorporating a Principal Component Analysis (PCA) layer and/or a Support Vector Machine (SVM) classifier into a CNN can effectively improve classification performance. However, traditional Grad-CAM cannot be directly applied to PCA and/or SVM layers. It is important to generate attention regions for PCA and/or SVM layers in CNNs to facilitate the development of white-box methods. Therefore, we propose ``PCA-Grad-CAM'', a method for visualizing attention regions in PCA feature vectors, and ``SVM-Grad-CAM'', a method for visualizing attention regions in an SVM classifier layer. To complete our methods analytically, it is necessary to solve the closed-form Jacobian consisting of partial derivatives from the last convolutional layer to the PCA and/or SVM layers. In this paper, we present the exact closed-form Jacobian and the visualization results of our methods applied to several major datasets.

artificial intelligence, machine learning, principal component, (16 more...)

arXiv.org Artificial Intelligence

2508.1188

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.93)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback