AITopics | Support Vector Machines

Collaborating Authors

Support Vector Machines

Support vector machines (SVMs, also support vector networks[1]) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Support Vector Machine Classifier with Rescaled Huberized Pinball Loss

Diao, Shibo

arXiv.org Machine LearningDec-1-2025

Support vector machines are widely used in machine learning classification tasks, but traditional SVM models suffer from sensitivity to outliers and instability in resampling, which limits their performance in practical applications. To address these issues, this paper proposes a novel rescaled Huberized pinball loss function with asymmetric, non-convex, and smooth properties. Based on this loss function, we develop a corresponding SVM model called RHPSVM (Rescaled Huberized Pinball Loss Support Vector Machine). Theoretical analyses demonstrate that RHPSVM conforms to Bayesian rules, has a strict generalization error bound, a bounded influence function, and controllable optimality conditions, ensuring excellent classification accuracy, outlier insensitivity, and resampling stability. Additionally, RHPSVM can be extended to various advanced SVM variants by adjusting parameters, enhancing its flexibility. We transform the non-convex optimization problem of RHPSVM into a series of convex subproblems using the concave-convex procedure (CCCP) and solve it with the ClipDCD algorithm, which is proven to be convergent. Experimental results on simulated data, UCI datasets, and small-sample crop leaf image classification tasks show that RHPSVM outperforms existing SVM models in both noisy and noise-free scenarios, especially in handling high-dimensional small-sample data.

loss function, support vector machine, vector machine, (12 more...)

arXiv.org Machine Learning

2511.22065

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Integrated Transcriptomic-proteomic Biomarker Identification for Radiation Response Prediction in Non-small Cell Lung Cancer Cell Lines

Yu, Yajun, Xu, Guoping, Jiang, Steve, Timmerman, Robert, Minna, John, Zhang, Yuanyuan, Peng, Hao

arXiv.org Artificial IntelligenceDec-1-2025

To develop an integrated transcriptome-proteome framework for identifying concurrent biomarkers predictive of radiation response, as measured by survival fraction at 2 Gy (SF2), in non-small cell lung cancer (NSCLC) cell lines. RNA sequencing (RNA-seq) and data-independent acquisition mass spectrometry (DIA-MS) proteomic data were collected from 73 and 46 NSCLC cell lines, respectively. Following preprocessing, 1,605 shared genes were retained for analysis. Feature selection was performed using least absolute shrinkage and selection operator (Lasso) regression with a frequency-based ranking criterion under five-fold cross-validation repeated ten times. Support vector regression (SVR) models were constructed using transcriptome-only, proteome-only, and combined transcriptome-proteome feature sets. Model performance was assessed by the coefficient of determination (R2) and root mean square error (RMSE). Correlation analyses evaluated concordance between RNA and protein expression and the relationships of selected biomarkers with SF2. RNA-protein expression exhibited significant positive correlations (median Pearson's r = 0.363). Independent pipelines identified 20 prioritized gene signatures from transcriptomic, proteomic, and combined datasets. Models trained on single-omic features achieved limited cross-omic generalizability, while the combined model demonstrated balanced predictive accuracy in both datasets (R2=0.461, RMSE=0.120 for transcriptome; R2=0.604, RMSE=0.111 for proteome). This study presents the first proteotranscriptomic framework for SF2 prediction in NSCLC, highlighting the complementary value of integrating transcriptomic and proteomic data. The identified concurrent biomarkers capture both transcriptional regulation and functional protein activity, offering mechanistic insights and translational potential.

artificial intelligence, cell line, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.22735

Country: North America > United States > Texas (0.15)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

A centroid based framework for text classification in itsm environments

Mohanna, Hossein, Ait-Bachir, Ali

arXiv.org Artificial IntelligenceNov-27-2025

Text classification with hierarchical taxonomies is a fundamental requirement in IT Service Management (ITSM) systems, where support tickets must be categorized into tree-structured taxonomies. We present a dual-embedding centroid-based classification framework that maintains separate semantic and lexical centroid representations per category, combining them through reciprocal rank fusion at inference time. The framework achieves performance competitive with Support Vector Machines (hierarchical F1: 0.731 vs 0.727) while providing interpretability through centroid representations. Evaluated on 8,968 ITSM tickets across 123 categories, this method achieves 5.9 times faster training and up to 152 times faster incremental updates. With 8.6-8.8 times speedup across batch sizes (100-1000 samples) when excluding embedding computation. These results make the method suitable for production ITSM environments prioritizing interpretability and operational efficiency.

classification, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2511.20667

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Enterprise Applications > IT Service Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Efficient Penalty-Based Bilevel Methods: Improved Analysis, Novel Updates, and Flatness Condition

Jiang, Liuyuan, Xiao, Quan, Chen, Lisha, Chen, Tianyi

arXiv.org Machine LearningNov-24-2025

Penalty-based methods have become popular for solving bilevel optimization (BLO) problems, thanks to their effective first-order nature. However, they often require inner-loop iterations to solve the lower-level (LL) problem and small outer-loop step sizes to handle the increased smoothness induced by large penalty terms, leading to suboptimal complexity. This work considers the general BLO problems with coupled constraints (CCs) and leverages a novel penalty reformulation that decouples the upper- and lower-level variables. This yields an improved analysis of the smoothness constant, enabling larger step sizes and reduced iteration complexity for Penalty-Based Gradient Descent algorithms in ALTernating fashion (ALT-PBGD). Building on the insight of reduced smoothness, we propose PBGD-Free, a novel fully single-loop algorithm that avoids inner loops for the uncoupled constraint BLO. For BLO with CCs, PBGD-Free employs an efficient inner-loop with substantially reduced iteration complexity. Furthermore, we propose a novel curvature condition describing the "flatness" of the upper-level objective with respect to the LL variable. This condition relaxes the traditional upper-level Lipschitz requirement, enables smaller penalty constant choices, and results in a negligible penalty gradient term during upper-level variable updates. We provide rigorous convergence analysis and validate the method's efficacy through hyperparameter optimization for support vector machines and fine-tuning of large language models.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2511.16796

Country: North America > United States > New York (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)
(2 more...)

Add feedback

Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain

Neural Information Processing SystemsNov-21-2025, 16:22:41 GMT

Many applications of machine learning involve structured output with large domain, where learning of structured predictor is prohibitive due to repetitive calls to expensive inference oracle. In this work, we show that, by decomposing training of Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace expensive structured oracle with Factorwise Maximization Oracle (FMO) that allows efficient implementation of complexity sublinear to the factor domain. A Greedy Direction Method of Multiplier (GDMM) algorithm is proposed to exploit sparsity of messages which guarantees $\epsilon$ sub-optimality after $O(log(1/\epsilon))$ passes of FMO calls. We conduct experiments on chain-structured problems and fully-connected problems of large output domains. The proposed approach is orders-of-magnitude faster than the state-of-the-art training algorithms for Structural SVM.

dual decomposed learning, factorwise oracle, structural svm, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)

Add feedback

Parametric Simplex Method for Sparse Learning

Neural Information Processing SystemsNov-21-2025, 16:17:59 GMT

High dimensional sparse learning has imposed a great computational challenge to large scale data analysis. In this paper, we investiage a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. Particularly, we demonstrate the superiority of PSM over various sparse learning approaches, including Dantzig selector for sparse linear regression, sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions such that its computational performance can be significantly boosted. Thorough numerical experiments are provided to demonstrate the outstanding performance of the PSM method.

name change, parametric simplex method, sparse learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)

Add feedback

Variational Autoencoder for Deep Learning of Images, Labels and Captions

Neural Information Processing SystemsNov-21-2025, 15:33:24 GMT

A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is used as an image encoder; the CNN is used to approximate a distribution for the latent DGDN features/code. The latent code is also linked to generative models for labels (Bayesian support vector machine) or captions (recurrent neural network). When predicting a label/caption for a new image at test, averaging is performed across the distribution of latent codes; this is computationally efficient as a consequence of the learned CNN-based encoder. Since the framework is capable of modeling the image in the presence/absence of associated labels/captions, a new semi-supervised setting is manifested for CNN learning with images; the framework even allows unsupervised CNN learning, based on images alone.

deep learning, name change, variational autoencoder, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)

Add feedback

Max-Margin Invariant Features from Transformed Unlabelled Data

Dipan Pal, Ashwin Kannan, Gautam Arakalgud, Marios Savvides

Neural Information Processing SystemsNov-21-2025, 10:41:26 GMT

The study of representations invariant to common transformations of the data is important to learning.

artificial intelligence, machine learning, transformation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.48)

Add feedback

Process-constrained batch Bayesian optimisation

Pratibha Vellanki, Santu Rana, Sunil Gupta, David Rubin, Alessandra Sutti, Thomas Dorin, Murray Height, Paul Sanders, Svetha Venkatesh

Neural Information Processing SystemsNov-21-2025, 05:32:30 GMT

Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations making it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to experimentally change need to be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested).

artificial intelligence, machine learning, optimisation, (17 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (0.14)
North America > United States > Michigan (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Industry: Health & Medicine (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

Learning Confidence Sets using Support Vector Machines

Neural Information Processing SystemsNov-21-2025, 03:53:33 GMT

The goal of confidence-set learning in the binary classification setting is to construct two sets, each with a specific probability guarantee to cover a class. An observation outside the overlap of the two sets is deemed to be from one of the two classes, while the overlap is an ambiguity region which could belong to either class. Instead of plug-in approaches, we propose a support vector classifier to construct confidence sets in a flexible manner. Theoretically, we show that the proposed learner can control the non-coverage rates and minimize the ambiguity with high probability. Efficient algorithms are developed and numerical studies illustrate the effectiveness of the proposed method.

artificial intelligence, learning confidence set, machine learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback