AITopics | kernel machine

Support Vector Generation: Kernelizing Zero-Shot Classifiers from Pre-Trained Language Models

Neural Information Processing SystemsJun-19-2026, 15:13:25 GMT

We introduce Support Vector Generation (SVG), a kernel-based framework that converts a frozen language model into an interpretable, training-free classifier for zero-and few-shot learning. SVG operates by combining Metropolis-Hastings sampling with support vector machine optimization in the reproducing kernel Hilbert space (RKHS) induced by the language model's embedding. Each classification decision is based on a weighted combination of at most 32 natural-language sentences, which serve as explicit support vectors and provide faithful rationales. Our theoretical analysis proves that SVG minimizes the empirical hinge loss over the span of the supports and admits a generalization bound independent of the language model size. Experiments on the GLUE benchmark show that SVG matches or surpasses prompting-based zero-shot baselines in accuracy across multiple tasks--without any fine-tuning or GPU acceleration. Notably, our CPU-only implementation completes training in under three minutes per task, and maintains competitive inference speed. These results suggest that SVG offers a viable path toward efficient, interpretable NLP systems under compute constraints.

justification, large language model, machine learning, (16 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Fast Training of Large Kernel Models with Delayed Projections

Neural Information Processing SystemsJun-11-2026, 22:25:55 GMT

Classical kernel machines have historically faced significant challenges in scaling to large datasets and model sizes--a key ingredient that has driven the success of neural networks. In this paper, we present a new methodology for building kernel machines that can scale efficiently with both data size and model size. Our algorithm introduces delayed projections to Preconditioned Stochastic Gradient Descent (PSGD) allowing the training of much larger models than was previously feasible.

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)

Add feedback

Analyzing Generalization of Neural Networks through Loss Path Kernels

Neural Information Processing SystemsApr-30-2026, 01:33:24 GMT

Deep neural networks have been increasingly used in real-world applications, making it critical to ensure their ability to adapt to new, unseen data. In this paper, we study the generalization capability of neural networks trained with (stochastic) gradient flow. We establish a new connection between the loss dynamics of gradient flow and general kernel machines by proposing a new kernel, called loss path kernel. This kernel measures the similarity between two data points by evaluating the agreement between loss gradients along the path determined by the gradient flow. Based on this connection, we derive a new generalization upper bound that applies to general neural network architectures. This new bound is tight and strongly correlated with the true generalization error. We apply our results to guide the design of neural architecture search (NAS) and demonstrate favorable performance compared with state-of-the-art NAS algorithms through numerical experiments.

artificial intelligence, generalization, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing SystemsApr-23-2026, 13:59:15 GMT

Kernel machines as well as neural networks possess universal function approximation properties. Nevertheless in practice their ways of choosing the appropriate function class differ. Specifically neural networks learn a representation by adapting their basis functions to the data and the task at hand, while kernel methods typically use a basis that is not adapted during training. In this work, we contrast random features of approximated kernel machines with learned features of neural networks. Our analysis reveals how these random and adaptive basis functions affect the quality of learning. Furthermore, we present basis adaptation schemes that allow for a more compact representation, while retaining the generalization properties of kernel machines.

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing SystemsMar-17-2026, 16:14:08 GMT

Kernel machines as well as neural networks possess universal function approximation properties. Nevertheless in practice their ways of choosing the appropriate function class differ. Specifically neural networks learn a representation by adapting their basis functions to the data and the task at hand, while kernel methods typically use a basis that is not adapted during training. In this work, we contrast random features of approximated kernel machines with learned features of neural networks. Our analysis reveals how these random and adaptive basis functions affect the quality of learning. Furthermore, we present basis adaptation schemes that allow for a more compact representation, while retaining the generalization properties of kernel machines.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)

Add feedback

Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate

Neural Information Processing SystemsMar-17-2026, 01:38:56 GMT

Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for ``overfitted'' / interpolated classifiers appears to be ubiquitous in high-dimensional data, having been observed in deep networks, kernel machines, boosting and random forests. Their performance is consistently robust even when the data contain large amounts of label noise. Very little theory is available to explain these observations. The vast majority of theoretical analyses of generalization allows for interpolation only when there is little or no label noise. This paper takes a step toward a theoretical foundation for interpolated classifiers by analyzing local interpolating schemes, including geometric simplicial interpolation algorithm and singularly weighted $k$-nearest neighbor schemes. Consistency or near-consistency is proved for these schemes in classification and regression problems.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams

Neural Information Processing SystemsMar-16-2026, 23:25:46 GMT

Algebraic topology methods have recently played an important role for statistical analysis with complicated geometric structured data such as shapes, linked twist maps, and material data. Among them, \textit{persistent homology} is a well-known tool to extract robust topological features, and outputs as \textit{persistence diagrams} (PDs). However, PDs are point multi-sets which can not be used in machine learning algorithms for vector data. To deal with it, an emerged approach is to use kernel methods, and an appropriate geometry for PDs is an important factor to measure the similarity of PDs. A popular geometry for PDs is the \textit{Wasserstein metric}.

artificial intelligence, kernel, machine learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

e0b6f389739496e363a89155c9448a8a-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 13:58:31 GMT

artificial intelligence, generalization, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Add feedback

883f66687a521536c505f9b2fbdcbf1e-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 10:18:14 GMT

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Adapting Tensor Kernel Machines to Enable Efficient Transfer Learning for Seizure Detection

de Rooij, Seline J. S., Hunyadi, Borbála

arXiv.org Artificial IntelligenceDec-3-2025

Transfer learning aims to optimize performance in a target task by learning from a related source problem. In this work, we propose an efficient transfer learning method using a tensor kernel machine. Our method takes inspiration from the adaptive SVM and hence transfers 'knowledge' from the source to the 'adapted' model via regularization. The main advantage of using tensor kernel machines is that they leverage low-rank tensor networks to learn a compact non-linear model in the primal domain. This allows for a more efficient adaptation without adding more parameters to the model. To demonstrate the effectiveness of our approach, we apply the adaptive tensor kernel machine (Adapt-TKM) to seizure detection on behind-the-ear EEG. By personalizing patient-independent models with a small amount of patient-specific data, the patient-adapted model (which utilizes the Adapt-TKM), achieves better performance compared to the patient-independent and fully patient-specific models. Notably, it is able to do so while requiring around 100 times fewer parameters than the adaptive SVM model, leading to a correspondingly faster inference speed. This makes the Adapt-TKM especially useful for resource-constrained wearable devices.

artificial intelligence, kernel machine, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2512.02626

Country: Europe > Austria (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

Add feedback

Filters

Collaborating Authors

kernel machine

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Support Vector Generation: Kernelizing Zero-Shot Classifiers from Pre-Trained Language Models

Fast Training of Large Kernel Models with Delayed Projections

Analyzing Generalization of Neural Networks through Loss Path Kernels

An Empirical Study on The Properties of Random Bases for Kernel Methods

An Empirical Study on The Properties of Random Bases for Kernel Methods

Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate

Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams

e0b6f389739496e363a89155c9448a8a-Paper-Conference.pdf

883f66687a521536c505f9b2fbdcbf1e-Paper-Conference.pdf

Adapting Tensor Kernel Machines to Enable Efficient Transfer Learning for Seizure Detection