kernel value
- Asia > Pakistan > Punjab > Lahore Division > Lahore (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey (0.04)
Latent Support Measure Machines for Bag-of-Words Data Classification
Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada
In many classification problems, the input is represented as a set of features, e.g., the bag-of-words (BoW) representation of documents. Support vector machines (SVMs) are widely used tools for such classification problems. The performance of SVMs is generally determined by whether kernel values between data points can be defined properly. However, SVMs for BoW representations have a major weakness in that the co-occurrence of different but semantically similar words cannot be reflected in the kernel calculation. To overcome this weakness, we propose a kernel-based discriminative classifier for BoW data, which we call the latent support measure machine (latent SMM). With the latent SMM, a latent vector is associated with each vocabulary term, and each document is represented as a distribution of the latent vectors for words appearing in the document. To represent the distributions efficiently, we use the kernel embeddings of distributions, which hold high-order moment information about distributions. Then the latent SMM finds a separating hyperplane that maximizes the margins between distributions of different classes while estimating latent vectors for words to improve the classification performance. In the experiments, we show that the latent SMM achieves state-of-the-art accuracy for BoW text classification, is robust with respect to its own hyper-parameters, and is useful for visualizing words.
- Asia > Middle East > Jordan (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
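The kernel-embedding idea in the latent SMM abstract above can be illustrated with a minimal sketch. It assumes an RBF kernel on the latent word vectors and uses a hand-picked toy `latent` matrix; the function names, the kernel choice, and the values are assumptions for illustration, not the authors' implementation, and the sketch does not learn the latent vectors.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a (n, q) and the rows of b (m, q)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def embedding_kernel(doc_a, doc_b, latent, gamma=1.0):
    """Inner product of the empirical kernel mean embeddings of two documents.

    doc_a, doc_b: lists of word indices, one entry per token.
    latent: (vocab_size, q) array of latent word vectors (hypothetical values).
    """
    xa = latent[np.asarray(doc_a)]
    xb = latent[np.asarray(doc_b)]
    # Averaging the pairwise kernel values gives <mu_a, mu_b> in the RKHS.
    return rbf_kernel(xa, xb, gamma).mean()

# Toy vocabulary: words 0 and 1 are close in latent space, word 2 is far away.
latent = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]])
doc_a, doc_b, doc_c = [0, 0], [1, 1], [2, 2]

k_ab = embedding_kernel(doc_a, doc_b, latent)  # close to 1: similar words
k_ac = embedding_kernel(doc_a, doc_c, latent)  # close to 0: dissimilar words
print(k_ab, k_ac)
```

None of the three documents share a word, so a linear kernel on raw BoW counts would be zero for every pair, whereas the embedding kernel still separates the similar pair from the dissimilar one.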
Reviews: Space and Time Efficient Kernel Density Estimation in High Dimensions
Overall, this is an average paper, but it is clearly written. This paper proposes an improvement of Charikar's approach to achieve sublinear kernel density estimation with linear space and linear time preprocessing. Experimental results focus mainly on the Laplacian kernel (L1 variant in the main submission and L2 variant added in the supplement). The key observation for achieving linear space is to modify the previous HBE approach so that each hash table stores each point in the dataset with constant probability; in this way, the superlinear storage cost is overcome. However, my main complaint is in the experimental results.
Efficient Approximation Algorithms for Strings Kernel Based Sequence Classification
Muhammad Farhan, Juvaria Tariq, Arif Zaman, Mudassir Shabbir, Imdad Ullah Khan
Sequence classification algorithms, such as SVM, require a definition of a distance (similarity) measure between two sequences. A commonly used notion of similarity is the number of matches between k-mers (k-length subsequences) in the two sequences. Extending this definition, by considering two k-mers to match if their distance is at most m, yields better classification performance. This, however, makes the problem computationally much more complex. Known algorithms to compute this similarity have computational complexity that renders them applicable only for small values of k and m.
- Asia > Pakistan > Punjab > Lahore Division > Lahore (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
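The similarity described in the string-kernel abstract above can be written down directly as a brute-force count. This sketch assumes that "distance" means Hamming distance and that k-mers are contiguous substrings; it is the naive exact computation, not the approximation algorithm the paper proposes, and the function names are hypothetical.

```python
def kmers(seq, k):
    """All contiguous substrings of length k, in order of appearance."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def mismatch_similarity(s, t, k, m):
    """Count pairs of k-mers (one from s, one from t) at Hamming distance <= m."""
    return sum(
        1
        for a in kmers(s, k)
        for b in kmers(t, k)
        if hamming(a, b) <= m
    )

print(mismatch_similarity("ACGT", "ACCT", k=3, m=0))  # exact matches only
print(mismatch_similarity("ACGT", "ACCT", k=3, m=1))  # one mismatch allowed
```

The double loop compares every pair of k-mers and spends O(k) on each comparison, so the cost grows quickly with sequence length; this gives a feel for why exact computation becomes expensive, although the algorithms the abstract refers to are more refined than this one.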
In Search of Quantum Advantage: Estimating the Number of Shots in Quantum Kernel Methods
Miroszewski, Artur, Asiani, Marco Fellous, Mielczarek, Jakub, Saux, Bertrand Le, Nalepa, Jakub
Quantum Machine Learning (QML) has gathered significant attention through approaches like Quantum Kernel Machines. While these methods hold considerable promise, their quantum nature presents inherent challenges. One major challenge is the limited resolution of estimated kernel values caused by the finite number of circuit runs performed on a quantum device. In this study, we propose a comprehensive system of rules and heuristics for estimating the required number of circuit runs in quantum kernel methods. We introduce two critical effects that necessitate an increased measurement precision through additional circuit runs: the spread effect and the concentration effect. The effects are analyzed in the context of fidelity and projected quantum kernels. To address these phenomena, we develop an approach for estimating the desired precision of kernel values, which, in turn, is translated into the number of circuit runs. Our methodology is validated through extensive numerical simulations, focusing on the problem of exponential value concentration. We stress that quantum kernel methods should be considered not only from the perspective of machine learning performance, but also from the perspective of resource consumption. The results provide insights into the possible benefits of quantum kernel methods, offering guidance for their application in quantum machine learning tasks.
- Europe > Poland > Masovia Province > Warsaw (0.04)
- North America > United States > Indiana (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Energy (0.68)
- Information Technology (0.46)
- Health & Medicine (0.46)
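The link between precision and circuit runs mentioned in the abstract above can be illustrated with a textbook binomial argument. The sketch assumes a kernel value is estimated as the fraction of "success" outcomes over a number of shots, so its standard error is sqrt(k(1 - k) / shots); this is a generic sampling bound, not the system of rules and heuristics proposed in the paper, and the function names are hypothetical.

```python
import math
import numpy as np

def shots_for_precision(kernel_value, eps):
    """Shots needed so the binomial standard error of the estimate is at most eps."""
    return math.ceil(kernel_value * (1.0 - kernel_value) / eps**2)

def estimate_kernel(kernel_value, shots, rng):
    """Simulate a finite-shot estimate of a kernel value in [0, 1]."""
    return rng.binomial(shots, kernel_value) / shots

rng = np.random.default_rng(0)
true_value = 0.5
shots = shots_for_precision(true_value, eps=0.01)  # roughly 2500 shots
print(shots, estimate_kernel(true_value, shots, rng))
```

Because the required number of shots scales as 1/eps^2, halving the target resolution quadruples the number of circuit runs under this simple model.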
QUACK: Quantum Aligned Centroid Kernel
Tscharke, Kilian, Issel, Sebastian, Debus, Pascal
Quantum computing (QC) seems to show potential for application in machine learning (ML). In particular, quantum kernel methods (QKM) exhibit promising properties for use in supervised ML tasks. However, a major disadvantage of kernel methods is their unfavorable quadratic scaling with the number of training samples. Together with the limits imposed by currently available quantum hardware (NISQ devices) with their low qubit coherence times, small number of qubits, and high error rates, the use of QC in ML at an industrially relevant scale is currently impossible. As a small step in improving the potential applications of QKMs, we introduce QUACK, a quantum kernel algorithm whose time complexity scales linearly with the number of samples during training, and is independent of the number of training samples in the inference stage. In the training process, only the kernel entries for the samples and the centers of the classes are calculated, i.e., the maximum shape of the kernel for n samples and c classes is (n, c). During training, the parameters of the quantum kernel and the positions of the centroids are optimized iteratively. In the inference stage, for every new sample the circuit is only evaluated for every centroid, i.e., c times. We show that the QUACK algorithm nevertheless provides satisfactory results and can perform at a similar level as classical kernel methods with quadratic scaling during training. In addition, our (simulated) algorithm is able to handle high-dimensional datasets such as MNIST with 784 features without any dimensionality reduction.
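The (n, c) kernel shape described in the QUACK abstract can be seen in a purely classical stand-in. The sketch below assumes an RBF kernel in place of the quantum kernel and fixes the centroids at the class means; QUACK itself optimizes both the kernel parameters and the centroid positions, which this illustration does not attempt. All names are hypothetical.

```python
import numpy as np

def centroid_kernel(x, centroids, gamma=1.0):
    """Kernel matrix of shape (n, c) between n samples and c class centroids."""
    sq_dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def predict(x, centroids, gamma=1.0):
    """Assign each sample to the centroid with the largest kernel value."""
    return centroid_kernel(x, centroids, gamma).argmax(axis=1)

rng = np.random.default_rng(0)
# Two well-separated toy classes in 2D.
x0 = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(50, 2))
x1 = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(50, 2))
x_train = np.vstack([x0, x1])
y_train = np.array([0] * 50 + [1] * 50)

# Assumption: centroids are simply the class means (no iterative optimization).
centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in (0, 1)])

k_train = centroid_kernel(x_train, centroids)
print(k_train.shape)  # n rows, c columns: linear in n rather than quadratic
print((predict(x_train, centroids) == y_train).mean())
```

Inference for a new sample needs only c kernel evaluations, one per centroid, regardless of how many training samples were used.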
An Exact Kernel Equivalence for Finite Classification Models
Bell, Brian, Geyer, Michael, Glickenstein, David, Fernandez, Amanda, Moore, Juston
We explore the equivalence between neural networks and kernel methods by deriving the first exact representation of any finite-size parametric classification model trained with gradient descent as a kernel machine. We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK and other non-exact path kernel formulations. We experimentally demonstrate that the kernel can be computed for realistic networks up to machine precision. We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks, particularly the way in which they generalize.
- North America > United States > Arizona (0.04)
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Energy (0.68)
- Government > Regional Government > North America Government > United States Government (0.46)
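The idea of rewriting a gradient-descent-trained model as a kernel machine, as in the abstract above, is easiest to see in the simplest case. The sketch assumes a linear model f(x) = w·x with squared loss, where the tangent kernel K(x, x') = x·x' does not change during training, so the kernel expansion is exact. It is not the paper's construction for general finite networks, and all names are hypothetical.

```python
import numpy as np

def gd_vs_kernel(seed=0, lr=0.01, steps=50):
    """Train a linear model by gradient descent and rebuild its test
    prediction as an initial prediction plus a kernel-weighted sum."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(20, 3))
    y = rng.normal(size=20)
    x_test = rng.normal(size=3)

    w0 = rng.normal(size=3)
    w = w0.copy()
    k_test = X @ x_test  # K(x_i, x_test) = x_i . x_test for every training point

    kernel_sum = 0.0
    for _ in range(steps):
        residual = X @ w - y             # dL/df_i for L = 0.5 * sum_i (f_i - y_i)^2
        kernel_sum += residual @ k_test  # sum_i (dL/df_i) * K(x_i, x_test)
        w -= lr * (X.T @ residual)       # plain gradient descent step

    f_direct = w @ x_test
    f_kernel = w0 @ x_test - lr * kernel_sum
    return f_direct, f_kernel

f_direct, f_kernel = gd_vs_kernel()
print(f_direct, f_kernel)  # the two values agree up to floating-point error
```

For nonlinear networks the tangent kernel changes along the training path, which is why an exact representation has to account for the whole path rather than a single fixed kernel.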
IAN: Iterated Adaptive Neighborhoods for manifold learning and dimensionality estimation
Dyballa, Luciano, Zucker, Steven W.
Invoking the manifold assumption in machine learning requires knowledge of the manifold's geometry and dimension, and theory dictates how many samples are required. However, in applications data are limited, sampling may not be uniform, and manifold properties are unknown and (possibly) non-pure; this implies that neighborhoods must adapt to the local structure. We introduce an algorithm for inferring adaptive neighborhoods for data given by a similarity kernel. Starting with a locally-conservative neighborhood (Gabriel) graph, we sparsify it iteratively according to a weighted counterpart. In each step, a linear program yields minimal neighborhoods globally and a volumetric statistic reveals neighbor outliers likely to violate manifold geometry. We apply our adaptive neighborhoods to non-linear dimensionality reduction, geodesic computation and dimension estimation. A comparison against standard algorithms using, e.g., k-nearest neighbors, demonstrates their usefulness. Code for our algorithm will be available at https://github.com/dyballa/IAN
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Health & Medicine (0.67)
- Education (0.40)
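The Gabriel graph that the IAN abstract names as its starting point can be built with a short brute-force routine. This sketch assumes points in Euclidean space and the usual definition (two points are joined when no third point lies strictly inside the ball whose diameter is the segment between them); it covers only the unweighted starting graph, not IAN's iterative sparsification, and the function name is hypothetical.

```python
import numpy as np

def gabriel_graph(points):
    """Return the Gabriel graph edges of a point set as a set of (i, j), i < j."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Squared Euclidean distances between all pairs of points.
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # k lies strictly inside the ball with diameter (i, j)
            # exactly when d2[i, k] + d2[j, k] < d2[i, j].
            blocked = any(
                d2[i, k] + d2[j, k] < d2[i, j]
                for k in range(n)
                if k != i and k != j
            )
            if not blocked:
                edges.add((i, j))
    return edges

# Point 2 sits almost on the segment between points 0 and 1, so it blocks that edge.
print(gabriel_graph([(0.0, 0.0), (2.0, 0.0), (1.0, 0.1)]))
```

The triple loop is O(n^3) and is meant only to make the definition concrete; points lying exactly on the boundary of the ball are treated as non-blocking here, which is one of several conventions.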
Exponential concentration and untrainability in quantum kernel methods
Thanasilp, Supanut, Wang, Samson, Cerezo, M., Holmes, Zoë
Kernel methods in Quantum Machine Learning (QML) have recently gained significant attention as a potential candidate for achieving a quantum advantage in data analysis. Among other attractive properties, when training a kernel-based model one is guaranteed to find the optimal model's parameters due to the convexity of the training landscape. However, this is based on the assumption that the quantum kernel can be efficiently obtained from quantum hardware. In this work we study the trainability of quantum kernels from the perspective of the resources needed to accurately estimate kernel values. We show that, under certain conditions, values of quantum kernels over different input data can be exponentially concentrated (in the number of qubits) towards some fixed value, leading to an exponential scaling of the number of measurements required for successful training. We identify four sources that can lead to concentration: the expressibility of data embedding, global measurements, entanglement and noise. For each source, an associated concentration bound of quantum kernels is analytically derived. Lastly, we show that when dealing with classical data, training a parametrized data embedding with a kernel alignment method is also susceptible to exponential concentration. Our results are verified through numerical simulations for several QML tasks. Altogether, we provide guidelines indicating that certain features should be avoided to ensure the efficient evaluation and the trainability of quantum kernel methods.
- Asia > Singapore (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Energy (0.46)
- Government (0.45)
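The concentration effect described in the abstract above can be reproduced numerically in an idealized setting. The sketch assumes that a highly expressive embedding maps inputs to approximately Haar-random pure states, which is an extreme case chosen for illustration rather than one of the paper's experiments; in that case the fidelity kernel between two independent states has mean 1/2^n for n qubits. Function names are hypothetical.

```python
import numpy as np

def random_state(n_qubits, rng):
    """Sample a Haar-random pure state as a normalized complex Gaussian vector."""
    dim = 2 ** n_qubits
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def fidelity_stats(n_qubits, n_pairs, rng):
    """Mean and standard deviation of |<psi|phi>|^2 over random state pairs."""
    vals = np.array([
        abs(np.vdot(random_state(n_qubits, rng), random_state(n_qubits, rng))) ** 2
        for _ in range(n_pairs)
    ])
    return vals.mean(), vals.std()

rng = np.random.default_rng(0)
for n in (2, 4, 6, 8):
    mean, std = fidelity_stats(n, 2000, rng)
    print(n, mean, std)  # both shrink roughly like 1 / 2**n
```

Since both the mean and the spread of the kernel values shrink exponentially with the number of qubits, telling two such values apart requires a correspondingly fine measurement resolution, and hence many more shots.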