Gaussian kernel





But How Does It Work in Theory? Linear SVM with Random Features

Yitong Sun, Anna Gilbert, Ambuj Tewari

Neural Information Processing Systems

The random features method, proposed by Rahimi and Recht [2008], maps the data to a finite-dimensional feature space as a random approximation to the feature space of RBF kernels. With explicit finite-dimensional feature vectors available, the original kernel support vector machine (KSVM) is converted to a linear support vector machine (LSVM), which can be trained by faster algorithms (Shalev-Shwartz et al.
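As an illustration of the random features idea described above, the following is a minimal sketch of random Fourier features for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). It is not taken from the paper; the names `make_rff`, `rbf`, `gamma`, and `n_features` are assumptions chosen for this example. The inner product of two feature vectors should approximate the exact kernel value, with an error that shrinks as the number of features grows.

```python
import math
import random


def make_rff(dim, n_features, gamma, seed=0):
    """Return a map z such that z(x) . z(y) ~ exp(-gamma * ||x - y||^2).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma * I), the spectral
    # density of the RBF kernel exp(-gamma * ||x - y||^2).
    std = math.sqrt(2.0 * gamma)
    freqs = [[rng.gauss(0.0, std) for _ in range(dim)]
             for _ in range(n_features)]
    # Random phases, uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def z(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return z


def rbf(x, y, gamma):
    """Exact RBF kernel value, used only for comparison."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, y)))


z = make_rff(dim=3, n_features=5000, gamma=0.5)
x = [0.1, 0.2, 0.3]
y = [0.4, 0.0, -0.2]
approx = sum(p * q for p, q in zip(z(x), z(y)))  # linear inner product
exact = rbf(x, y, gamma=0.5)
print(f"approx={approx:.3f} exact={exact:.3f}")
```

A linear SVM trained on the vectors `z(x)` then stands in for the kernel SVM; the quality of that substitution is the subject of the paper's analysis.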





GEQ: Gaussian Kernel Inspired Equilibrium Models

Neural Information Processing Systems

Optimization-induced deep equilibrium models (OptEqs) establish a connection between their output and an underlying hidden optimization problem, yet they and related models still underperform deep networks. One key factor behind this limitation is the use of linear kernels to extract features in these models. To address this issue, we replace the linear kernel with a function that can readily capture nonlinear feature dependencies in the input data. Drawing inspiration from classical machine learning algorithms, we introduce Gaussian kernels as the alternative function and propose a new equilibrium model, which we refer to as GEQ. By leveraging Gaussian kernels, GEQ can effectively extract the nonlinear information embedded within the input features, surpassing the performance of the original OptEqs. Moreover, GEQ can be perceived as a weight-tied neural network with infinite width and depth. GEQ also enjoys better theoretical properties and improved overall performance. Additionally, GEQ exhibits enhanced stability when confronted with various samples. We further substantiate the effectiveness and stability of GEQ through a series of comprehensive experiments.


Curriculum By Smoothing

Neural Information Processing Systems

Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation. Moreover, recent work in Generative Adversarial Networks (GANs) has highlighted the importance of learning by progressively increasing the difficulty of a learning task (Karras et al.). When learning a network from scratch, the information propagated within the network during the earlier stages of training can contain distortion artifacts due to noise, which can be detrimental to training. In this paper, we propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters. We propose to augment the training of CNNs by controlling the amount of high-frequency information propagated within the CNNs as training progresses, by convolving the output of a CNN feature map of each layer with a Gaussian kernel. By decreasing the variance of the Gaussian kernel, we gradually increase the amount of high-frequency information available within the network for inference. As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data. Our proposed augmented training scheme significantly improves the performance of CNNs on various vision tasks without adding either additional trainable parameters or an auxiliary regularization objective. The generality of our method is demonstrated through empirical performance gains in CNN architectures across four different tasks: transfer learning, cross-task transfer learning, and generative models.
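The effect of shrinking the Gaussian kernel can be sketched on a toy example. The code below is not the authors' implementation: it smooths a single 1-D row (standing in for one row of a feature map) rather than a 2-D CNN activation, uses wrap-around padding for simplicity, and the sigma schedule `[2.0, 1.0, 0.5]` is a hypothetical choice. It measures how much of the highest-frequency (alternating) component survives smoothing at each stage.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights, truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(row, sigma):
    """Convolve a row with a Gaussian kernel, using wrap-around padding."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(row)
    return [sum(k * row[(i + j - radius) % n] for j, k in enumerate(kernel))
            for i in range(n)]


def nyquist_amplitude(row):
    """Amplitude of the alternating (+, -, +, -) component of the row."""
    return abs(sum(v if i % 2 == 0 else -v for i, v in enumerate(row))) / len(row)


# Toy "feature row": a slow sine wave plus a fast alternating pattern.
n = 64
feature_row = [math.sin(2.0 * math.pi * 2 * i / n) + (0.5 if i % 2 else -0.5)
               for i in range(n)]

# Hypothetical curriculum: sigma decreases as training progresses.
sigmas = [2.0, 1.0, 0.5]
high_freq = [nyquist_amplitude(smooth(feature_row, s)) for s in sigmas]
for s, amp in zip(sigmas, high_freq):
    print(f"sigma={s}: surviving high-frequency amplitude = {amp:.4f}")
```

With a large sigma the alternating component is almost entirely filtered out; as sigma decreases, more of it passes through, which is the progression the abstract describes.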


Concentration bounds for intrinsic dimension estimation using Gaussian kernels

Andersson, Martin

arXiv.org Machine Learning

We prove finite-sample concentration and anti-concentration bounds for dimension estimation using Gaussian kernel sums. Our bounds provide explicit dependence on sample size, bandwidth, and local geometric and distributional parameters, characterizing precisely how regularity conditions govern statistical performance. We also propose a bandwidth selection heuristic using derivative information, which shows promise in numerical experiments.
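To make the idea of dimension estimation from Gaussian kernel sums concrete, here is a minimal sketch under stated assumptions. It relies on the standard scaling heuristic that, near a point on a d-dimensional set, the kernel sum S(h) grows roughly like h^d for small bandwidth h, so d can be read off from the slope of log S against log h. This is not necessarily the estimator analyzed in the paper; the function names and the bandwidths `h1` and `h2` are hypothetical.

```python
import math
import random


def kernel_sum(query, points, h):
    """Sum of Gaussian kernel values exp(-||query - p||^2 / (2 h^2))."""
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, p))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total


def estimate_dimension(query, points, h1, h2):
    """Slope of log S(h) versus log h between two bandwidths."""
    s1 = kernel_sum(query, points, h1)
    s2 = kernel_sum(query, points, h2)
    return math.log(s2 / s1) / math.log(h2 / h1)


# Synthetic data: a 2-D unit square embedded in 5-D space.
rng = random.Random(0)
points = [[rng.random(), rng.random(), 0.0, 0.0, 0.0] for _ in range(20000)]
query = [0.5, 0.5, 0.0, 0.0, 0.0]  # interior point, far from the boundary

d_hat = estimate_dimension(query, points, h1=0.05, h2=0.1)
print(f"estimated intrinsic dimension: {d_hat:.2f}")
```

The estimate depends on both bandwidths being small relative to the local geometry yet large enough to capture many samples, which is the kind of trade-off that explicit bounds in sample size and bandwidth are meant to quantify.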