Luigi Carratino
Learning with SGD and Random Features
Luigi Carratino, Alessandro Rudi, Lorenzo Rosasco
Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and can be used to define approximate kernel methods. The considered estimator is not explicitly penalized/constrained, and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, the number of iterations, the step-size and the mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite sample bounds under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
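To make the construction concrete, below is a minimal Python sketch of the kind of estimator the abstract describes: data are mapped through random Fourier features (one common instance of random features, here assuming a Gaussian kernel) and a linear model is fit in the feature space by mini-batch SGD on the squared loss, with no explicit penalty. All function names and default parameters are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Map X (n x d) to M random Fourier features approximating a Gaussian kernel."""
    M = W.shape[1]
    return np.sqrt(2.0 / M) * np.cos(X @ W + b)

def sgd_random_features(X, y, M=100, n_epochs=5, batch_size=32, step_size=1.0, sigma=1.0, seed=0):
    """Mini-batch SGD on the least-squares loss in random-feature space.
    No explicit penalty: regularization comes implicitly from M, the step-size,
    the mini-batch size and the number of passes (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(d, M))   # random frequencies (Gaussian kernel)
    b = rng.uniform(0, 2 * np.pi, size=M)            # random phases
    w = np.zeros(M)                                   # linear weights in feature space
    for _ in range(n_epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Z = random_fourier_features(X[batch], W, b)
            grad = Z.T @ (Z @ w - y[batch]) / len(batch)
            w -= step_size * grad
    return w, W, b
```

Predictions on new points are obtained as random_fourier_features(X_test, W, b) @ w; in this setting the number of features, step-size, mini-batch size and number of passes play the role that an explicit penalty would otherwise play.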
On Fast Leverage Score Sampling and Optimal Learning
Alessandro Rudi, Daniele Calandriello, Luigi Carratino, Lorenzo Rosasco
Leverage score sampling provides an appealing way to perform approximate computations for large matrices. Indeed, it allows one to derive faithful approximations with a complexity adapted to the problem at hand. Yet, performing leverage score sampling is a challenge in its own right, requiring further approximations. In this paper, we study the problem of leverage score sampling for positive definite matrices defined by a kernel.
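As a reference point for what is being approximated, the Python sketch below computes exact ridge leverage scores of a kernel matrix and samples columns proportionally to them. The exact computation costs O(n^3) time and O(n^2) memory, which is precisely what fast approximate schemes aim to avoid; the function names and this brute-force formulation are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def ridge_leverage_scores(K, lam):
    """Exact ridge leverage scores of a PSD kernel matrix K:
    l_i = (K (K + lam*n*I)^{-1})_{ii}.
    O(n^3) reference computation; fast methods replace this with approximations."""
    n = K.shape[0]
    # By symmetry, diag((K + lam*n*I)^{-1} K) equals diag(K (K + lam*n*I)^{-1}).
    return np.diag(np.linalg.solve(K + lam * n * np.eye(n), K))

def leverage_score_sample(K, lam, m, seed=0):
    """Sample m column indices of K with probabilities proportional to leverage scores."""
    rng = np.random.default_rng(seed)
    scores = ridge_leverage_scores(K, lam)
    p = scores / scores.sum()
    return rng.choice(K.shape[0], size=m, replace=True, p=p)
```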
FALKON: An Optimal Large Scale Kernel Method
Alessandro Rudi, Luigi Carratino, Lorenzo Rosasco
Kernel methods provide a principled way to perform nonlinear, nonparametric learning. They rely on solid functional analytic foundations and enjoy optimal statistical properties. However, at least in their basic form, they have limited applicability in large-scale scenarios because of stringent computational requirements in terms of time and especially memory. In this paper, we take a substantial step in scaling up kernel methods, proposing FALKON, a novel algorithm that can efficiently process millions of points. FALKON is derived by combining several algorithmic principles, namely stochastic subsampling, iterative solvers and preconditioning. Our theoretical analysis shows that optimal statistical accuracy is achieved requiring essentially O(n) memory and O(n√n) time. An extensive experimental analysis on large-scale datasets shows that, even with a single machine, FALKON outperforms previous state-of-the-art solutions, which exploit parallel/distributed architectures.
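The Python sketch below illustrates the Nyström-subsampling plus iterative-solver structure the abstract refers to, using plain conjugate gradient and omitting FALKON's Cholesky-based preconditioner, which is essential to its stated complexity. It is a simplified illustration under a Gaussian-kernel assumption, not the FALKON algorithm itself, and all names are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(d, 0) / (2 * sigma**2))

def nystrom_krr_cg(X, y, m=500, lam=1e-6, sigma=1.0, maxiter=20, seed=0):
    """Nystrom kernel ridge regression solved iteratively (simplified sketch):
    solve (K_nm^T K_nm + lam*n*K_mm) alpha = K_nm^T y by conjugate gradient.
    FALKON additionally uses a preconditioner built from the m sampled points."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centers = X[rng.choice(n, size=min(m, n), replace=False)]  # stochastic subsampling
    K_nm = gaussian_kernel(X, centers, sigma)
    K_mm = gaussian_kernel(centers, centers, sigma)
    matvec = lambda v: K_nm.T @ (K_nm @ v) + lam * n * (K_mm @ v)
    op = LinearOperator((centers.shape[0],) * 2, matvec=matvec)
    alpha, _ = cg(op, K_nm.T @ y, maxiter=maxiter)               # iterative solver
    predict = lambda Xt: gaussian_kernel(Xt, centers, sigma) @ alpha
    return alpha, predict
```

The matrix-free LinearOperator keeps memory at O(nm) for the feature block rather than O(n^2) for the full kernel matrix, which reflects the memory saving the abstract alludes to; the missing preconditioning step is what controls the number of iterations in the actual method.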