AITopics | Statistical Learning

On Sample Optimality in Personalized Collaborative and Federated Learning

Neural Information Processing Systems | Apr-24-2026, 07:38:01 GMT

In personalized federated learning, each member of a potentially large set of agents aims to train a model minimizing its loss function averaged over its local data distribution. We study this problem through the lens of stochastic optimization, focusing on a scenario with a large number of agents, each of which possesses very few data samples from its local data distribution. Specifically, we prove novel matching lower and upper bounds on the number of samples required from all agents to approximately minimize the generalization error of a fixed agent. We provide strategies matching these lower bounds, based on a gradient filtering approach: given prior knowledge of some notion of distance between local data distributions, agents filter and aggregate stochastic gradients received from other agents in order to achieve an optimal bias-variance trade-off. Finally, we quantify the impact of using rough estimates of the distances between the agents' local distributions, based on a very small number of local samples.
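The gradient filtering idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `filtered_gradient`, the hard `threshold` rule, and the plain averaging are all assumptions chosen for illustration. A small threshold keeps only near-identical agents (low bias, high variance), while a large one averages over more agents (lower variance, more bias).

```python
from typing import List, Sequence


def filtered_gradient(
    grads: Sequence[Sequence[float]],
    distances: Sequence[float],
    threshold: float,
) -> List[float]:
    """Average the stochastic gradients of agents deemed close enough.

    grads[i]     -- stochastic gradient received from agent i (hypothetical input)
    distances[i] -- assumed known distance between agent i's local
                    distribution and the target agent's distribution
    threshold    -- hypothetical cut-off: agents farther than this are dropped
    """
    # Filtering step: keep only gradients from sufficiently similar agents.
    kept = [g for g, d in zip(grads, distances) if d <= threshold]
    if not kept:
        raise ValueError("no agent passes the distance filter")
    # Aggregation step: coordinate-wise mean of the retained gradients.
    dim = len(kept[0])
    return [sum(g[k] for g in kept) / len(kept) for k in range(dim)]


# The target agent (distance 0.0) and one close neighbour are averaged;
# the distant third agent is filtered out.
grads = [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]]
distances = [0.0, 0.5, 10.0]
print(filtered_gradient(grads, distances, threshold=1.0))  # [2.0, 1.0]
```
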

agent, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

01cea7793f3c68af2e4989fc66bf8fb0-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:37:58 GMT

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

054e9f9a286671ababa3213d6e59c1c2-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:37:51 GMT

artificial intelligence, generalization error, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.45)
North America (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Minimum Description Length and Generalization Guarantees for Representation Learning

Neural Information Processing Systems | Apr-24-2026, 07:37:47 GMT

A major challenge in designing efficient statistical supervised learning algorithms is finding representations that perform well not only on available training samples but also on unseen data. While the study of representation learning has spurred much interest, most existing approaches are heuristic, and very little is known about their theoretical generalization guarantees. For example, the information bottleneck method seeks good generalization by finding a minimal description of the input that is maximally informative about the label variable, where minimality and informativeness are both measured by Shannon's mutual information. In this paper, we establish a compressibility framework that allows us to derive upper bounds on the generalization error of a representation learning algorithm in terms of the "Minimum Description Length" (MDL) of the labels or the latent variables (representations). The related literature often takes the mutual information between the encoder's input and the representation to reflect the algorithm's generalization capability, but in fact it falls short of doing so. Our new bounds instead involve the "multi-letter" relative entropy between the distribution of the representations (or labels) of the training and test sets and a fixed prior.

artificial intelligence, generalization error, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.60)

Uncovering Neural Scaling Laws in Molecular Representation Learning

Neural Information Processing Systems | Apr-24-2026, 07:36:31 GMT

Molecular Representation Learning (MRL) has emerged as a powerful tool for drug and materials discovery in a variety of tasks such as virtual screening and inverse design. While there has been a surge of interest in advancing model-centric techniques, the influence of both data quantity and quality on molecular representations is not yet clearly understood within this field.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.46)
Europe > Austria (0.28)
Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Therapeutic Area > Immunology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

01025a4e79355bb37a10ba39605944b5-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:36:24 GMT

artificial intelligence, learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

0525a72df7fb2cd943c780d059b94774-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:35:55 GMT

artificial intelligence, machine learning, offline sgd, (19 more...)

Neural Information Processing Systems

Country: Europe > France (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

008079ec00eec9760ee93af5434ee932-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:17:16 GMT

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

Neural Information Processing Systems | Apr-24-2026, 07:16:37 GMT

Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run-time or the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data and then optimizes the weights using a novel quasi-Newton method. Our algorithm is a simple-to-implement, black-box method that does not require the user to specify a low-cost posterior approximation. It is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that our method provides significant improvements in coreset quality over alternatives with comparable construction times, with far less storage cost and user input required.
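As a rough illustration of the first stage described above, the sketch below draws a uniformly random subset and evaluates a weighted log-likelihood on it. The function names are hypothetical, and the initial weight N/M per point is an assumption made here so that the weighted sum is on the same scale as the full-data sum. The quasi-Newton weight refinement from the abstract is not reproduced.

```python
import random
from typing import List, Sequence, Tuple


def uniform_coreset(
    n_data: int, coreset_size: int, seed: int = 0
) -> Tuple[List[int], List[float]]:
    """Pick `coreset_size` distinct indices uniformly at random.

    Returns the sorted indices and an initial weight per selected point.
    The N/M initial weight is an illustrative assumption, not the
    paper's optimized weights.
    """
    if not 0 < coreset_size <= n_data:
        raise ValueError("coreset_size must be in 1..n_data")
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(n_data), coreset_size))
    weights = [n_data / coreset_size] * coreset_size
    return indices, weights


def weighted_log_likelihood(
    log_liks: Sequence[float], indices: Sequence[int], weights: Sequence[float]
) -> float:
    """Coreset approximation of the full-data sum of log-likelihood terms."""
    return sum(w * log_liks[i] for i, w in zip(indices, weights))


# With identical per-point terms, the weighted subset sum matches the full sum.
log_liks = [-1.0] * 1000
idx, w = uniform_coreset(n_data=1000, coreset_size=10, seed=1)
print(weighted_log_likelihood(log_liks, idx, w), sum(log_liks))
```

In a full method, the weights returned here would be the starting point for an optimization step rather than the final output.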

artificial intelligence, machine learning, theorem 4, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: