AITopics

2504.19835

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Automobiles & Trucks > Manufacturer (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

arXiv.org Machine LearningJun-11-2025

Federated Learning: From Theory to Practice

Jung, A.

This book offers a hands-on introduction to building and understanding federated learning (FL) systems. FL enables multiple devices -- such as smartphones, sensors, or local computers -- to collaboratively train machine learning (ML) models, while keeping their data private and local. It is a powerful solution when data cannot or should not be centralized due to privacy, regulatory, or technical reasons. The book is designed for students, engineers, and researchers who want to learn how to design scalable, privacy preserving FL systems. Our main focus is on personalization: enabling each device to train its own model while still benefiting from collaboration with relevant devices. This is achieved by leveraging similarities between (the learning tasks associated with) devices that are encoded by the weighted edges (or links) of a federated learning network (FL network). The key idea is to represent real-world FL systems as networks of devices, where nodes correspond to device and edges represent communication links and data similarities between them. The training of personalized models for these devices can be naturally framed as a distributed optimization problem. This optimization problem is referred to as generalized total variation minimization (GTVMin) and ensures that devices with similar learning tasks learn similar model parameters. Our approach is both mathematically principled and practically motivated. While we introduce some advanced ideas from optimization theory and graph-based learning, we aim to keep the book accessible. Readers are guided through the core ideas step by step, with intuitive explanations.

artificial intelligence, cambridge university press, machine learning, (20 more...)

2505.19183

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Oceania > Australia (0.27)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
(27 more...)

Genre:

Summary/Review (1.00)
Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.92)
(3 more...)

arXiv.org Artificial IntelligenceJun-10-2025

Improvement of Optimization using Learning Based Models in Mixed Integer Linear Programming Tasks

Wang, Xiaoke, Altundas, Batuhan, Li, Zhaoxin, Zhao, Aaron, Gombolay, Matthew

-- Mixed Integer Linear Programs (MILPs) are essential tools for solving planning and scheduling problems across critical industries such as construction, manufacturing, and logistics. However, their widespread adoption is limited by long computational times, especially in large-scale, real-time scenarios. T o address this, we present a learning-based framework that leverages Behavior Cloning (BC) and Reinforcement Learning (RL) to train Graph Neural Networks (GNNs), producing high-quality initial solutions for warm-starting MILP solvers in Multi-Agent T ask Allocation and Scheduling Problems. Experimental results demonstrate that our method reduces optimization time and variance compared to traditional techniques while maintaining solution quality and feasibility. I. INTRODUCTION Mixed Integer Linear Programs (MILPs) serve as a fundamental framework for combinatorial optimization problems, facilitating solutions across a wide range of planning and scheduling tasks in logistics [1], construction [2] and manufacturing [3].

artificial intelligence, milp solver, planning & scheduling, (15 more...)

2506.06291

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.47)
Construction & Engineering (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Boquet-Pujadas, Aleix, Pla, Pol del Aguila, Unser, Michael

Sensitivity-Aware Density Estimation in Multiple Dimensions

arXiv.org Artificial IntelligenceJun-4-2025

We formulate an optimization problem to estimate probability densities in the context of multidimensional problems that are sampled with uneven probability. It considers detector sensitivity as an heterogeneous density and takes advantage of the computational speed and flexible boundary conditions offered by splines on a grid. We choose to regularize the Hessian of the spline via the nuclear norm to promote sparsity. As a result, the method is spatially adaptive and stable against the choice of the regularization parameter, which plays the role of the bandwidth. We test our computational pipeline on standard densities and provide software. We also present a new approach to PET rebinning as an application of our framework.

artificial intelligence, density estimation, machine learning, (17 more...)

doi: 10.1109/TPAMI.2024.3388370

2506.02323

Country:

Europe (0.68)
North America > United States (0.28)

Genre:

Personal (0.67)
Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Dubey, Akshat, Anžel, Aleksandar, Hattab, Georges

Surrogate Interpretable Graph for Random Decision Forests

arXiv.org Artificial IntelligenceJun-4-2025

The field of health informatics has been profoundly influenced by the development of random forest models, which have led to significant advances in the interpretability of feature interactions. These models are characterized by their robustness to overfitting and parallelization, making them particularly useful in this domain. However, the increasing number of features and estimators in random forests can prevent domain experts from accurately interpreting global feature interactions, thereby compromising trust and regulatory compliance. A method called the surrogate interpretability graph has been developed to address this issue. It uses graphs and mixed-integer linear programming to analyze and visualize feature interactions. This improves their interpretability by visualizing the feature usage per decision-feature-interaction table and the most dominant hierarchical decision feature interactions for predictions. The implementation of a surrogate interpretable graph enhances global interpretability, which is critical for such a high-stakes domain.

artificial intelligence, machine learning, optimization problem, (18 more...)

2506.01988

Country: Europe > Germany (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.71)
(2 more...)

arXiv.org Artificial IntelligenceJun-4-2025

EVA-MILP: Towards Standardized Evaluation of MILP Instance Generation

Luo, Yidong, Wang, Chenguang, Yang, Jiahao, Xia, Fanzeng, Yu, Tianshu

Mixed-Integer Linear Programming (MILP) is fundamental to solving complex decision-making problems. The proliferation of MILP instance generation methods, driven by machine learning's demand for diverse optimization datasets and the limitations of static benchmarks, has significantly outpaced standardized evaluation techniques. Consequently, assessing the fidelity and utility of synthetic MILP instances remains a critical, multifaceted challenge. This paper introduces a comprehensive benchmark framework designed for the systematic and objective evaluation of MILP instance generation methods. Our framework provides a unified and extensible methodology, assessing instance quality across crucial dimensions: mathematical validity, structural similarity, computational hardness, and utility in downstream machine learning tasks. A key innovation is its in-depth analysis of solver-internal features -- particularly by comparing distributions of key solver outputs including root node gap, heuristic success rates, and cut plane usage -- leveraging the solver's dynamic solution behavior as an `expert assessment' to reveal nuanced computational resemblances. By offering a structured approach with clearly defined solver-independent and solver-dependent metrics, our benchmark aims to facilitate robust comparisons among diverse generation techniques, spur the development of higher-quality instance generators, and ultimately enhance the reliability of research reliant on synthetic MILP data. The framework's effectiveness in systematically comparing the fidelity of instance sets is demonstrated using contemporary generative models.

artificial intelligence, machine learning, optimization problem, (18 more...)

2505.24779

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Compagnoni, Enea Monzio, Islamov, Rustem, Orvieto, Antonio, Gorbunov, Eduard

On the Interaction of Noise, Compression Role, and Adaptivity under $(L_0, L_1)$-Smoothness: An SDE-based Approach

arXiv.org Machine LearningJun-3-2025

Using stochastic differential equation (SDE) approximations, we study the dynamics of Distributed SGD, Distributed Compressed SGD, and Distributed SignSGD under $(L_0,L_1)$-smoothness and flexible noise assumptions. Our analysis provides insights -- which we validate through simulation -- into the intricate interactions between batch noise, stochastic gradient compression, and adaptivity in this modern theoretical setup. For instance, we show that \textit{adaptive} methods such as Distributed SignSGD can successfully converge under standard assumptions on the learning rate scheduler, even under heavy-tailed noise. On the contrary, Distributed (Compressed) SGD with pre-scheduled decaying learning rate fails to achieve convergence, unless such a schedule also accounts for an inverse dependency on the gradient norm -- de facto falling back into an adaptive method.

artificial intelligence, international conference, machine learning, (15 more...)

2506.00181

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Alden, Andrew, Horvath, Blanka, Issa, Zacharia

Signature Maximum Mean Discrepancy Two-Sample Statistical Tests

arXiv.org Machine LearningJun-3-2025

Maximum Mean Discrepancy (MMD) is a widely used concept in machine learning research which has gained popularity in recent years as a highly effective tool for comparing (finite-dimensional) distributions. Since it is designed as a kernel-based method, the MMD can be extended to path space valued distributions using the signature kernel. The resulting signature MMD (sig-MMD) can be used to define a metric between distributions on path space. Similarly to the original use case of the MMD as a test statistic within a two-sample testing framework, the sig-MMD can be applied to determine if two sets of paths are drawn from the same stochastic process. This work is dedicated to understanding the possibilities and challenges associated with applying the sig-MMD as a statistical tool in practice. We introduce and explain the sig-MMD, and provide easily accessible and verifiable examples for its practical use. We present examples that can lead to Type 2 errors in the hypothesis test, falsely indicating that samples have been drawn from the same underlying process (which generally occurs in a limited data setting). We then present techniques to mitigate the occurrence of this type of error.

artificial intelligence, machine learning, probability, (17 more...)

2506.01718

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.36)

Neural Information Processing SystemsMay-31-2025, 11:11:35 GMT

Active Labeling: Streaming Stochastic Gradients

The workhorse of machine learning is stochastic gradient descent.To access stochastic gradients, it is common to consider iteratively input/output pairs of a training dataset.Interestingly, it appears that one does not need full supervision to access stochastic gradients, which is the main motivation of this paper.After formalizing the "active labeling" problem, which focuses on active learning with partial supervision, we provide a streaming technique that provably minimizes the ratio of generalization error over the number of samples.We illustrate our technique in depth for robust regression.

access stochastic gradient, streaming stochastic gradient, supervision

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Gabriel, Franck, Ged, François, Veiga, Maria Han, Schertzer, Emmanuel

Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

arXiv.org Machine LearningMay-30-2025

Diffusion models now set the benchmark in high-fidelity generative sampling, yet they can, in principle, be prone to memorization. In this case, their learned score overfits the finite dataset so that the reverse-time SDE samples are mostly training points. In this paper, we interpret the empirical score as a noisy version of the true score and show that its covariance matrix is asymptotically a re-weighted data PCA. In large dimension, the small time limit makes the noise variance blow up while simultaneously reducing spatial correlation. To reduce this variance, we introduce a kernel-smoothed empirical score and analyze its bias-variance trade-off. We derive asymptotic bounds on the Kullback-Leibler divergence between the true distribution and the one generated by the modified reverse SDE. Regularization on the score has the same effect as increasing the size of the training dataset, and thus helps prevent memorization. A spectral decomposition of the forward diffusion suggests better variance control under some regularity conditions of the true data distribution. Reverse diffusion with kernel-smoothed empirical score can be reformulated as a gradient descent drifted toward a Log-Exponential Double-Kernel Density Estimator (LED-KDE). This perspective highlights two regularization mechanisms taking place in denoising diffusions: an initial Gaussian kernel first diffuses mass isotropically in the ambient space, while a second kernel applied in score space concentrates and spreads that mass along the data manifold. Hence, even a straightforward regularization-without any learning-already mitigates memorization and enhances generalization. Numerically, we illustrate our results with several experiments on synthetic and MNIST datasets.

artificial intelligence, empirical score, machine learning, (18 more...)

2505.22841

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)