AITopics

Federated Continual Learning (FCL) has emerged as a robust solution for collaborative model training in dynamic environments, where data samples are continuously generated and distributed across multiple devices. This survey provides a comprehensive review of FCL, focusing on key challenges such as heterogeneity, model stability, communication overhead, and privacy preservation. We explore various forms of heterogeneity and their impact on model performance. Solutions to non-IID data, resource-constrained platforms, and personalized learning are reviewed to show the complexities of handling heterogeneous data distributions. Next, we review techniques for ensuring model stability and avoiding catastrophic forgetting, which are critical in non-stationary environments. Privacy-preserving techniques, another key aspect of FCL, are also reviewed. This survey integrates insights from federated learning and continual learning to present strategies for improving the efficacy and scalability of FCL systems, making them applicable to a wide range of real-world scenarios.

data mining, knowledge management, machine learning, (23 more...)

2502.07059

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)
(11 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Knowledge Management (1.00)
(11 more...)

RAILS: Risk-Aware Iterated Local Search for Joint SLA Decomposition and Service Provider Management in Multi-Domain Networks

Hsu, Cyril Shih-Huan, Papagianni, Chrysa, Grosso, Paola

The emergence of the fifth generation (5G) technology has transformed mobile networks into multi-service environments, necessitating efficient network slicing to meet diverse Service Level Agreements (SLAs). SLA decomposition across multiple network domains, each potentially managed by different service providers, poses a significant challenge due to limited visibility into real-time underlying domain conditions. This paper introduces Risk-Aware Iterated Local Search (RAILS), a novel risk model-driven meta-heuristic framework designed to jointly address SLA decomposition and service provider selection in multi-domain networks. By integrating online risk modeling with iterated local search principles, RAILS effectively navigates the complex optimization landscape, utilizing historical feedback from domain controllers. We formulate the joint problem as a Mixed-Integer Nonlinear Programming (MINLP) problem and prove its NP-hardness. Extensive simulations demonstrate that RAILS achieves near-optimal performance, offering an efficient, real-time solution for adaptive SLA management in modern multi-domain networks.
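The general shape of an iterated local search over a joint budget-split and provider-selection problem can be sketched as follows. This is a toy illustration, not the RAILS algorithm: the logistic `acceptance` function stands in for a learned risk model, and the integer budget units, the neighbourhood moves, the better-only acceptance rule, and all function names are assumptions made for this example.

```python
import math
import random


def acceptance(budget, midpoint, steepness=1.0):
    """Hypothetical risk model: probability that a domain accepts a delay budget."""
    return 1.0 / (1.0 + math.exp(-steepness * (budget - midpoint)))


def score(solution, providers):
    """End-to-end acceptance probability, assuming domains decide independently."""
    prob = 1.0
    for d, (p, budget) in enumerate(solution):
        prob *= acceptance(budget, providers[d][p])
    return prob


def neighbours(solution, providers):
    """Move one budget unit between two domains, or switch one domain's provider."""
    n = len(solution)
    for i in range(n):
        for j in range(n):
            if i != j and solution[i][1] > 0:
                cand = list(solution)
                cand[i] = (cand[i][0], cand[i][1] - 1)
                cand[j] = (cand[j][0], cand[j][1] + 1)
                yield cand
        for p in range(len(providers[i])):
            if p != solution[i][0]:
                cand = list(solution)
                cand[i] = (p, cand[i][1])
                yield cand


def local_search(solution, providers):
    """First-improvement hill climbing until no neighbour is better."""
    best, best_score = solution, score(solution, providers)
    improved = True
    while improved:
        improved = False
        for cand in neighbours(best, providers):
            s = score(cand, providers)
            if s > best_score + 1e-12:
                best, best_score, improved = cand, s, True
                break  # restart the scan from the improved solution
    return best, best_score


def perturb(solution, providers, rng, strength=3):
    """Apply a few random neighbourhood moves to escape the current local optimum."""
    cand = list(solution)
    for _ in range(strength):
        cand = rng.choice(list(neighbours(cand, providers)))
    return cand


def iterated_local_search(providers, total_budget, iterations=50, seed=0):
    """providers[d] lists the (assumed) acceptance midpoints of domain d's providers."""
    rng = random.Random(seed)
    n = len(providers)
    start = [(0, total_budget // n)] * n
    start[0] = (0, total_budget - (total_budget // n) * (n - 1))
    best, best_score = local_search(start, providers)
    for _ in range(iterations):
        cand, cand_score = local_search(perturb(best, providers, rng), providers)
        if cand_score > best_score:  # better-only acceptance criterion
            best, best_score = cand, cand_score
    return best, best_score
```

Calling `iterated_local_search([[4.0, 6.0], [3.0], [5.0, 2.0]], total_budget=12)` returns a list of `(provider_index, budget)` pairs, one per domain, together with its score. Every move preserves the total budget, so the end-to-end constraint holds by construction.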

artificial intelligence, optimization problem, provider, (13 more...)

2502.06674

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Decentralized Inference for Spatial Data Using Low-Rank Models

Shi, Jianwei, Abdulah, Sameh, Sun, Ying, Genton, Marc G.

Advancements in information technology have enabled the creation of massive spatial datasets, driving the need for scalable and efficient computational methodologies. While offering viable solutions, centralized frameworks are limited by vulnerabilities such as single-point failures and communication bottlenecks. This paper presents a decentralized framework tailored for parameter inference in spatial low-rank models to address these challenges. A key obstacle arises from the spatial dependence among observations, which prevents the log-likelihood from being expressed as a summation, a critical requirement for decentralized optimization approaches. To overcome this challenge, we propose a novel objective function leveraging the evidence lower bound, which facilitates the use of decentralized optimization techniques. Our approach employs a block descent method integrated with multi-consensus and dynamic consensus averaging for effective parameter optimization. We prove the convexity of the new objective function in the vicinity of the true parameters, ensuring the convergence of the proposed method. Additionally, we present the first theoretical results establishing the consistency and asymptotic normality of the estimator within the context of spatial low-rank models. Extensive simulations and real-world data experiments corroborate these theoretical findings, showcasing the robustness and scalability of the framework.
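As background for the consensus step mentioned in the abstract, here is a minimal sketch of static consensus averaging with several mixing rounds. It assumes a ring topology with equal weights of 1/3, which is a choice made for this example only. It does not reproduce the paper's block descent method or its dynamic consensus scheme, and the function names are hypothetical.

```python
def ring_mixing_step(values):
    """One consensus round on a ring: each node averages itself with its two neighbours.

    The implied mixing matrix is symmetric with rows summing to one, so the
    network-wide mean is preserved at every round.
    """
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def multi_consensus(values, rounds):
    """Run several mixing rounds in a row before the next local update."""
    for _ in range(rounds):
        values = ring_mixing_step(values)
    return values
```

With five nodes holding `[1.0, 2.0, 3.0, 4.0, 10.0]`, repeated rounds drive every node toward the global mean of 4.0 using only neighbour-to-neighbour exchanges. Decentralized methods rely on this property so that each node can approximate a network-wide sum.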

artificial intelligence, bayesian inference, machine learning, (21 more...)

2502.00309

Country:

North America > United States (0.45)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Saudi Arabia (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Properties of Wasserstein Gradient Flows for the Sliced-Wasserstein Distance

Vauthier, Christophe, Mérigot, Quentin, Korba, Anna

In this paper, we investigate the properties of the Sliced Wasserstein Distance (SW) when employed as an objective functional. The SW metric has gained significant interest in the optimal transport and machine learning literature, due to its ability to capture intricate geometric properties of probability distributions while remaining computationally tractable, making it a valuable tool for various applications, including generative modeling and domain adaptation. Our study aims to provide a rigorous analysis of the critical points arising from the optimization of the SW objective. By computing explicit perturbations, we establish that stable critical points of SW cannot concentrate on segments. This stability analysis is crucial for understanding the behavior of optimization algorithms for models trained using the SW objective. Furthermore, we investigate the properties of the SW objective, shedding light on the existence and convergence behavior of critical points. We illustrate our theoretical results through numerical experiments.
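To make the computational tractability concrete, here is a minimal Monte Carlo sketch of the squared Sliced-Wasserstein distance between two empirical point clouds. It makes several assumptions for illustration: both clouds have the same number of points, the order is 2, and directions are drawn uniformly on the sphere. The function names are hypothetical and the code is not taken from the paper.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalising a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def sliced_wasserstein_sq(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of SW_2^2 between two equal-size point clouds."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # Project both clouds onto the line spanned by theta.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
        # In one dimension, optimal transport between equal-size samples
        # matches sorted values, so W_2^2 is a mean of squared differences.
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(xs)
    return total / n_projections
```

Each projection only requires a sort, which is why the sliced distance is cheap compared with solving a full optimal transport problem in the original dimension.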

artificial intelligence, critical point, machine learning, (17 more...)

2502.06525

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Bayesian Optimization for Building Social-Influence-Free Consensus

Adachi, Masaki, Chau, Siu Lun, Xu, Wenjie, Singh, Anurag, Osborne, Michael A., Muandet, Krikamol

We introduce Social Bayesian Optimization (SBO), a vote-efficient algorithm for consensus-building in collective decision-making. In contrast to single-agent scenarios, collective decision-making encompasses group dynamics that may distort agents' preference feedback, thereby impeding their capacity to achieve a social-influence-free consensus, i.e., the most preferable decision based on the aggregated agent utilities. We demonstrate that under mild rationality axioms, reaching social-influence-free consensus using noisy feedback alone is impossible. To address this, SBO employs a dual voting system: cheap but noisy public votes (e.g., show of hands in a meeting), and more accurate, though expensive, private votes (e.g., one-to-one interview). We model social influence using an unknown social graph and leverage the dual voting system to efficiently learn this graph. Our theoretical findings show that social graph estimation converges faster than the black-box estimation of agents' utilities, allowing us to reduce reliance on costly private votes early in the process. This enables efficient consensus-building primarily through noisy public votes, which are debiased based on the estimated social graph to infer social-influence-free feedback. We validate the efficacy of SBO across multiple real-world applications, including thermal comfort, team building, travel negotiation, and energy trading collaboration.

artificial intelligence, consensus, machine learning, (20 more...)

2502.07166

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Oceania > New Zealand (0.04)
(6 more...)

Genre: Research Report (0.81)

Industry: Energy > Power Industry (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Diverse Preference Optimization

Lanchantin, Jack, Chen, Angelica, Dhuliawala, Shehzaad, Yu, Ping, Weston, Jason, Sukhbaatar, Sainbayar, Kulikov, Ilia

Post-training of language models, either through reinforcement learning, preference optimization or supervised finetuning, tends to sharpen the output probability distribution and reduce the diversity of generated responses. This is particularly a problem for creative generative tasks where varied responses are desired. In this work we introduce Diverse Preference Optimization (DivPO), an optimization method which learns to generate much more diverse responses than standard pipelines, while maintaining the quality of the generations. In DivPO, preference pairs are selected by first considering a pool of responses and a measure of diversity among them, then taking rarer but high-quality responses as chosen examples and more common but low-quality responses as rejected examples. DivPO results in generating 45.6% more diverse persona attributes, and a 74.6% increase in story diversity, while maintaining win rates similar to standard baselines.
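The pair-selection idea described above can be sketched as follows. This is a minimal illustration that follows the abstract's wording only. The fixed quality threshold, the use of inverse frequency within the pool as the diversity measure, and the function name `select_preference_pair` are all assumptions, not details confirmed by the source.

```python
from collections import Counter


def select_preference_pair(responses, rewards, quality_threshold):
    """Return (chosen, rejected), or None if the pool cannot supply both.

    chosen   = rarest response among those at or above the quality threshold
    rejected = most common response among those below the threshold
    """
    if len(responses) != len(rewards):
        raise ValueError("responses and rewards must align")
    counts = Counter(responses)
    # Assumed diversity measure: responses seen less often in the pool are rarer.
    rarity = {r: 1.0 / c for r, c in counts.items()}
    high = [r for r, s in zip(responses, rewards) if s >= quality_threshold]
    low = [r for r, s in zip(responses, rewards) if s < quality_threshold]
    if not high or not low:
        return None
    chosen = max(high, key=lambda r: rarity[r])
    rejected = min(low, key=lambda r: rarity[r])
    return chosen, rejected
```

For a pool of persona names `["Alice", "Alice", "Alice", "Zofia", "Bob", "Bob", "Qwx"]` with rewards `[0.9, 0.9, 0.9, 0.8, 0.2, 0.2, 0.1]` and a threshold of 0.5, the sketch picks `"Zofia"` as chosen, since it is high quality and rare. It picks `"Bob"` as rejected, since it is low quality and common.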

large language model, machine learning, natural language, (16 more...)

2501.18101

Country:

Asia > Singapore (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Building Rome with Convex Optimization

Han, Haoyu, Yang, Heng

Global bundle adjustment is made easy by depth prediction and convex optimization. We (i) propose a scaled bundle adjustment (SBA) formulation that lifts 2D keypoint measurements to 3D with learned depth, (ii) design an empirically tight convex semidefinite program (SDP) relaxation that solves SBA to certifiable global optimality, (iii) solve the SDP relaxations at extreme scale with Burer-Monteiro factorization and a CUDA-based trust-region Riemannian optimizer (dubbed XM), (iv) build a structure from motion (SfM) pipeline with XM as the optimization engine and show that XM-SfM dominates or compares favorably with existing SfM pipelines in terms of reconstruction quality while being faster, more scalable, and initialization-free.

artificial intelligence, dataset, optimization problem, (14 more...)

2502.0464

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Online Covariance Matrix Estimation in Sketched Newton Methods

Kuang, Wei, Anitescu, Mihai, Na, Sen

Given the ubiquity of streaming data, online algorithms have been widely used for parameter estimation, with second-order methods particularly standing out for their efficiency and robustness. In this paper, we study an online sketched Newton method that leverages a randomized sketching technique to perform an approximate Newton step in each iteration, thereby eliminating the computational bottleneck of second-order methods. While existing studies have established the asymptotic normality of sketched Newton methods, a consistent estimator of the limiting covariance matrix remains an open problem. We propose a fully online covariance matrix estimator that is constructed entirely from the Newton iterates and requires no matrix factorization. Compared to covariance estimators for first-order online methods, our estimator for second-order methods is batch-free. We establish the consistency and convergence rate of our estimator, and coupled with asymptotic normality results, we can then perform online statistical inference for the model parameters based on sketched Newton methods. We also discuss the extension of our estimator to constrained problems, and demonstrate its superior performance on regression problems as well as benchmark problems in the CUTEst set.

artificial intelligence, estimator, machine learning, (19 more...)

2502.07114

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent

Sheshukova, Marina, Samsonov, Sergey, Belomestny, Denis, Moulines, Eric, Shao, Qi-Man, Zhang, Zhuo-Song, Naumov, Alexey

In this paper, we establish non-asymptotic convergence rates in the central limit theorem for Polyak-Ruppert-averaged iterates of stochastic gradient descent (SGD). Our analysis builds on the result of the Gaussian approximation for nonlinear statistics of independent random variables of Shao and Zhang (2022). Using this result, we prove the non-asymptotic validity of the multiplier bootstrap for constructing the confidence sets for the optimal solution of an optimization problem. In particular, our approach avoids the need to approximate the limiting covariance of Polyak-Ruppert SGD iterates, which allows us to derive approximation rates in convex distance of order up to $1/\sqrt{n}$.
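The following sketch illustrates Polyak-Ruppert averaging together with one common form of multiplier bootstrap for SGD, on a one-dimensional mean-estimation problem. It is an illustration under stated assumptions and not necessarily the exact scheme analysed in the paper. The assumptions are the quadratic loss, the step size `c / k**alpha`, the exponential weights with mean 1 and variance 1, and all function names.

```python
import random


def averaged_sgd_with_bootstrap(data, n_boot=200, c=0.5, alpha=0.75, seed=0):
    """Run SGD on f(theta) = E[(theta - X)^2] / 2 and average the iterates.

    Each bootstrap trajectory sees the same data stream but multiplies every
    stochastic gradient by an independent random weight with mean 1.
    """
    rng = random.Random(seed)
    theta, theta_sum = 0.0, 0.0
    boot = [0.0] * n_boot
    boot_sum = [0.0] * n_boot
    for k, x in enumerate(data, start=1):
        step = c / k ** alpha
        theta -= step * (theta - x)           # plain SGD step
        theta_sum += theta
        for b in range(n_boot):
            w = rng.expovariate(1.0)          # multiplier weight, mean 1, variance 1
            boot[b] -= step * w * (boot[b] - x)
            boot_sum[b] += boot[b]
    n = len(data)
    return theta_sum / n, [s / n for s in boot_sum]


def percentile_interval(theta_bar, boot_bars, level=0.95):
    """Basic bootstrap interval: reflect bootstrap deviations around theta_bar."""
    deviations = sorted(b - theta_bar for b in boot_bars)
    lo_idx = int((1 - level) / 2 * (len(deviations) - 1))
    hi_idx = int((1 + level) / 2 * (len(deviations) - 1))
    return theta_bar - deviations[hi_idx], theta_bar - deviations[lo_idx]
```

The spread of the bootstrap averages around `theta_bar` stands in for the sampling distribution of the averaged iterate. A confidence set can therefore be formed without estimating the limiting covariance explicitly, which is the point made in the abstract.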

artificial intelligence, inequality, machine learning, (17 more...)

2502.06719

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Beyond Batch Learning: Global Awareness Enhanced Domain Adaptation

Luo, Lingkun, Hu, Shiqiang, Chen, Liming

In domain adaptation (DA), the effectiveness of deep learning-based models is often constrained by batch learning strategies that fail to fully apprehend the global statistical and geometric characteristics of data distributions. Addressing this gap, we introduce 'Global Awareness Enhanced Domain Adaptation' (GAN-DA), a novel approach that transcends traditional batch-based limitations. GAN-DA integrates a unique predefined feature representation (PFR) to facilitate the alignment of cross-domain distributions, thereby achieving a comprehensive global statistical awareness. This representation is innovatively expanded to encompass orthogonal and common feature aspects, which enhances the unification of global manifold structures and refines decision boundaries for more effective DA. Our extensive experiments, encompassing 27 diverse cross-domain image classification tasks, demonstrate GAN-DA's remarkable superiority, outperforming 24 established DA methods by a significant margin. Furthermore, our in-depth analyses shed light on the decision-making processes, revealing insights into the adaptability and efficiency of GAN-DA. This approach not only addresses the limitations of existing DA methodologies but also sets a new benchmark in the realm of domain adaptation, offering broad implications for future research and applications in this field.

artificial intelligence, domain adaptation, machine learning, (18 more...)

2502.06272

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(8 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)