AITopics

2403.17364

Country: North America > United States (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Artificial Intelligence, Nov-27-2023

Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations

Pirhayatifard, Delaram, Toghani, Mohammad Taha, Balakrishnan, Guha, Uribe, César A.

In this work, we address the challenge of multi-task image generation with limited data for denoising diffusion probabilistic models (DDPM), a class of generative models that produce high-quality images by reversing a noisy diffusion process. We propose a novel method, SR-DDPM, that leverages representation-based techniques from few-shot learning to effectively learn from fewer samples across different tasks. Our method consists of a core meta architecture with shared parameters and task-specific layers with exclusive parameters. By exploiting the similarity between diverse data distributions, our method can scale to multiple tasks without compromising the image quality. We evaluate our method on standard image datasets and show that it outperforms both unconditional and conditional DDPM in terms of FID and SSIM metrics.
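As a rough illustration of the "noisy diffusion process" that a DDPM learns to reverse, the sketch below draws a noised sample x_t from a clean sample x_0 using the standard closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The linear schedule, its endpoints, the number of steps, and all function names are assumptions chosen for illustration; the abstract does not state which schedule SR-DDPM uses.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not from the abstract)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar[t] = product over s <= t of (1 - beta[s])."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_sample(x0, t, alpha_bars, rng):
    """Draw x_t given x_0 in one shot; returns (x_t, eps)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    return [a * xi + s * ei for xi, ei in zip(x0, eps)], eps

betas = linear_betas(1000)
alpha_bars = cumulative_alphas(betas)
x0 = [0.5, -0.25, 1.0]  # toy "image" with three pixels (hypothetical data)
xt, eps = noise_sample(x0, 999, alpha_bars, random.Random(0))
```

At the last step almost no signal from x_0 remains, which is why generation can start from pure Gaussian noise and denoise step by step.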

artificial intelligence, arxiv preprint arxiv, machine learning, (11 more...)

2311.16353

Country: North America > United States (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.62)

arXiv.org Artificial Intelligence, Oct-6-2023

Adaptive Federated Learning with Auto-Tuned Clients

Kim, Junhyung Lyle, Toghani, Mohammad Taha, Uribe, César A., Kyrillidis, Anastasios

Federated learning (FL) is a distributed machine learning framework where the global model of a central server is trained via multiple collaborative steps by participating clients without sharing their data. Although FL is a flexible framework in which the distribution of local data, participation rate, and computing power of each client can vary greatly, this flexibility gives rise to many new challenges, especially in hyperparameter tuning on the client side. We propose $\Delta$-SGD, a simple step size rule for SGD that enables each client to use its own step size by adapting to the local smoothness of the function each client is optimizing. We provide theoretical and empirical results that show the benefit of client adaptivity in various FL scenarios.
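To make "adapting to the local smoothness" concrete, here is a minimal sketch of one known step size rule of this kind for plain gradient descent: the step is capped by ||x_t - x_{t-1}|| / (2 ||g_t - g_{t-1}||), an estimate of the inverse local smoothness, and is not allowed to grow too fast. This is an assumption for illustration only; the abstract does not give the exact $\Delta$-SGD rule or its constants, and the quadratic objective and all names below are hypothetical.

```python
import math

def grad(x):
    # Gradient of the toy objective f(x) = 0.5 * (1 * x[0]**2 + 10 * x[1]**2).
    return [1.0 * x[0], 10.0 * x[1]]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def adaptive_gd(x0, steps=200, eta0=1e-3):
    """Gradient descent whose step size tracks a local smoothness estimate."""
    x_prev = list(x0)
    g_prev = grad(x_prev)
    eta_prev = eta0
    theta = 1.0  # ratio of consecutive step sizes (initial value assumed)
    x = [xi - eta0 * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(steps):
        g = grad(x)
        dx = norm([a - b for a, b in zip(x, x_prev)])
        dg = norm([a - b for a, b in zip(g, g_prev)])
        growth_cap = math.sqrt(1.0 + theta) * eta_prev
        # dx / dg estimates 1 / L locally; fall back to the cap if dg is zero.
        local = dx / (2.0 * dg) if dg > 0 else growth_cap
        eta = min(growth_cap, local)
        x_prev, g_prev = x, g
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        theta = eta / eta_prev
        eta_prev = eta
    return x

x_final = adaptive_gd([5.0, 5.0])
```

No step size is tuned by hand here beyond a tiny initial value, which is the property that makes such rules attractive when each client faces a different local objective.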

artificial intelligence, machine learning, optimization problem, (15 more...)

2306.11201

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

arXiv.org Machine Learning, Oct-4-2023

PersA-FL: Personalized Asynchronous Federated Learning

Toghani, Mohammad Taha, Lee, Soomin, Uribe, César A.

We study the personalized federated learning problem under asynchronous updates. In this problem, each client seeks to obtain a personalized model that simultaneously outperforms local and global models. We consider two optimization-based frameworks for personalization: (i) Model-Agnostic Meta-Learning (MAML) and (ii) Moreau Envelope (ME). MAML involves learning a joint model adapted for each client through fine-tuning, whereas ME requires a bi-level optimization problem with implicit gradients to enforce personalization via regularized losses. We focus on improving the scalability of personalized federated learning by removing the synchronous communication assumption. Moreover, we extend the studied function class by removing boundedness assumptions on the gradient norm. Our main technical contribution is a unified proof for asynchronous federated learning with bounded staleness that we apply to MAML and ME personalization frameworks. For the class of smooth, non-convex functions, we show the convergence of our method to a first-order stationary point. We illustrate the performance of our method and its tolerance to staleness through experiments on classification tasks over heterogeneous datasets.
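The Moreau envelope idea can be sketched on scalar quadratic losses, where everything has a closed form. Each client's personalized model minimizes its own loss plus a penalty (lam / 2) * (theta - w)^2 that ties it to the global model w, and the gradient of the resulting envelope with respect to w is lam * (w - theta_hat). The loss 0.5 * (theta - center)^2, the parameter name `lam`, and the synchronous averaging loop are all assumptions for illustration; the paper studies the asynchronous setting, which this sketch does not model.

```python
def personalize(w, center, lam):
    """Minimizer of 0.5 * (theta - center)**2 + 0.5 * lam * (theta - w)**2."""
    return (center + lam * w) / (1.0 + lam)

def envelope_grad(w, center, lam):
    """Gradient of the Moreau envelope at w: lam * (w - personalized model)."""
    return lam * (w - personalize(w, center, lam))

def train_global(centers, lam=1.0, step=1.0, rounds=100):
    """Synchronous gradient descent on the average envelope (simplification)."""
    w = 0.0
    for _ in range(rounds):
        avg_grad = sum(envelope_grad(w, c, lam) for c in centers) / len(centers)
        w -= step * avg_grad
    return w

centers = [0.0, 4.0]  # two clients with different local optima (hypothetical)
w = train_global(centers)
personal = [personalize(w, c, 1.0) for c in centers]
```

In this toy case the global model settles between the two local optima, while each personalized model sits between the global model and its own client's optimum.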

artificial intelligence, federated learning, machine learning, (15 more...)

2210.01176

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

arXiv.org Artificial Intelligence, May-20-2023

On First-Order Meta-Reinforcement Learning with Moreau Envelopes

Toghani, Mohammad Taha, Perez-Salazar, Sebastian, Uribe, César A.

Meta-Reinforcement Learning (MRL) is a promising framework for training agents that can quickly adapt to new environments and tasks. In this work, we study the MRL problem under the policy gradient formulation, where we propose a novel algorithm that uses Moreau envelope surrogate regularizers to jointly learn a meta-policy that is adjustable to the environment of each individual task. Our algorithm, called Moreau Envelope Meta-Reinforcement Learning (MEMRL), learns a meta-policy that can adapt to a distribution of tasks by efficiently updating the policy parameters using a combination of gradient-based optimization and Moreau envelope regularization. Moreau envelopes provide a smooth approximation of the policy optimization problem, which enables us to apply standard optimization techniques and converge to an appropriate stationary point. We provide a detailed analysis of the MEMRL algorithm, where we show a sublinear convergence rate to a first-order stationary point for non-convex policy gradient optimization. Finally, we show the effectiveness of MEMRL on a multi-task 2D-navigation problem.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2305.12216

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning, Jul-2-2021

Momentum-inspired Low-Rank Coordinate Descent for Diagonally Constrained SDPs

Kim, Junhyung Lyle, Benitez, Jose Antonio Lara, Toghani, Mohammad Taha, Wolfe, Cameron, Zhang, Zhiwei, Kyrillidis, Anastasios

We present a novel, practical, and provable approach for solving diagonally constrained semi-definite programming (SDP) problems at scale using accelerated non-convex programming. Our algorithm non-trivially combines acceleration motions from convex optimization with coordinate power iteration and matrix factorization techniques. The algorithm is extremely simple to implement, and adds only a single extra hyperparameter -- momentum. We prove that our method admits local linear convergence in the neighborhood of the optimum and always converges to a first-order critical point. Experimentally, we showcase the merits of our method on three major application domains: MaxCut, MaxSAT, and MIMO signal detection. In all cases, our methodology provides significant speedups over non-convex and convex SDP solvers -- 5X faster than state-of-the-art non-convex solvers, and 9 to 10^3 X faster than convex SDP solvers -- with comparable or improved solution quality.
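The combination of matrix factorization and coordinate updates mentioned above can be sketched as follows. The SDP variable X is replaced by X = V V^T with unit-norm rows v_i, so the diagonal constraint diag(X) = 1 holds by construction, and each row is updated in turn to v_i = -g_i / ||g_i|| with g_i = sum over j != i of C[i][j] * v_j. This is a plain coordinate sweep without the momentum term the abstract describes, since the abstract does not specify how momentum enters; the cost matrix, rank, and function names are hypothetical.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def objective(C, V):
    """Inner product <C, V V^T> = sum over i, j of C[i][j] * <v_i, v_j>."""
    n = len(V)
    return sum(C[i][j] * sum(a * b for a, b in zip(V[i], V[j]))
               for i in range(n) for j in range(n))

def coordinate_sweep(C, V):
    """One pass over all rows; each update keeps v_i on the unit sphere."""
    n, k = len(V), len(V[0])
    for i in range(n):
        g = [sum(C[i][j] * V[j][d] for j in range(n) if j != i) for d in range(k)]
        if any(abs(c) > 1e-12 for c in g):  # skip rows with a zero direction
            V[i] = normalize([-c for c in g])
    return V

# Symmetric cost matrix with zero diagonal: adjacency of a 4-cycle (toy data).
C = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
V = [normalize([1.0, 0.1 * (i + 1)]) for i in range(4)]  # rank-2 factor
history = [objective(C, V)]
for _ in range(20):
    V = coordinate_sweep(C, V)
    history.append(objective(C, V))
```

Because C is symmetric with a zero diagonal, each row update exactly minimizes the objective over that row, so the recorded objective values never increase.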

artificial intelligence, mixing method, optimization problem, (14 more...)

2106.08775

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning, Feb-14-2021

Communication-Efficient Distributed Cooperative Learning with Compressed Beliefs

Toghani, Mohammad Taha, Uribe, Cesar A.

We study the problem of distributed cooperative learning, where a group of agents seek to agree on a set of hypotheses that best describes a sequence of private observations. In the scenario where the set of hypotheses is large, we propose a belief update rule where agents share compressed (either sparse or quantized) beliefs with an arbitrary positive compression rate. Our algorithm leverages a unified and straightforward communication rule that enables agents to access wide-ranging compression operators as black-box modules. We prove the almost sure asymptotic exponential convergence of beliefs around the set of optimal hypotheses. Additionally, we show a non-asymptotic, explicit, and linear concentration rate in probability of the beliefs on the optimal hypothesis set. We provide numerical experiments to illustrate the communication benefits of our method. The simulation results show that the number of transmitted bits can be reduced to 5-10% of the non-compressed method in the studied scenarios.
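For a sense of what "sparse or quantized" compression of a belief vector can look like, the sketch below shows two common operators: top-k sparsification, which keeps only the k largest entries, and uniform quantization, which rounds each entry to a coarse grid. Both operators and their names are assumptions for illustration; the abstract treats compression operators as black-box modules and does not say which ones were used.

```python
def top_k(belief, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(belief)), key=lambda i: abs(belief[i]), reverse=True)
    keep = set(order[:k])
    return [b if i in keep else 0.0 for i, b in enumerate(belief)]

def quantize(belief, levels):
    """Round each entry to the nearest multiple of 1 / levels (uniform grid)."""
    return [round(b * levels) / levels for b in belief]

belief = [0.10, 0.50, 0.05, 0.35]  # belief over four hypotheses (toy data)
sparse = top_k(belief, 2)          # only two entries need to be transmitted
coarse = quantize(belief, 4)       # each entry fits in a few bits
```

Neither output is a normalized probability vector any more, which is one reason a belief update rule has to be designed to tolerate compressed messages.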

artificial intelligence, bayesian inference, hypothesis, (20 more...)

2102.07767

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Nov-13-2020

MP-Boost: Minipatch Boosting via Adaptive Feature and Observation Sampling

Toghani, Mohammad Taha, Allen, Genevera I.

Boosting methods are among the best general-purpose and off-the-shelf machine learning approaches, gaining widespread popularity. In this paper, we seek to develop a boosting method that yields comparable accuracy to popular AdaBoost and gradient boosting methods, yet is faster computationally and whose solution is more interpretable. We achieve this by developing MP-Boost, an algorithm loosely based on AdaBoost that learns by adaptively selecting small subsets of instances and features, or what we term minipatches (MP), at each iteration. By sequentially learning on tiny subsets of the data, our approach is computationally faster than other classic boosting algorithms. As it progresses, MP-Boost also adaptively learns probability distributions on the features and instances that upweight the most important features and the most challenging instances, hence adaptively selecting the most relevant minipatches for learning. These learned probability distributions also aid in the interpretation of our method. We empirically demonstrate the interpretability, comparative accuracy, and computational time of our approach on a variety of binary classification tasks.
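The minipatch idea, a small random block of rows and columns drawn according to learned weights, can be sketched as below. The weighted sampling without replacement uses the Efraimidis-Spirakis key trick (key = u ** (1 / weight), keep the largest keys); this sampler, the fixed weights, and all names are assumptions for illustration, and the sketch does not include how MP-Boost updates the weights between iterations.

```python
import random

def weighted_sample(weights, size, rng):
    """Sample `size` distinct indices, favoring larger (strictly positive) weights."""
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    return sorted(i for _, i in sorted(keys, reverse=True)[:size])

def draw_minipatch(X, row_weights, col_weights, n_rows, n_cols, rng):
    """Return the chosen row indices, column indices, and the sub-matrix."""
    rows = weighted_sample(row_weights, n_rows, rng)
    cols = weighted_sample(col_weights, n_cols, rng)
    patch = [[X[r][c] for c in cols] for r in rows]
    return rows, cols, patch

# Toy data matrix: 6 observations by 4 features (hypothetical values).
X = [[10 * r + c for c in range(4)] for r in range(6)]
row_weights = [1.0, 1.0, 5.0, 5.0, 1.0, 1.0]  # "hard" observations upweighted
col_weights = [3.0, 1.0, 1.0, 3.0]            # "important" features upweighted
rows, cols, patch = draw_minipatch(X, row_weights, col_weights, 3, 2, random.Random(0))
```

A weak learner would then be fit on `patch` only, which is where the computational saving over full-data boosting comes from.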

algorithm, artificial intelligence, machine learning, (16 more...)

2011.07218

Country: North America > United States (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.93)