Collaborating Authors

 Choudhury, Sayantan


Multiplayer Federated Learning: Reaching Equilibrium with Less Communication

arXiv.org Machine Learning

Federated Learning (FL) has emerged as a powerful collaborative learning paradigm where multiple clients jointly train a machine learning model without sharing their local data. In the classical FL setting, a central server coordinates multiple clients (e.g., mobile devices, edge devices) to collaboratively learn a shared global model without exchanging their own training data [48, 54, 79, 64]. In this scenario, each client performs local computations on its private data and periodically communicates model updates to the server, which aggregates them to update the global model. This collaborative approach has been successfully applied in various domains, including natural language processing [69, 43], computer vision [70, 63], and healthcare [4, 116]. Despite their success, traditional FL frameworks rely on the key assumption that all participants are fully cooperative and share aligned objectives, collectively working towards optimizing the performance of a shared global model (e.g., minimizing the average of individual loss functions). This assumption overlooks situations where participants have individual objectives or competitive interests that may not align with the collective goal. Diverse examples of such scenarios have been extensively considered in the game theory literature, including Cournot competition in economics [2], optical networks [91], electricity markets [98], energy consumption control in smart grids [120], and mobile robot control [49]. Despite their relevance, these applications have yet to be associated with FL, presenting an unexplored opportunity to bridge game theory and FL for more robust and realistic frameworks.
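
For orientation, here is a minimal sketch of the server-coordinated FL loop described above (FedAvg-style averaging of local updates). The loss, data, and function names are illustrative placeholders, not the multiplayer algorithm proposed in the paper.

```python
import numpy as np

def local_update(weights, data, lr=0.01, steps=5):
    """A few SGD steps on one client's private data (illustrative least-squares loss)."""
    X, y = data
    w = weights.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, client_data):
    """One communication round: clients train locally, the server averages the updates."""
    client_weights = [local_update(global_w, d) for d in client_data]
    return np.mean(client_weights, axis=0)

# Example: 3 clients with private synthetic data, 10 communication rounds.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(5)
for _ in range(10):
    w = federated_round(w, clients)
```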


Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity

arXiv.org Artificial Intelligence

Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., 2020). In this paper, we focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods. In particular, we derive improved convergence rates for Gradient Descent with (Smoothed) Gradient Clipping and for Gradient Descent with Polyak Stepsizes. In contrast to the existing results, our rates do not rely on the standard smoothness assumption and do not suffer from an exponential dependency on the initial distance to the solution. We also extend these results to the stochastic case under the over-parameterization assumption, propose a new accelerated method for convex $(L_0,L_1)$-smooth optimization, and derive new convergence rates for Adaptive Gradient Descent (Malitsky and Mishchenko, 2020).
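
Recall that $(L_0,L_1)$-smoothness (Zhang et al., 2020) bounds the local curvature by the gradient norm, e.g. $\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\|$. Below is a minimal sketch of Gradient Descent with Gradient Clipping of the kind analyzed for this class; the stepsize and clipping constants are illustrative, not the paper's tuned choices.

```python
import numpy as np

def clipped_gradient_descent(grad_f, x0, gamma=0.1, clip_level=1.0, iters=100):
    """Gradient Descent with Gradient Clipping: the effective stepsize is
    min(gamma, clip_level / ||grad||), so steps stay bounded even when the
    gradient norm is large -- the regime where (L0, L1)-smoothness matters."""
    x = x0.copy()
    for _ in range(iters):
        g = grad_f(x)
        norm = np.linalg.norm(g)
        step = min(gamma, clip_level / norm) if norm > 0 else gamma
        x = x - step * g
    return x

# Illustrative quadratic f(x) = 0.5 * ||x||^2, so grad_f(x) = x.
x_final = clipped_gradient_descent(lambda x: x, np.array([10.0, -5.0]))
```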


Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad

arXiv.org Artificial Intelligence

Adaptive methods are extremely popular in machine learning as they make learning rate tuning less expensive. This paper introduces a novel optimization algorithm named KATE, which presents a scale-invariant adaptation of the well-known AdaGrad algorithm. We prove the scale-invariance of KATE for the case of Generalized Linear Models. Moreover, for general smooth non-convex problems, we establish a convergence rate of $O \left(\frac{\log T}{\sqrt{T}} \right)$ for KATE, matching the best-known rates for AdaGrad and Adam. We also compare KATE to other state-of-the-art adaptive algorithms, Adam and AdaGrad, in numerical experiments with different problems, including complex machine learning tasks like image classification and text classification on real data. The results indicate that KATE consistently outperforms AdaGrad and matches/surpasses the performance of Adam in all considered scenarios.
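
For reference, here is a minimal sketch of the standard diagonal AdaGrad update that KATE modifies. This is only the baseline rule; KATE's square-root-free, scale-invariant update is given in the paper and is not reproduced here.

```python
import numpy as np

def adagrad(grad_f, x0, lr=0.5, eps=1e-8, iters=200):
    """Standard diagonal AdaGrad: divide each coordinate of the gradient by the
    square root of the running sum of squared gradients. Per the abstract, KATE
    is a scale-invariant variant that removes this square root."""
    x = x0.copy()
    accum = np.zeros_like(x)
    for _ in range(iters):
        g = grad_f(x)
        accum += g ** 2
        x = x - lr * g / (np.sqrt(accum) + eps)
    return x

# Illustrative quadratic: minimize 0.5 * ||x||^2.
x_min = adagrad(lambda x: x, np.array([3.0, -2.0]))
```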


Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions

arXiv.org Machine Learning

Single-call stochastic extragradient methods, like stochastic past extragradient (SPEG) and stochastic optimistic gradient (SOG), have gained a lot of interest in recent years and are among the most efficient algorithms for solving large-scale min-max optimization and variational inequality problems (VIPs) appearing in various machine learning tasks. However, despite their undoubted popularity, current convergence analyses of SPEG and SOG require a bounded variance assumption. In addition, several important questions regarding the convergence properties of these methods are still open, including mini-batching, efficient step-size selection, and convergence guarantees under different sampling strategies. In this work, we address these questions and provide convergence guarantees for two large classes of structured non-monotone VIPs: (i) quasi-strongly monotone problems (a generalization of strongly monotone problems) and (ii) weak Minty variational inequalities (a generalization of monotone and Minty VIPs). We introduce the expected residual condition, explain its benefits, and show how it can be used to obtain a strictly weaker bound than previously used growth conditions, expected co-coercivity, or bounded variance assumptions. Equipped with this condition, we provide theoretical guarantees for the convergence of single-call extragradient methods for different step-size selections, including constant, decreasing, and step-size-switching rules. Furthermore, our convergence analysis holds under the arbitrary sampling paradigm, which includes importance sampling and various mini-batching strategies as special cases.
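
To illustrate the "single-call" idea, here is a minimal sketch of the deterministic optimistic gradient update, which reuses the previous operator evaluation as an extrapolation term; the stochastic methods studied in the paper (SOG, SPEG) replace the operator call with stochastic estimates, and the stepsize below is illustrative.

```python
import numpy as np

def optimistic_gradient(F, z0, gamma=0.1, iters=500):
    """Optimistic gradient: one new operator evaluation per iteration, with the
    previous evaluation reused for extrapolation (single-call extragradient)."""
    z = z0.copy()
    g_prev = F(z)
    for _ in range(iters):
        g = F(z)
        z = z - gamma * (2 * g - g_prev)  # extrapolated single-call step
        g_prev = g
    return z

# Bilinear saddle point min_x max_y x*y, with operator F(x, y) = (y, -x).
F = lambda z: np.array([z[1], -z[0]])
z_sol = optimistic_gradient(F, np.array([1.0, 1.0]))  # approaches the solution (0, 0)
```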


Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates

arXiv.org Artificial Intelligence

Distributed and federated learning algorithms and techniques are primarily associated with minimization problems. However, with the growing prevalence of minimax optimization and variational inequality problems in machine learning, the necessity of designing efficient distributed/federated learning approaches for these problems is becoming more apparent. In this paper, we provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs). Our approach is based on a general key assumption on the stochastic estimates that allows us to propose and analyze several novel local training algorithms under a single framework for solving a class of structured non-monotone VIPs. We present the first local gradient descent-ascent algorithms with provable improved communication complexity for solving distributed variational inequalities on heterogeneous data. The general algorithmic framework recovers state-of-the-art algorithms and their sharp convergence guarantees when the setting is specialized to minimization or minimax optimization problems. Finally, we demonstrate the strong performance of the proposed algorithms compared to state-of-the-art methods when solving federated minimax optimization problems.
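
To make the "local training" template concrete, here is a minimal sketch of local gradient descent-ascent with periodic server averaging on heterogeneous client objectives. The objectives, stepsizes, and client count are illustrative; the paper's algorithms refine how the local steps and aggregations are performed.

```python
import numpy as np

def local_gda(x, y, grads, lr=0.05, local_steps=5):
    """A few local gradient descent-ascent steps on one client's objective:
    descend in x (min player), ascend in y (max player)."""
    gx, gy = grads
    for _ in range(local_steps):
        x, y = x - lr * gx(x, y), y + lr * gy(x, y)
    return x, y

def communication_round(x, y, clients):
    """Each client runs local GDA from the shared iterate, then the server
    averages the resulting (x, y) pairs -- the generic local-update template."""
    results = [local_gda(x, y, grads) for grads in clients]
    xs, ys = zip(*results)
    return np.mean(xs, axis=0), np.mean(ys, axis=0)

# Heterogeneous strongly-convex-strongly-concave quadratics
# f_i(x, y) = 0.5*a_i*x^2 + b_i*x*y - 0.5*c_i*y^2.
rng = np.random.default_rng(1)
clients = []
for _ in range(4):
    a, b, c = 1.0 + rng.random(), rng.normal(), 1.0 + rng.random()
    clients.append((lambda x, y, a=a, b=b: a * x + b * y,
                    lambda x, y, b=b, c=c: b * x - c * y))
x, y = np.array([1.0]), np.array([1.0])
for _ in range(20):
    x, y = communication_round(x, y, clients)
```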