AITopics | onvergence

Collaborating Authors

onvergence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Convergence for Discrete Parameter Update Schemes

Wilson, Paul, Zanasi, Fabio, Constantinides, George

arXiv.org Artificial IntelligenceDec-8-2025

Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update rule itself is discrete, avoiding the quantisation of continuous updates by design. We establish convergence guarantees for a general class of such discrete schemes, and present a multinomial update rule as a concrete example, supported by empirical evaluation. This perspective opens new avenues for efficient training, particularly for models with inherently discrete structure.

artificial intelligence, machine learning, zhang, (15 more...)

arXiv.org Artificial Intelligence

2512.04051

Genre: Research Report (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Asymptotic and Non-Asymptotic Convergence Analysis of AdaGrad for Non-Convex Optimization via Novel Stopping Time-based Analysis

Jin, Ruinan, Wang, Xiaoyu, Wang, Baoxiang

arXiv.org Machine LearningSep-8-2024

Adaptive optimizers have emerged as powerful tools in deep learning, dynamically adjusting the learning rate based on iterative gradients. These adaptive methods have significantly succeeded in various deep learning tasks, outperforming stochastic gradient descent (SGD). However, although AdaGrad is a cornerstone adaptive optimizer, its theoretical analysis is inadequate in addressing asymptotic convergence and non-asymptotic convergence rates on non-convex optimization. This study aims to provide a comprehensive analysis and complete picture of AdaGrad. We first introduce a novel stopping time technique from probabilistic theory to establish stability for the norm version of AdaGrad under milder conditions. We further derive two forms of asymptotic convergence: almost sure and mean-square. Furthermore, we demonstrate the near-optimal non-asymptotic convergence rate measured by the average-squared gradients in expectation, which is rarely explored and stronger than the existing high-probability results, under the mild assumptions. The techniques developed in this work are potentially independent of interest for future research on other adaptive stochastic algorithms.

assumption 2, equation, inequality, (15 more...)

arXiv.org Machine Learning

2409.05023

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

A Structural Complexity Analysis of Synchronous Dynamical Systems

Eiben, Eduard, Ganian, Robert, Hamm, Thekla, Korchemna, Viktoriia

arXiv.org Artificial IntelligenceDec-12-2023

Synchronous dynamic systems are well-established models that have been used to capture a range of phenomena in networks, including opinion diffusion, spread of disease and product adoption. We study the three most notable problems in synchronous dynamic systems: whether the system will transition to a target configuration from a starting configuration, whether the system will reach convergence from a starting configuration, and whether the system is guaranteed to converge from every possible starting configuration. While all three problems were known to be intractable in the classical sense, we initiate the study of their exact boundaries of tractability from the perspective of structural parameters of the network by making use of the more fine-grained parameterized complexity paradigm. As our first result, we consider treewidth - as the most prominent and ubiquitous structural parameter - and show that all three problems remain intractable even on instances of constant treewidth. We complement this negative finding with fixed-parameter algorithms for the former two problems parameterized by treedepth, a well-studied restriction of treewidth. While it is possible to rule out a similar algorithm for convergence guarantee under treedepth, we conclude with a fixed-parameter algorithm for this last problem when parameterized by treedepth and the maximum in-degree.

configuration, node, treewidth, (15 more...)

arXiv.org Artificial Intelligence

2312.08385

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
South America > Brazil > São Paulo (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games

Wei, Chen-Yu, Lee, Chung-Wei, Zhang, Mengxiao, Luo, Haipeng

arXiv.org Artificial IntelligenceFeb-8-2021

We study infinite-horizon discounted two-player zero-sum Markov games, and develop a decentralized algorithm that provably converges to the set of Nash equilibria under self-play. Our algorithm is based on running an Optimistic Gradient Descent Ascent algorithm on each state to learn the policies, with a critic that slowly learns the value of each state. To the best of our knowledge, this is the first algorithm in this setting that is simultaneously rational (converging to the opponent's best response when it uses a stationary policy), convergent (converging to the set of Nash equilibria under self-play), agnostic (no need to know the actions played by the opponent), symmetric (players taking symmetric roles in the algorithm), and enjoying a finite-time last-iterate convergence guarantee, all of which are desirable properties of decentralized algorithms.

algorithm, dist 2, onvergence, (16 more...)

arXiv.org Artificial Intelligence

2102.0454

Country:

North America > United States > California (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Add feedback

Robust Deep Learning as Optimal Control: Insights and Convergence Guarantees

Seidman, Jacob H., Fazlyab, Mahyar, Preciado, Victor M., Pappas, George J.

arXiv.org Machine LearningMay-1-2020

The fragility of deep neural networks to adversarially-chosen inputs has motivated the need to revisit deep learning algorithms. Including adversarial examples during training is a popular defense mechanism against adversarial attacks. This mechanism can be formulated as a min-max optimization problem, where the adversary seeks to maximize the loss function using an iterative first-order algorithm while the learner attempts to minimize it. However, finding adversarial examples in this way causes excessive computational overhead during training. By interpreting the min-max problem as an optimal control problem, it has recently been shown that one can exploit the compositional structure of neural networks in the optimization problem to improve the training time significantly. In this paper, we provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact oracle methods in optimization. Our analysis sheds light on how the hyperparameters of the algorithm affect the its stability and convergence. We support our insights with experiments on a robust classification problem.

adversary, algorithm, neural network, (16 more...)

arXiv.org Machine Learning

2005.00616

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback