AITopics | gradient lipschitz constant

Collaborating Authors

gradient lipschitz constant

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Entrapment Problem in Random Walk Decentralized Learning

Liu, Zonghong, Rouayheb, Salim El, Dwyer, Matthew

arXiv.org Artificial IntelligenceJul-30-2024

This paper explores decentralized learning in a graph-based setting, where data is distributed across nodes. We investigate a decentralized SGD algorithm that utilizes a random walk to update a global model based on local data. Our focus is on designing the transition probability matrix to speed up convergence. While importance sampling can enhance centralized learning, its decentralized counterpart, using the Metropolis-Hastings (MH) algorithm, can lead to the entrapment problem, where the random walk becomes stuck at certain nodes, slowing convergence. To address this, we propose the Metropolis-Hastings with L\'evy Jumps (MHLJ) algorithm, which incorporates random perturbations (jumps) to overcome entrapment. We theoretically establish the convergence rate and error gap of MHLJ and validate our findings through numerical experiments.

algorithm, random walk, transition probability, (12 more...)

arXiv.org Artificial Intelligence

2407.20611

Country:

North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > Maryland > Prince George's County > Adelphi (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Tune smarter not harder: A principled approach to tuning learning rates for shallow nets

Tholeti, Thulasi, Kalyani, Sheetal

arXiv.org Machine LearningMar-22-2020

Effective hyper-parameter tuning is essential to guarantee the performance that neural networks have come to be known for. In this work, a principled approach to choosing the learning rate is proposed for shallow feedforward neural networks. We associate the learning rate with the gradient Lipschitz constant of the objective to be minimized while training. An upper bound on the mentioned constant is derived and a search algorithm, which always results in non-divergent traces, is proposed to exploit the derived bound. It is shown through simulations that the proposed search method significantly outperforms the existing tuning methods such as Tree Parzen Estimators (TPE). The proposed method is applied to two different existing applications, namely, channel estimation in a wireless communication system and prediction of the exchange currency rates, and it is shown to pick better learning rates than the existing methods using the same or lesser compute power.

algorithm, gradient lipschitz constant, neural network, (14 more...)

arXiv.org Machine Learning

2003.09844

Country:

North America > United States > New York (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback