AITopics | resetting

resetting

Resetting the Optimizer in Deep RL: An Empirical Study

Neural Information Processing Systems, Dec-27-2025, 01:24:22 GMT

We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process consists of solving a sequence of optimization problems in which the loss function changes at every iteration. The common approach to solving this sequence of problems is to employ modern variants of the stochastic gradient descent algorithm, such as Adam. These optimizers maintain their own internal parameters, such as estimates of the first-order and second-order moments of the gradient, and update them over time. Information obtained in previous iterations is therefore used to solve the optimization problem in the current iteration. We demonstrate that this can contaminate the moment estimates, because the optimization landscape can change arbitrarily from one iteration to the next. To hedge against this negative effect, a simple idea is to reset the internal parameters of the optimizer when starting a new iteration. We empirically investigate this resetting idea by employing various optimizers in conjunction with the Rainbow algorithm. We demonstrate that this simple modification significantly improves the performance of deep RL on the Atari benchmark.
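The resetting idea in this abstract can be illustrated with a minimal sketch. It assumes a hand-written scalar Adam rather than any particular library optimizer, and a toy sequence of quadratic losses standing in for the changing value-function losses; the names `Adam`, `reset`, and `fit_sequence` are hypothetical and are not taken from the paper.

```python
import math


class Adam:
    """Minimal Adam over a list of floats, with an explicit state reset."""

    def __init__(self, n_params, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.n_params = n_params
        self.reset()

    def reset(self):
        # Discard the first- and second-moment estimates and the step counter,
        # so the next step starts from a freshly initialised optimizer state.
        self.m = [0.0] * self.n_params
        self.v = [0.0] * self.n_params
        self.t = 0

    def step(self, params, grads):
        self.t += 1
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            # Bias-corrected moment estimates.
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            new_params.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return new_params


def fit_sequence(targets, steps_per_iteration=50, reset_each_iteration=True):
    """Minimise (w - target)^2 for each target in turn.

    Each target is an assumed stand-in for one "iteration" whose loss
    differs from the previous one.
    """
    opt = Adam(n_params=1, lr=0.1)
    w = [0.0]
    for target in targets:
        if reset_each_iteration:
            opt.reset()  # start the new optimization problem with clean moments
        for _ in range(steps_per_iteration):
            grad = [2.0 * (w[0] - target)]
            w = opt.step(w, grad)
    return w[0], opt
```

Calling `fit_sequence([1.0, -1.0], reset_each_iteration=False)` carries the moment estimates from the first target into the second, which is the behaviour the abstract describes as potentially contaminating; with `reset_each_iteration=True` each target starts from zeroed moments. The sketch makes no claim about which variant performs better on this toy problem.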

iteration, name change, resetting, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Reviews: Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

Neural Information Processing Systems, Jan-25-2025, 20:42:30 GMT

The paper proposes a method for improving the convergence rates of RL algorithms when one has access to a set of state-only expert demonstrations. The method works by modifying the given MDP so that the episode terminates whenever the agent leaves the set of states that had high probability under the expert demonstrations. The paper then proves an upper bound on the regret incurred by the proposed algorithm (as compared to the expert) in terms of the regret of the RL algorithm that is used to solve the modified MDP. The paper presents a set of experiments showing that the proposed mechanism can effectively strike a tradeoff between convergence rate and optimality. The clarity of the exposition is quite high, and the paper is easy to follow.
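The MDP modification described in this review can be sketched as an environment wrapper. This is an illustration under stated assumptions, not the paper's implementation: the `EmergencyStopEnv` wrapper, the `ChainEnv` toy environment, the `(state, reward, done)` step convention, and the visit-count threshold used to approximate "high probability under the expert demonstrations" are all hypothetical.

```python
class ChainEnv:
    """Hypothetical toy MDP: integer states on a line, actions -1 or +1."""

    def __init__(self, goal=3, horizon=20):
        self.goal = goal
        self.horizon = horizon
        self.state = 0
        self.t = 0

    def reset(self):
        self.state = 0
        self.t = 0
        return self.state

    def step(self, action):
        self.state += action
        self.t += 1
        done = self.state == self.goal or self.t >= self.horizon
        reward = 1.0 if self.state == self.goal else 0.0
        return self.state, reward, done


def support_from_demos(demos, min_count=1):
    """States visited at least `min_count` times in state-only demonstrations.

    The count threshold is an assumed proxy for "high probability".
    """
    counts = {}
    for trajectory in demos:
        for state in trajectory:
            counts[state] = counts.get(state, 0) + 1
    return {s for s, c in counts.items() if c >= min_count}


class EmergencyStopEnv:
    """Ends the episode as soon as the agent leaves the allowed state set."""

    def __init__(self, env, allowed_states, stop_reward=0.0):
        self.env = env
        self.allowed_states = set(allowed_states)
        self.stop_reward = stop_reward

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done = self.env.step(action)
        if state not in self.allowed_states:
            # Emergency stop: terminate early with the configured reward.
            return state, self.stop_reward, True
        return state, reward, done
```

With demonstrations such as `[[0, 1, 2, 3]]`, wrapping `ChainEnv()` in `EmergencyStopEnv` ends any episode that steps to a negative state, so a learner only explores the region the expert visited. Any standard RL algorithm could then be run on the wrapped environment.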

algorithm, emergency stop mechanism, expert demonstration, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Resetting the Optimizer in Deep RL: An Empirical Study

Neural Information Processing Systems, Jan-20-2025, 01:04:59 GMT

empirical study, iteration, optimizer, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)

Prodigy: An Expeditiously Adaptive Parameter-Free Learner

Konstantin Mishchenko, Aaron Defazio

arXiv.org Machine Learning, Oct-29-2023

We consider the problem of estimating the learning rate in adaptive methods, such as Adagrad and Adam. We describe two techniques, Prodigy and Resetting, to provably estimate the distance to the solution $D$, which is needed to set the learning rate optimally. Our techniques are modifications of the D-Adaptation method for learning-rate-free learning. Our methods improve upon the convergence rate of D-Adaptation by a factor of $O(\sqrt{\log(D/d_0)})$, where $d_0$ is the initial estimate of $D$. We test our methods on 12 common logistic-regression benchmark datasets, VGG11 and ResNet-50 training on CIFAR10, ViT training on ImageNet, LSTM training on IWSLT14, DLRM training on the Criteo dataset, VarNet on the Knee MRI dataset, as well as RoBERTa and GPT transformer training on BookWiki. Our experimental results show that our approaches consistently outperform D-Adaptation and reach test accuracy values close to those of hand-tuned Adam.
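Why the distance $D$ determines the optimal learning rate can be seen from the textbook bound for the subgradient method. The derivation below is that standard result, given as background under the assumptions of a convex, $G$-Lipschitz objective $f$ and a constant step size $\gamma$; it is not the paper's own analysis, and the symbols $G$, $n$, $\gamma$, $x_0$, $x_*$, and $\bar{x}_n$ are introduced here for illustration.

```latex
% Subgradient method x_{k+1} = x_k - \gamma g_k, with g_k \in \partial f(x_k),
% \|g_k\| \le G, D = \|x_0 - x_*\|, and averaged iterate \bar{x}_n.
\[
  f(\bar{x}_n) - f(x_*) \;\le\; \frac{D^2}{2\gamma n} + \frac{\gamma G^2}{2}.
\]
% Minimising the right-hand side over \gamma:
\[
  \gamma^\star = \frac{D}{G\sqrt{n}}
  \quad\Longrightarrow\quad
  f(\bar{x}_n) - f(x_*) \;\le\; \frac{D G}{2\sqrt{n}} + \frac{D G}{2\sqrt{n}}
  \;=\; \frac{D G}{\sqrt{n}}.
\]
```

The best step size therefore depends on $D$, which is unknown in practice; learning-rate-free methods replace it with an estimate that starts from an initial guess $d_0$.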

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Machine Learning

2306.06101

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving

Xujie Si, Yujia Li, Vinod Nair, Felix Gimeno

arXiv.org Artificial Intelligence, Dec-4-2019

We propose prioritized unit propagation with periodic resetting, a simple but surprisingly effective algorithm for solving random SAT instances that are meant to be hard. In particular, an evaluation on the Random Track of the 2017 and 2018 SAT competitions shows that a basic prototype of this simple idea already ranks second in both years. We share this observation in the hope that it helps the SAT community better understand the hardness of the random instances used in competitions and inspires other interesting ideas in SAT solving.
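For readers unfamiliar with the building block named in this abstract, plain unit propagation can be sketched as follows. The abstract does not say how literals are prioritized or when the periodic reset happens, so neither is modelled here; the function name `unit_propagate` and the DIMACS-style integer encoding of literals are assumptions made for this illustration.

```python
def unit_propagate(clauses, assignment=None):
    """Repeatedly assign the literals forced by unit clauses.

    clauses: list of clauses, each a list of non-zero ints
             (k means variable k is true, -k means it is false).
    Returns (assignment, conflict), where assignment maps variable -> bool
    and conflict is True if some clause has all of its literals falsified.
    """
    assignment = dict(assignment or {})
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, wanted = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == wanted:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                # Every literal in this clause is false under the assignment.
                return assignment, True
            if len(unassigned) == 1:
                # Unit clause: its single open literal must be made true.
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment, False
```

For example, `unit_propagate([[1], [-1, 2], [-2, 3]])` forces variables 1, 2, and 3 to true in turn, while `unit_propagate([[1], [-1]])` reports a conflict. A full solver would combine this step with a decision heuristic and, per the abstract, some form of periodic resetting.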

assignment, propagation, unit propagation, (14 more...)

arXiv.org Artificial Intelligence

1912.05906

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.60)
