AITopics | adabelief

eddea82ad2755b24c4e168c5fc2ebd40-Paper.pdf

Neural Information Processing Systems | Apr-27-2026, 16:54:32 GMT

acprop, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

8f9d459c19b59b5400ce396e0f8c23e0-Paper-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 20:41:39 GMT

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(13 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

eddea82ad2755b24c4e168c5fc2ebd40-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 19:07:15 GMT

acprop, arxiv preprint arxiv, optimizer, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

d9d4f495e875a2e075a1a4a6e1b9770f-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 16:31:29 GMT

adabelief, arxiv preprint arxiv, optimizer, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

8f9d459c19b59b5400ce396e0f8c23e0-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 23:29:44 GMT

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(13 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients

Juntang Zhuang, Tommy Tang

Neural Information Processing Systems | Aug-16-2025, 18:02:35 GMT

Adam) and accelerated schemes (e.g.

adabelief, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
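
The idea named in the title of this entry, scaling each step by how far the observed gradient deviates from its running average, might be sketched as follows. This is a minimal pure-Python illustration of the commonly described AdaBelief update, not the authors' reference implementation; the function name `adabelief_step`, the list-of-floats parameter layout, and the default hyperparameters are assumptions made for the example. The only change relative to an Adam-style update is that the second-moment accumulator tracks `(g - m)^2` instead of `g^2`.

```python
import math


def adabelief_step(theta, grad, m, s, t,
                   lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief-style update on flat lists of floats.

    theta, grad, m, s: lists of equal length (parameters, gradients,
    first-moment state, "belief" second-moment state).
    t: 1-based step counter used for bias correction.
    Returns the updated (theta, m, s) as new lists.
    """
    new_theta, new_m, new_s = [], [], []
    for p, g, m_i, s_i in zip(theta, grad, m, s):
        # Exponential moving average of the gradient (the "prediction").
        m_i = beta1 * m_i + (1.0 - beta1) * g
        # EMA of the squared deviation of the gradient from that prediction.
        s_i = beta2 * s_i + (1.0 - beta2) * (g - m_i) ** 2 + eps
        # Bias correction, as in Adam.
        m_hat = m_i / (1.0 - beta1 ** t)
        s_hat = s_i / (1.0 - beta2 ** t)
        # Small deviation (high "belief") gives a larger step, and vice versa.
        new_theta.append(p - lr * m_hat / (math.sqrt(s_hat) + eps))
        new_m.append(m_i)
        new_s.append(s_i)
    return new_theta, new_m, new_s


if __name__ == "__main__":
    # Hypothetical usage: a few steps on f(x) = x^2, whose gradient is 2x.
    x, m, s = [3.0], [0.0], [0.0]
    for step in range(1, 6):
        x, m, s = adabelief_step(x, [2.0 * x[0]], m, s, step, lr=0.1)
        print(step, x[0])
```

Returning new lists rather than mutating in place keeps the sketch easy to test; a practical optimizer would update tensors in place.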

Response to R1

Neural Information Processing Systems | Aug-16-2025, 18:02:23 GMT

We thank all reviewers for their comments. We are glad to see our work described as "promising" (R3), "effective" (R6). We address their concerns below. Writing: We'll rephrase remarks, e.g. "Examples give hints to local behavior of optimizers in deep learning". Q2.a Assumptions: We list assumptions (1)-(3) as below: Numerically, we need a small β (e.g. 0.3) or a large t. We also tried the default β with large t; the results are similar to Fig. 3(d).

artificial intelligence, assumption, machine learning, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis

Guan, Lei

arXiv.org Artificial Intelligence | Dec-24-2023

This paper proposes an efficient optimizer called AdaPlus, which integrates Nesterov momentum and precise stepsize adjustment on the basis of AdamW. AdaPlus combines the advantages of AdamW, Nadam, and AdaBelief and, in particular, does not introduce any extra hyper-parameters. We perform extensive experimental evaluations on three machine learning tasks to validate the effectiveness of AdaPlus. The experimental results show that AdaPlus (i) among all the evaluated adaptive methods, performs most comparably to (even slightly better than) SGD with momentum on image classification tasks and (ii) outperforms other state-of-the-art optimizers on language modeling tasks and shows high stability when training GANs. The experiment code of AdaPlus will be accessible at: https://github.com/guanleics/AdaPlus.

adaplus, adaptive method, optimizer, (14 more...)

arXiv.org Artificial Intelligence

2309.01966

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

AdamL: A fast adaptive gradient method incorporating loss function

Xia, Lu, Massei, Stefano

arXiv.org Machine Learning | Dec-23-2023

Adaptive first-order optimizers are fundamental tools in deep learning, although they may suffer from poor generalization due to nonuniform gradient scaling. In this work, we propose AdamL, a novel variant of the Adam optimizer that takes into account loss function information to attain better generalization results. We provide sufficient conditions that, together with the Polyak-Łojasiewicz inequality, ensure the linear convergence of AdamL. As a byproduct of our analysis, we prove similar convergence properties for the EAdam and AdaBelief optimizers. Experimental results on benchmark functions show that AdamL typically achieves either the fastest convergence or the lowest objective function values when compared to Adam, EAdam, and AdaBelief. This superior performance is confirmed on deep learning tasks such as training convolutional neural networks, training generative adversarial networks using vanilla convolutional neural networks, and training long short-term memory networks. Finally, in the case of vanilla convolutional neural networks, AdamL stands out from the other Adam variants and does not require manual adjustment of the learning rate during the later stage of training.

artificial intelligence, machine learning, optimizer, (16 more...)

arXiv.org Machine Learning

2312.15295

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy (0.04)

Genre: Research Report > New Finding (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix

Yue, Yun, Ye, Zhiling, Jiang, Jiadi, Liu, Yongchao, Zhang, Ke

arXiv.org Artificial Intelligence | Dec-4-2023

Adaptive optimizers, such as Adam, have achieved remarkable success in deep learning. A key component of these optimizers is the so-called preconditioning matrix, which provides enhanced gradient information and regulates the step size of each gradient direction. In this paper, we propose a novel approach to designing the preconditioning matrix by utilizing the gradient difference between two successive steps as the diagonal elements. These diagonal elements are closely related to the Hessian and can be perceived as an approximation of the inner product between the Hessian row vectors and the difference of the adjacent parameter vectors. Additionally, we introduce an auto-switching function that enables the preconditioning matrix to switch dynamically between Stochastic Gradient Descent (SGD) and the adaptive optimizer. Based on these two techniques, we develop a new optimizer named AGD that enhances the generalization performance. We evaluate AGD on public datasets of Natural Language Processing (NLP), Computer Vision (CV), and Recommendation Systems (RecSys). Our experimental results demonstrate that AGD outperforms the state-of-the-art (SOTA) optimizers, achieving highly competitive or significantly better predictive performance. Furthermore, we analyze how AGD is able to switch automatically between SGD and the adaptive optimizer and its actual effects in various scenarios.
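
The two ingredients described in this abstract, a diagonal preconditioner built from the difference between successive gradients and a switch between SGD-like and adaptive behavior, could be sketched roughly as below. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the exact formulas, so the use of the raw gradient difference, the `max(sqrt(b_hat), delta)` switching rule, the threshold name `delta`, and the function name `agd_like_step` are all hypothetical choices for the example.

```python
import math


def agd_like_step(theta, grad, prev_grad, m, b, t,
                  lr=1e-3, beta1=0.9, beta2=0.999, delta=1e-5):
    """One step of an AGD-inspired update on flat lists of floats.

    prev_grad is None on the first step, in which case the gradient
    itself stands in for the gradient difference (an assumption).
    Returns the updated (theta, m, b) as new lists.
    """
    new_theta, new_m, new_b = [], [], []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # Momentum (EMA of gradients), as in Adam.
        m_i = beta1 * m[i] + (1.0 - beta1) * g
        # Difference between two successive gradients (assumed raw form).
        diff = g if prev_grad is None else g - prev_grad[i]
        # Diagonal preconditioner: EMA of the squared gradient difference.
        b_i = beta2 * b[i] + (1.0 - beta2) * diff * diff
        m_hat = m_i / (1.0 - beta1 ** t)
        b_hat = b_i / (1.0 - beta2 ** t)
        # Assumed switching rule: if the preconditioner falls below delta,
        # the denominator is the constant delta, so the step reduces to
        # SGD with momentum (effective rate lr / delta); otherwise the
        # step is adaptive.
        denom = max(math.sqrt(b_hat), delta)
        new_theta.append(p - lr * m_hat / denom)
        new_m.append(m_i)
        new_b.append(b_i)
    return new_theta, new_m, new_b


if __name__ == "__main__":
    # Hypothetical usage on f(x) = x^2 (gradient 2x).
    x, m, b, prev = [1.0], [0.0], [0.0], None
    for step in range(1, 6):
        g = [2.0 * x[0]]
        x, m, b = agd_like_step(x, g, prev, m, b, step, lr=0.1)
        prev = g
        print(step, x[0])
```

In this sketch a larger `delta` pushes more coordinates into the SGD-like regime, while a very small `delta` leaves the update almost always adaptive.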

agd, equation, optimizer, (17 more...)

arXiv.org Artificial Intelligence

2312.01658

Country: