AITopics | proofof

Collaborating Authors

proofof

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Zeroth-Order Optimization at the Edge of Stability

Song, Minhak, Zhang, Liang, Li, Bingcong, He, Niao, Muehlebach, Michael, Oh, Sewoong

arXiv.org Machine LearningApr-17-2026

Zeroth-order (ZO) methods are widely used when gradients are unavailable or prohibitively expensive, including black-box learning and memory-efficient fine-tuning of large models, yet their optimization dynamics in deep learning remain underexplored. In this work, we provide an explicit step size condition that exactly captures the (mean-square) linear stability of a family of ZO methods based on the standard two-point estimator. Our characterization reveals a sharp contrast with first-order (FO) methods: whereas FO stability is governed solely by the largest Hessian eigenvalue, mean-square stability of ZO methods depends on the entire Hessian spectrum. Since computing the full Hessian spectrum is infeasible in practical neural network training, we further derive tractable stability bounds that depend only on the largest eigenvalue and the Hessian trace. Empirically, we find that full-batch ZO methods operate at the edge of stability: ZO-GD, ZO-GDM, and ZO-Adam consistently stabilize near the predicted stability boundary across a range of deep learning training problems. Our results highlight an implicit regularization effect specific to ZO methods, where large step sizes primarily regularize the Hessian trace, whereas in FO methods they regularize the top eigenvalue.

artificial intelligence, machine learning, stability, (18 more...)

arXiv.org Machine Learning

2604.14669

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

24ec8468b67314c2013d215b77034476-Supplemental.pdf

Neural Information Processing SystemsFeb-7-2026, 21:52:34 GMT

Deep neural networks have been successfully trained with simple gradient-based methods, despite the inherent non-convexity of the objectivefunction.

artificial intelligence, kwe, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom (0.04)
North America > United States > Illinois (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Preferential and Preferential-discriminative Consequence relations

Ben-Naim, Jonathan

arXiv.org Artificial IntelligenceDec-1-2009

The present paper investigates consequence relations that are both non-monotonic and paraconsistent. More precisely, we put the focus on preferential consequence relations, i.e. those relations that can be defined by a binary preference relation on states labelled by valuations. We worked with a general notion of valuation that covers e.g. the classical valuations as well as certain kinds of many-valued valuations. In the many-valued cases, preferential consequence relations are paraconsistant (in addition to be non-monotonic), i.e. they are capable of drawing reasonable conclusions which contain contradictions. The first purpose of this paper is to provide in our general framework syntactic characterizations of several families of preferential relations. The second and main purpose is to provide, again in our general framework, characterizations of several families of preferential discriminative consequence relations. They are defined exactly as the plain version, but any conclusion such that its negation is also a conclusion is rejected (these relations bring something new essentially in the many-valued cases).

artificial intelligence, inaddition, proofof, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/logcom/exi013

cs/0506030

Country:

Europe > France (0.40)
North America > United States (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (0.67)

Add feedback