AITopics | deep relative trust

f4b31bee138ff5f7b84ce1575a738f95-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 02:52:34 GMT

international conference, neural network, perturbation, (11 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

9a32ef65c42085537062753ec435750f-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-9-2026, 12:25:13 GMT

deep relative trust, madam, neural network, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.76)

9a32ef65c42085537062753ec435750f-AuthorFeedback.pdf

Neural Information Processing Systems, Aug-22-2025, 00:32:07 GMT

deep relative trust, madam, neural network, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.76)

On the distance between two neural networks and the stability of learning

Jeremy Bernstein, Caltech

Neural Information Processing Systems, Aug-17-2025, 06:57:56 GMT

The Python code used in this paper is available at https://github.com/jxbz/fromage.

artificial intelligence, machine learning, neural network, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

9a32ef65c42085537062753ec435750f-Supplemental.pdf

Neural Information Processing Systems, Aug-15-2025, 08:30:12 GMT

epoch, initial learning rate, learning rate, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

9a32ef65c42085537062753ec435750f-Paper.pdf

Neural Information Processing Systems, Aug-15-2025, 08:30:04 GMT

logarithmic number system, madam, neural network, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning compositional functions via multiplicative weight updates

Bernstein, Jeremy, Zhao, Jiawei, Meister, Markus, Liu, Ming-Yu, Anandkumar, Anima, Yue, Yisong

arXiv.org Machine Learning, Jun-25-2020

Compositionality is a basic structural feature of both biological and artificial neural networks. Learning compositional functions via gradient descent incurs well-known problems such as vanishing and exploding gradients, making careful learning rate tuning essential for real-world applications. This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions. Based on this lemma, we derive Madam, a multiplicative version of the Adam optimiser, and show that it can train state-of-the-art neural network architectures without learning rate tuning. We further show that Madam is easily adapted to train natively compressed neural networks by representing their weights in a logarithmic number system. We conclude by drawing connections between multiplicative weight updates and recent findings about synapses in biology.
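
The multiplicative update described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or its code: the function name `multiplicative_step`, the hyperparameters `lr` and `beta`, and the Adam-style second-moment normalisation with bias correction are all assumptions made for illustration. The point it shows is that each weight is rescaled by an exponential factor rather than shifted by an additive step, so a weight keeps its sign and its relative change is controlled by `lr`.

```python
import math


def multiplicative_step(weights, grads, moments, step, lr=0.01, beta=0.999):
    """One multiplicative update over flat lists of weights and gradients.

    Hypothetical sketch: `moments` holds a running average of squared
    gradients (Adam-style), and `step` is the 1-based iteration count
    used for bias correction. None of these names come from the paper.
    """
    correction = 1.0 - beta ** step
    new_weights, new_moments = [], []
    for w, g, v in zip(weights, grads, moments):
        # Running average of the squared gradient.
        v = beta * v + (1.0 - beta) * g * g
        denom = math.sqrt(v / correction)
        # Normalised gradient; treated as zero when no gradient has been seen.
        g_hat = g / denom if denom > 0.0 else 0.0
        sign_w = math.copysign(1.0, w) if w != 0.0 else 0.0
        # Multiplicative update: the weight is scaled, never shifted,
        # so its sign is preserved and a zero weight stays at zero.
        new_weights.append(w * math.exp(-lr * sign_w * g_hat))
        new_moments.append(v)
    return new_weights, new_moments
```

With a positive gradient, a positive weight shrinks towards zero and a negative weight grows in magnitude, so both move in the direction that reduces the loss to first order. One consequence of this form is that a weight initialised at exactly zero never moves, which an actual implementation would have to account for.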

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2006.14560

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

On the distance between two neural networks and the stability of learning

Bernstein, Jeremy, Vahdat, Arash, Yue, Yisong, Liu, Ming-Yu

arXiv.org Machine Learning, Feb-9-2020

How far apart are two neural networks? This is a foundational question in their theory. We derive a simple and tractable bound that relates distance in function space to distance in parameter space for a broad class of nonlinear compositional functions. The bound distills a clear dependence on depth of the composition. The theory is of practical relevance since it establishes a trust region for first-order optimisation. In turn, this suggests an optimiser that we call Frobenius matched gradient descent, or Fromage. Fromage involves a principled form of gradient rescaling and enjoys guarantees on stability of both the spectra and Frobenius norms of the weights. We find that the new algorithm increases the depth at which a multilayer perceptron may be trained compared with Adam and SGD, and is competitive with Adam for training generative adversarial networks. We further verify that Fromage scales up to a language transformer with over $10^8$ parameters. Please find code and reproducibility instructions at: https://github.com/jxbz/fromage.
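
The gradient rescaling mentioned in this abstract can be sketched in a few lines of Python. This is a hedged illustration written for this listing, not the code in the linked repository: the function names, the flat-list representation of each layer, and the handling of a zero gradient are assumptions. The sketch rescales each layer's gradient so that the step has Frobenius norm `lr` times that of the layer's weights, then divides by `sqrt(1 + lr**2)` to keep the weight norm from growing.

```python
import math


def frobenius_norm(values):
    """Frobenius norm of a layer stored as a flat list of its entries."""
    return math.sqrt(sum(v * v for v in values))


def fromage_step(layers, grads, lr=0.01):
    """One Fromage-style update; each layer is a flat list of floats.

    Hypothetical sketch: the names and data layout are illustrative.
    """
    shrink = 1.0 / math.sqrt(1.0 + lr * lr)
    new_layers = []
    for w, g in zip(layers, grads):
        w_norm = frobenius_norm(w)
        g_norm = frobenius_norm(g)
        if g_norm == 0.0:
            # Assumed behaviour: leave a layer untouched when it has no gradient.
            new_layers.append(list(w))
            continue
        # Match the step's Frobenius norm to lr times the weight norm.
        ratio = lr * w_norm / g_norm
        new_layers.append([(wi - ratio * gi) * shrink for wi, gi in zip(w, g)])
    return new_layers
```

When the gradient is orthogonal to the weights, the unscaled step inflates the squared norm by exactly a factor of `1 + lr**2`, so the final division restores the original norm. The step size is relative to the weights, so multiplying a layer's gradient by a constant does not change the update.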

matrix, multilayer perceptron, neural network, (15 more...)

arXiv.org Machine Learning

2002.03432

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.58)
