AITopics | Jürgen Schmidhuber

Collaborating Authors

Jürgen Schmidhuber

3f9e3767ef3b10a0de4c256d7ef9805d-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 14:23:25 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Country:

Europe (1.00)
North America > United States > California > Los Angeles County (0.28)

Industry:

Education (0.46)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

d87ca511e2a8593c8039ef732f5bffed-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 10:16:29 GMT

We prove that under mild conditions there is no vanishing or exploding gradient in training SBO-RNN.

artificial intelligence, deep learning, machine learning, (20 more...)

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

3f9e3767ef3b10a0de4c256d7ef9805d-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 08:44:53 GMT

linear transformer, preprint arxiv, transformer, (11 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(18 more...)

Industry:

Education (0.46)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Language Modeling with Recurrent Highway Hypernetworks

Joseph Suarez

Neural Information Processing Systems | Nov-21-2025, 13:58:05 GMT

Neural Information Processing Systems http://nips.cc/

hypernetwork, machine learning, natural language, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Training Very Deep Networks

Rupesh K. Srivastava, Klaus Greff, Jürgen Schmidhuber

Neural Information Processing Systems | Oct-2-2025, 01:23:18 GMT

Neural Information Processing Systems http://nips.cc/

highway network, neural network, transform gate, (16 more...)

Country: Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

332b4fbe322e11a71fa39d91c664d8fa-Paper-Conference.pdf

Neural Information Processing Systems | Aug-21-2025, 23:58:10 GMT

international conference, learning, plasticity, (13 more...)

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.94)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

cba76ef96c4cd625631ab4d33285b045-Paper-Conference.pdf

Neural Information Processing Systems | Aug-18-2025, 23:27:03 GMT

artificial intelligence, machine learning, representation, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph

Irie, Kazuki

arXiv.org Artificial Intelligence | Dec-31-2024

Do autoregressive Transformer language models require explicit positional encodings (PEs)? The answer is "no" as long as they have more than one layer -- they can distinguish sequences with permuted tokens without requiring explicit PEs. This property has been known since early efforts (those contemporary with GPT-2) adopting the Transformer for language modeling. However, this result does not appear to have been well disseminated and was even rediscovered recently. This may be partially due to the sudden growth of the language modeling community after the advent of GPT-2, but perhaps also due to the lack of a clear explanation in prior publications, despite the result being commonly understood by practitioners in the past. Here we review this long-forgotten explanation of why explicit PEs are nonessential for multi-layer autoregressive Transformers (in contrast, one-layer models require PEs to discern order information of their input tokens). We also review the origin of this result, and hope to re-establish it as common knowledge.
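
The one-layer versus multi-layer claim in this abstract can be checked numerically. The sketch below is not taken from the paper: it assumes a toy single-head causal self-attention layer with a residual connection, random weights, no positional encoding, and hypothetical helper names (`causal_attention`, `last_output`). It compares the output at the final position for a sequence and for a copy with its first two tokens swapped.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def causal_attention(xs, Wq, Wk, Wv):
    """One causal self-attention layer with a residual connection, no PE."""
    qs = [matvec(Wq, x) for x in xs]
    ks = [matvec(Wk, x) for x in xs]
    vs = [matvec(Wv, x) for x in xs]
    out = []
    for t in range(len(xs)):
        # Position t may only attend to positions 0..t (causal mask).
        weights = softmax([dot(qs[t], ks[i]) for i in range(t + 1)])
        ctx = [sum(w * vs[i][d] for i, w in enumerate(weights))
               for d in range(len(xs[0]))]
        out.append([x + c for x, c in zip(xs[t], ctx)])
    return out


def random_matrix(rng, dim):
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the run is repeatable
DIM = 4
# Toy token embeddings and layer weights; all values are arbitrary assumptions.
embed = {tok: [rng.uniform(-1.0, 1.0) for _ in range(DIM)] for tok in "abc"}
layer1 = tuple(random_matrix(rng, DIM) for _ in range(3))
layer2 = tuple(random_matrix(rng, DIM) for _ in range(3))


def last_output(tokens, layers):
    """Run the token sequence through the layers; return the last position."""
    hs = [embed[t] for t in tokens]
    for Wq, Wk, Wv in layers:
        hs = causal_attention(hs, Wq, Wk, Wv)
    return hs[-1]


def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


seq, permuted = "abc", "bac"  # same final token, prefix order swapped
one_layer_gap = max_abs_diff(last_output(seq, [layer1]),
                             last_output(permuted, [layer1]))
two_layer_gap = max_abs_diff(last_output(seq, [layer1, layer2]),
                             last_output(permuted, [layer1, layer2]))
print("one layer :", one_layer_gap)
print("two layers:", two_layer_gap)
```

In this toy setting the one-layer gap should vanish up to floating-point rounding, because the last position sees its prefix only as an unordered set. With two layers, the first layer's outputs already depend on how much of the prefix each position could see, so the second layer receives order-dependent inputs and the gap becomes nonzero.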

language model, proc, transformer, (12 more...)

2501.00659

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(19 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural networks that overcome classic challenges through practice

Irie, Kazuki, Lake, Brenden M.

arXiv.org Artificial Intelligence | Dec-15-2024

Since the earliest proposals for neural network models of the mind and brain, critics have pointed out key weaknesses in these models compared to human cognitive abilities. Here we review recent work that uses metalearning to overcome several classic challenges by addressing the Problem of Incentive and Practice -- that is, providing machines with both incentives to improve specific skills and opportunities to practice those skills. This explicit optimization contrasts with more conventional approaches that hope the desired behavior will emerge through optimizing related but different objectives. We review applications of this principle to addressing four classic challenges for neural networks: systematic generalization, catastrophic forgetting, few-shot learning and multi-step reasoning. We also discuss the prospects for understanding aspects of human development through this framework, and whether natural environments provide the right incentives and practice for learning how to make challenging generalizations.

artificial intelligence, machine learning, neural network, (14 more...)

2410.10596

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > New York > New York County > New York City (0.14)
Europe > Austria > Vienna (0.14)
(23 more...)

Genre: Overview (0.86)

Industry:

Education > Educational Setting (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Boundless Socratic Learning with Language Games

Schaul, Tom

arXiv.org Artificial Intelligence | Nov-25-2024

An agent trained within a closed system can master any desired capability, as long as the following three conditions hold: (a) it receives sufficiently informative and aligned feedback, (b) its coverage of experience/data is broad enough, and (c) it has sufficient capacity and resource. In this position paper, we justify these conditions, and consider what limitations arise from (a) and (b) in closed systems, when assuming that (c) is not a bottleneck. Considering the special case of agents with matching input and output spaces (namely, language), we argue that such pure recursive self-improvement, dubbed "Socratic learning", can boost performance vastly beyond what is present in its initial data or knowledge, and is only limited by time, as well as gradual misalignment concerns. Furthermore, we propose a constructive framework to implement it, based on the notion of language games.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

2411.16905

Country: