AITopics | Russin, Jacob

Collaborating Authors

Russin, Jacob

From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks

Russin, Jacob, McGrath, Sam Whitman, Williams, Danielle J., Elber-Dorozko, Lotem

arXiv.org Artificial Intelligence | May-23-2024

Compositionality has long been considered a key explanatory property underlying human intelligence: arbitrary concepts can be composed into novel complex combinations, permitting the acquisition of an open-ended, potentially infinite expressive capacity from finite learning experiences. Influential arguments have held that neural networks fail to explain this aspect of behavior, leading many to dismiss them as viable models of human cognition. Over the last decade, however, modern deep neural networks (DNNs), which share the same fundamental design principles as their predecessors, have come to dominate artificial intelligence, exhibiting the most advanced cognitive behaviors ever demonstrated in machines. In particular, large language models (LLMs), DNNs trained to predict the next word on a large corpus of text, have proven capable of sophisticated behaviors such as writing syntactically complex sentences without grammatical errors, producing cogent chains of reasoning, and even writing original computer programs -- all behaviors thought to require compositional processing. In this chapter, we survey recent empirical work from machine learning for a broad audience in philosophy, cognitive science, and neuroscience, situating recent breakthroughs within the broader context of philosophical arguments about compositionality. In particular, our review emphasizes two approaches to endowing neural networks with compositional generalization capabilities: (1) architectural inductive biases, and (2) metalearning, or learning to learn. We also present findings suggesting that LLM pretraining can be understood as a kind of metalearning, and can thereby equip DNNs with compositional generalization abilities in a similar way. We conclude by discussing the implications that these findings may have for the study of compositionality in human cognition and by suggesting avenues for future research.

large language model, machine learning, natural language

2405.15164

Country:

Europe (1.00)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (1.00)
Leisure & Entertainment (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multiple Realizability and the Rise of Deep Learning

McGrath, Sam Whitman, Russin, Jacob

arXiv.org Artificial Intelligence | May-21-2024

The multiple realizability thesis holds that psychological states may be implemented in a diversity of physical systems. The deep learning revolution seems to be bringing this possibility to life, offering the most plausible examples of man-made realizations of sophisticated cognitive functions to date. This paper explores the implications of deep learning models for the multiple realizability thesis. Among other things, it challenges the widely held view that multiple realizability entails that the study of the mind can and must be pursued independently of the study of its implementation in the brain or in artificial analogues. Although its central contribution is philosophical, the paper has substantial methodological upshots for contemporary cognitive science, suggesting that deep neural networks may play a crucial role in formulating and evaluating hypotheses about cognition, even if they are interpreted as implementation-level models. In the age of deep learning, multiple realizability possesses a renewed significance.

artificial intelligence, machine learning, realization

2405.13231

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Human Curriculum Effects Emerge with In-Context Learning in Neural Networks

Russin, Jacob, Pavlick, Ellie, Frank, Michael J.

arXiv.org Artificial Intelligence | Feb-13-2024

Human learning is sensitive to rule-like structure and the curriculum of examples used for training. In tasks governed by succinct rules, learning is more robust when related examples are blocked across trials, but in the absence of such rules, interleaving is more effective. To date, no neural model has simultaneously captured these seemingly contradictory effects. Here we show that this same tradeoff spontaneously emerges with "in-context learning" (ICL) both in neural networks trained with metalearning and in large language models (LLMs). ICL is the ability to learn new tasks "in context" - without weight changes - via an inner-loop algorithm implemented in activation dynamics. Experiments with pretrained LLMs and metalearning transformers show that ICL exhibits the blocking advantage demonstrated in humans on a task involving rule-like structure, and conversely, that concurrent in-weight learning reproduces the interleaving advantage observed in humans on tasks lacking such structure.

large language model, machine learning, natural language

2402.08674

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

The Relational Bottleneck as an Inductive Bias for Efficient Abstraction

Webb, Taylor W., Frankland, Steven M., Altabaa, Awni, Krishnamurthy, Kamesh, Campbell, Declan, Russin, Jacob, O'Reilly, Randall, Lafferty, John, Cohen, Jonathan D.

arXiv.org Artificial Intelligence | Jan-19-2024

A central challenge for cognitive science is to explain how abstract concepts are acquired from limited experience. This effort has often been framed in terms of a dichotomy between connectionist and symbolic cognitive models. Here, we highlight a recently emerging line of work that suggests a novel reconciliation of these approaches, by exploiting an inductive bias that we term the relational bottleneck. We review a family of models that employ this approach to induce abstractions in a data-efficient manner, emphasizing their potential as candidate models for the acquisition of abstract concepts in the human mind and brain.

large language model, machine learning, natural language

2309.06629

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.67)

Compositional Processing Emerges in Neural Networks Solving Math Problems

Russin, Jacob, Fernandez, Roland, Palangi, Hamid, Rosen, Eric, Jojic, Nebojsa, Smolensky, Paul, Gao, Jianfeng

arXiv.org Artificial Intelligence | May-19-2021

A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations. We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings (e.g., the quantities corresponding to numerals) should be composed according to structured rules (e.g., order of operations). Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.

neural network, survey article, vector

2105.08961

Country:

Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
