
Collaborating Authors: Mali, Ankur


On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks

arXiv.org Artificial Intelligence

Artificial neural networks (ANNs) with recurrence and self-attention have been shown to be Turing-complete (TC). However, existing work has shown that these ANNs require multiple turns or unbounded computation time, even with unbounded precision in weights, in order to recognize TC grammars. Moreover, under constraints such as fixed- or bounded-precision neurons and bounded time, ANNs without memory have been shown to struggle to recognize even context-free languages. In this work, we extend the theoretical foundation for the $2^{nd}$-order recurrent network ($2^{nd}$ RNN) and prove that there exists a class of $2^{nd}$ RNNs that is Turing-complete with bounded time. This model can directly encode a transition table into its recurrent weights, enabling bounded-time computation, and it is interpretable by design. We also demonstrate that $2^{nd}$-order RNNs, without memory, under bounded weight and time constraints, outperform modern-day models such as vanilla RNNs and gated recurrent units at recognizing regular grammars. We provide an upper bound and a stability analysis on the maximum number of neurons required by $2^{nd}$-order RNNs to recognize any class of regular grammar. Extensive experiments on the Tomita grammars support our findings, demonstrating the importance of tensor connections in crafting computationally efficient RNNs. Finally, we show that $2^{nd}$-order RNNs are also interpretable by extraction and can extract state machines with higher success rates than first-order RNNs. Our results extend the theoretical foundations of RNNs and offer promising avenues for future explainable AI research.
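To make the tensor (second-order) recurrence concrete, the following is a minimal NumPy sketch of how a DFA transition table can be written directly into a third-order weight tensor so that the network simulates the automaton over one-hot states and symbols. The gain `H`, the toy parity automaton, and all names here are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch (not the paper's exact construction): a second-order RNN whose
# 3rd-order weight tensor W[j, i, k] encodes a DFA transition delta(state i, symbol k) -> state j.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def build_second_order_weights(delta, n_states, n_symbols, H=10.0):
    """W[j, i, k] = +H if delta maps (i, k) to j, else -H (H is a hypothetical gain)."""
    W = -H * np.ones((n_states, n_states, n_symbols))
    for (i, k), j in delta.items():
        W[j, i, k] = H
    return W

def run(W, start_state, symbols, n_states, n_symbols):
    h = np.eye(n_states)[start_state]            # one-hot current state
    for k in symbols:
        x = np.eye(n_symbols)[k]                 # one-hot input symbol
        # second-order (tensor) recurrence: preact_j = sum_{i,k} W[j, i, k] * h_i * x_k
        h = sigmoid(np.einsum('jik,i,k->j', W, h, x))
    return int(np.argmax(h))                     # read off the (near-)one-hot state

# Toy example: parity of 1s over the alphabet {0, 1}; state 0 = "even so far".
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W = build_second_order_weights(delta, n_states=2, n_symbols=2)
print(run(W, start_state=0, symbols=[1, 1, 0], n_states=2, n_symbols=2))  # -> 0 (even number of 1s)
```

Because each weight corresponds to exactly one transition rule, reading the rules back out of the trained tensor (extraction) is direct, which is the sense in which such networks are interpretable by design.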


Brain-Inspired Computational Intelligence via Predictive Coding

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.
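As a concrete, heavily simplified illustration of the iterative inference and local learning that PC describes, the sketch below implements a small Rao-and-Ballard-style hierarchy with linear top-down predictions. The layer sizes, step sizes, and single relaxation loop are assumptions for illustration only, not any specific model from the survey.

```python
# A minimal sketch of hierarchical predictive coding: clamp data at the bottom,
# relax latent states to reduce per-layer prediction errors, then apply local weight updates.
import numpy as np

rng = np.random.default_rng(0)
dims = [8, 6, 4]                       # layer 0 = observed data, layers 1..2 = latents
W = [0.1 * rng.standard_normal((dims[l], dims[l + 1])) for l in range(len(dims) - 1)]

def infer_and_learn(x, W, T=50, lr_z=0.1, lr_w=0.01):
    z = [x] + [np.zeros(d) for d in dims[1:]]            # clamp data at the bottom layer
    for _ in range(T):                                   # iterative (relaxation) inference
        mu = [W[l] @ z[l + 1] for l in range(len(W))]    # top-down predictions
        e = [z[l] - mu[l] for l in range(len(W))]        # prediction errors, one per predicted layer
        for l in range(1, len(z)):                       # update latent states only
            top_err = e[l] if l < len(e) else 0.0        # error on this layer's own state (if any)
            z[l] += lr_z * (W[l - 1].T @ e[l - 1] - top_err)
    for l in range(len(W)):                              # local, Hebbian-like weight update
        W[l] += lr_w * np.outer(e[l], z[l + 1])
    return z, e

x = rng.standard_normal(dims[0])
z, e = infer_and_learn(x, W)
print([np.linalg.norm(err) for err in e])                # prediction-error norms after one step
```

Note that every update uses only quantities available at the layer it modifies, which is the locality property that makes PC attractive as a brain-inspired alternative to backprop.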


The Predictive Forward-Forward Algorithm

arXiv.org Artificial Intelligence

The algorithm known as backpropagation of errors [59, 32], or "backprop" for short, has long faced criticism concerning its neurobiological plausibility [10, 14, 56, 35, 15]. Despite powering the tremendous progress and success behind deep learning and its ever-growing myriad of promising applications [57, 12], it is improbable that backprop is a viable model of learning in the brain, such as in cortical regions. Notably, there are both practical and biophysical issues [15, 35], and, among these, there is a lack of evidence that: 1) neural activities are explicitly stored to be used later for synaptic adjustment, 2) error derivatives are backpropagated along a global feedback pathway to generate teaching signals, 3) error signals move back along the same neural pathways used to forward propagate information, and 4) inference and learning are locked into a largely sequential (rather than massively parallel) schedule. Furthermore, when processing temporal data, it is certainly not the case that the neural circuitry of the brain is unfolded backward through time to adjust synapses [42] (as in backprop through time). Recently, there has been growing interest in brain-inspired computing, which focuses on developing algorithms and computational models that attempt to circumvent or resolve critical issues such as those highlighted above. Among the most powerful and promising of these is predictive coding (PC) [18, 48, 13, 4, 51, 41], and among the most recent is the forward-forward (FF) algorithm [19]. These alternatives offer different means of conducting credit assignment with performance similar to backprop, yet are more plausibly consistent with how real biological neurons learn (see Figure 1 for a graphical depiction and comparison of the respective credit assignment setups). This paper proposes a novel model and learning process, the predictive forward-forward (PFF) process, which generalizes and combines FF and PC into a robust stochastic neural system that simultaneously learns a representation and a generative model in a biologically plausible fashion. Like the FF algorithm, the PFF procedure offers a promising model of biological neural circuits, a potential candidate system for low-power analog and neuromorphic hardware, and a backprop alternative worthy of future investigation and study.
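For context, the layer-local "goodness" objective of FF, which PFF builds upon, can be sketched as follows. The layer sizes, threshold, learning rate, and synthetic positive/negative data below are illustrative assumptions; this is not the PFF model itself.

```python
# A minimal sketch of forward-forward (FF) local learning: each layer has its own
# goodness objective on positive vs. negative samples, with no backpropagated error.
import numpy as np

rng = np.random.default_rng(0)

class FFLayer:
    def __init__(self, n_in, n_out, theta=2.0, lr=0.03):
        self.W = 0.1 * rng.standard_normal((n_out, n_in))
        self.theta, self.lr = theta, lr

    def forward(self, x):
        return np.maximum(0.0, self.W @ x)                   # ReLU activities

    def local_update(self, x, positive):
        h = self.forward(x)
        goodness = np.sum(h ** 2)                             # layer-local goodness
        p = 1.0 / (1.0 + np.exp(-(goodness - self.theta)))    # P(sample is "positive")
        sign = (1.0 - p) if positive else -p                  # push goodness up or down
        # dG/dW = 2 * h * x^T (inactive ReLU units contribute zero); purely local update
        self.W += self.lr * sign * 2.0 * np.outer(h, x)
        # pass forward length-normalized activity so later layers cannot rely on raw magnitude
        return h / (np.linalg.norm(h) + 1e-8)

layers = [FFLayer(16, 32), FFLayer(32, 32)]
for step in range(100):
    x_pos = rng.standard_normal(16) + 1.0                     # stand-in "positive" data
    x_neg = rng.standard_normal(16) - 1.0                     # stand-in "negative" data
    for x, is_pos in [(x_pos, True), (x_neg, False)]:
        for layer in layers:
            x = layer.local_update(x, positive=is_pos)
```

Each layer's update depends only on its own input and output, which is why FF (and, by extension, PFF) avoids the global feedback pathway and sequential backward sweep listed among backprop's issues above.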


Like a bilingual baby: The advantage of visually grounding a bilingual language model

arXiv.org Artificial Intelligence

Unlike most neural language models, humans learn language in a rich, multi-sensory and, often, multilingual environment. Current language models typically fail to fully capture the complexities of multilingual language use. We train an LSTM language model on images and captions in English and Spanish from MS-COCO-ES. We find that visual grounding improves the model's understanding of semantic similarity both within and across languages and reduces perplexity. However, we find no significant advantage of visual grounding for abstract words. Our results provide additional evidence of the advantages of visually grounded language models and point to the need for more naturalistic language data from multilingual speakers and multilingual datasets with perceptual grounding.


Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images

arXiv.org Artificial Intelligence

The algorithm known as backpropagation of errors [65, 29] (or backprop) has served as a crucial element behind the tremendous progress made in recent machine learning research, progress that has been accelerated by advances in computational hardware as well as the increasing availability of vast quantities of data. Nevertheless, despite reaching or surpassing human-level performance on many different tasks, ranging from computer vision [18] to game-playing [60], the field still has a long way to go toward developing artificial general intelligence. In order to increase task-level performance, the size of deep networks has grown greatly over the years, up to hundreds of billions of synaptic parameters, as seen in modern-day transformer networks [12]. However, this trend has started to raise concerns related to energy consumption [49] and to whether such large systems can attain the flexible generalization ability of the human brain [5]. Furthermore, backprop itself imposes additional limitations beyond its long-argued biological implausibility [11, 15, 59], such as its dependence on a global error feedback pathway for determining each neuron's individual contribution to a deep network's overall performance [34], resulting in sequential, non-local backward updates that make parallelization difficult (in strong contrast to how learning occurs in the brain [24, 47, 46]).


Provably Stable Interpretable Encodings of Context Free Grammars in RNNs with a Differentiable Stack

arXiv.org Machine Learning

Given a collection of strings belonging to a context-free grammar (CFG) and another collection of strings not belonging to the CFG, how might one infer the grammar? This is the problem of grammatical inference. Since CFGs are exactly the languages recognized by pushdown automata (PDA), it suffices to determine the state transition rules and stack action rules of the corresponding PDA. One approach would be to train a recurrent neural network (RNN) to classify the sample data and attempt to extract these PDA rules. But neural networks are not a priori aware of the structure of a PDA and would likely require many samples to infer this structure. Furthermore, extracting the PDA rules from the RNN is nontrivial. We build an RNN specifically structured like a PDA, where weights correspond directly to the PDA rules. This requires a stack architecture that is differentiable (to enable gradient-based learning) and stable (an unstable stack will show deteriorating performance on longer strings). We propose a stack architecture that is differentiable and that provably exhibits orbital stability. Using this stack, we construct a neural network that provably approximates a PDA for strings of arbitrary length. Moreover, our model and method of proof can easily be generalized to other state machines, such as a Turing machine.
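By way of illustration, one common way to make a stack differentiable is to blend push, pop, and no-op actions continuously over a soft memory, as in the sketch below. This is a generic construction for intuition only; it is not the specific stack, read operation, or stability argument developed in the paper.

```python
# A generic soft-stack sketch: continuous push/pop/no-op weights mix shifted copies of memory,
# so the whole structure stays differentiable with respect to the action weights.
import numpy as np

class SoftStack:
    def __init__(self, depth, width):
        self.mem = np.zeros((depth, width))    # stack cells, row 0 = top
        self.strength = np.zeros(depth)        # how "present" each cell is

    def step(self, push, pop, value):
        """push + pop + no_op = 1; value is the vector to (softly) push."""
        no_op = 1.0 - push - pop
        popped = np.roll(self.mem, -1, axis=0);  popped[-1] = 0.0   # shift cells toward the top
        pushed = np.roll(self.mem,  1, axis=0);  pushed[0] = value  # shift cells down, insert value
        self.mem = pop * popped + push * pushed + no_op * self.mem
        s_pop = np.roll(self.strength, -1);  s_pop[-1] = 0.0
        s_push = np.roll(self.strength, 1);  s_push[0] = 1.0
        self.strength = pop * s_pop + push * s_push + no_op * self.strength
        return self.strength[0] * self.mem[0]   # soft read of the top element

stack = SoftStack(depth=8, width=4)
a, b = np.array([1., 0., 0., 0.]), np.array([0., 1., 0., 0.])
stack.step(push=1.0, pop=0.0, value=a)                      # push a
stack.step(push=1.0, pop=0.0, value=b)                      # push b
print(stack.step(push=0.0, pop=1.0, value=np.zeros(4)))     # pop -> top reads back ~a
```

With crisp (0/1) actions this behaves exactly like a discrete stack; the stability question the paper addresses concerns what happens when actions and contents are only approximately crisp over long strings.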


Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively

arXiv.org Machine Learning

In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we present a new connectionist model, the Sequential Neural Coding Network, and its learning procedure, grounded in the neurocognitive theory of predictive coding. The architecture experiences significantly less forgetting as compared to standard neural models and outperforms a variety of previously proposed remedies and methods when trained across multiple task datasets in a stream-like fashion. The promising performance demonstrated in our experiments offers motivation that directly incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, can create viable pathways for tackling the challenge of lifelong machine learning.


Biologically Motivated Algorithms for Propagating Local Target Representations

arXiv.org Machine Learning

Finding biologically plausible alternatives to back-propagation of errors is a fundamentally important challenge in artificial neural network research. In this paper, we propose a simple learning algorithm called error-driven Local Representation Alignment (LRA-E), which has strong connections to predictive coding, a theory that offers a mechanistic way of describing neurocomputational machinery. In addition, we propose an improved variant of Difference Target Propagation, another procedure from the same family of algorithms as Local Representation Alignment. We compare our learning procedures to several other biologically motivated algorithms, including two feedback alignment algorithms and Equilibrium Propagation. On two benchmark datasets, we find that both of our proposed learning algorithms yield stable performance and strong generalization in comparison to competing back-propagation alternatives when training deeper, highly nonlinear networks, with LRA-E performing the best overall.


Visually Grounded, Situated Learning in Neural Models

arXiv.org Artificial Intelligence

The theory of situated cognition postulates that language is inseparable from its physical context: words, phrases, and sentences must be learned in the context of the objects or concepts to which they refer. Yet statistical language models are trained on words alone. This makes it impossible for such models to connect to the real world, the world described in the sentences presented to them. In this paper, we examine the generalization ability of neural language models trained with a visual context. We propose a multimodal connectionist language architecture based on the Differential State Framework, which outperforms its equivalent trained on language alone, even when no visual context is available at test time. The superior performance of language models trained with a visual context is robust across different languages and models.


Conducting Credit Assignment by Aligning Local Representations

arXiv.org Machine Learning

The use of back-propagation and its variants to train deep networks is often problematic for new users: issues such as exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies can make networks difficult to train. In this paper, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start from a null initialization of the network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum, and others that are more biologically plausible. Experiments on MNIST and Fashion MNIST validate the performance of the algorithm and show that LRA can train networks robustly and effectively, succeeding even when back-propagation fails and outperforming alternative learning algorithms such as target propagation and feedback alignment.
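As a rough illustration of what layer-local, target-driven credit assignment can look like, the sketch below computes a local target for each hidden layer from the mismatch just above it and applies a delta-rule update using only locally available quantities. This is a generic sketch in the spirit of LRA, not the paper's exact update equations; the feedback matrices `E`, layer sizes, and step sizes are assumptions for illustration.

```python
# A generic sketch of layer-local, target-driven updates: each layer is nudged toward a
# local target derived from the error above it, and weights are adjusted with a local delta rule.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

dims = [10, 32, 32, 1]
W = [0.1 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]
E = [0.1 * rng.standard_normal((dims[i], dims[i + 1])) for i in range(len(dims) - 1)]  # feedback weights

def local_target_step(x, y, W, E, beta=0.1, lr=0.01):
    h = [x]
    for Wl in W:                                  # forward pass, recording each layer's activity
        h.append(relu(Wl @ h[-1]))
    e = h[-1] - y                                 # top-level mismatch
    for l in reversed(range(len(W))):
        target = h[l + 1] - beta * e              # local target: nudge toward reducing the error above
        local_err = h[l + 1] - target             # = beta * e, a purely local quantity
        W[l] -= lr * np.outer(local_err, h[l])    # local delta-rule weight update
        e = E[l] @ local_err                      # carry a feedback signal down to the next layer

x = rng.standard_normal(10)
y = np.array([1.0])
for _ in range(200):
    local_target_step(x, y, W, E)
```

Because each layer's update uses only its own input, activity, and a locally delivered error signal, such schemes avoid the global backward pass that makes back-propagation sensitive to depth and initialization.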