Transformers and Cortical Waves: Encoders for Pulling In Context Across Time
Muller, Lyle, Churchland, Patricia S., Sejnowski, Terrence J.
The capabilities of transformer networks such as ChatGPT and other Large Language Models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence (for example, all the words in a sentence) into a long "encoding vector" that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, "self-attention" applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity, traveling across single cortical regions or across multiple regions at the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable temporal context to be extracted from sequences of sensory inputs, the same computational principle used in transformers.
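To make the self-attention step concrete, here is a minimal NumPy sketch of scaled dot-product self-attention over a toy sequence; the dimensions and random projection matrices are illustrative assumptions, not taken from any particular transformer.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence.

    X  : (T, d) matrix, one embedding per sequence position.
    Wq, Wk, Wv : (d, d_k) projection matrices (illustrative random weights).
    Returns a (T, d_k) matrix in which each position is a weighted sum of
    all positions, with weights given by pairwise query-key similarity.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # (T, T) pairwise associations
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over positions
    return weights @ V                                # context-mixed representation

rng = np.random.default_rng(0)
T, d, d_k = 6, 8, 4                                   # toy sequence length and dimensions
X = rng.normal(size=(T, d))                           # stand-in word embeddings
out = self_attention(X, *(rng.normal(size=(d, d_k)) for _ in range(3)))
print(out.shape)                                      # (6, 4)
```

Each row of the output mixes information from every position in the sequence, which is the sense in which the encoding pulls in context across the whole input.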
What we learned from the deep learning revolution - TechTalks
Today, deep learning is the talk of the town. There is no shortage of media coverage, papers, books, and events on deep learning. Yet deep learning is not new. Its roots go back almost to the early days of artificial intelligence and computing. While the field received the cold shoulder for decades, there were a few scientists and researchers who plodded forward, keeping faith that the idea of artificial neural networks would one day bear fruit. And we are seeing the fruits of deep learning in everyday applications, such as search, chat, email, social media, and online shopping.
Storing Covariance by the Associative Long-Term Potentiation and Depression of Synaptic Strengths in the Hippocampus
In modeling studies of memory based on neural networks, both the selective enhancement and depression of synaptic strengths are required for efficient storage of information (Sejnowski, 1977a,b; Kohonen, 1984; Bienenstock et al., 1982; Sejnowski and Tesauro, 1989). We have tested this assumption in the hippocampus, a cortical structure of the brain that is involved in long-term memory. A brief, high-frequency activation of excitatory synapses in the hippocampus produces an increase in synaptic strength known as long-term potentiation, or LTP (Bliss and Lomo, 1973), that can last for many days. LTP is known to be Hebbian since it requires the simultaneous release of neurotransmitter from presynaptic terminals coupled with postsynaptic depolarization (Kelso et al., 1986; Malinow and Miller, 1986; Gustafsson et al., 1987). However, a mechanism for the persistent reduction of synaptic strength that could balance LTP has not yet been demonstrated.
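As a rough illustration of the covariance-style rule the abstract alludes to, the sketch below changes weights in proportion to the covariance of pre- and postsynaptic rates, so positively correlated activity potentiates (LTP-like) and anticorrelated activity depresses (LTD-like); the rates and learning rate are toy values, not the experimental data.

```python
import numpy as np

def covariance_update(w, pre, post, lr=0.01):
    """Covariance learning rule: the weight change tracks the covariance
    between pre- and postsynaptic activity, so correlated activity
    strengthens a synapse and anticorrelated activity weakens it.

    pre  : (T, n) presynaptic firing rates over T time steps.
    post : (T,)   postsynaptic firing rate over the same steps.
    """
    dpre = pre - pre.mean(axis=0)            # deviations from mean presynaptic rates
    dpost = post - post.mean()               # deviation from mean postsynaptic rate
    return w + lr * (dpre * dpost[:, None]).mean(axis=0)

rng = np.random.default_rng(1)
pre = rng.poisson(5.0, size=(200, 10)).astype(float)   # toy presynaptic rates
post = pre[:, 0] + rng.normal(0, 1, 200)                # postsynaptic rate correlated with input 0
w = covariance_update(np.zeros(10), pre, post)
print(w.round(3))                                       # synapse 0 potentiates, others stay near zero
```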
Reinforcement Learning Predicts the Site of Plasticity for Auditory Remapping in the Barn Owl
The auditory system of the barn owl contains several spatial maps. In young barn owls raised with optical prisms over their eyes, these auditory maps are shifted to stay in register with the visual map, suggesting that the visual input imposes a frame of reference on the auditory maps. However, the optic tectum, the first site of convergence of visual with auditory information, is not the site of plasticity for the shift of the auditory maps; the plasticity occurs instead in the inferior colliculus, which contains an auditory map and projects into the optic tectum. We explored a model of the owl remapping driven by a global reinforcement signal whose delivery is controlled by visual foveation. A Hebbian learning rule gated by reinforcement learned to appropriately adjust the auditory maps. In addition, reinforcement learning preferentially adjusted the weights in the inferior colliculus, as in the owl brain, even though the weights were allowed to change throughout the auditory system.
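A reinforcement-gated Hebbian update of the kind described can be sketched as follows; the map sizes, the outer-product form, and the foveation-based reward are illustrative assumptions rather than the paper's actual model.

```python
import numpy as np

def reward_gated_hebb(W, pre, post, reward, lr=0.05):
    """Hebbian weight change gated by a global reinforcement signal:
    the pre/post correlation term is applied only in proportion to a
    scalar reward broadcast to every synapse in the map."""
    return W + lr * reward * np.outer(post, pre)

rng = np.random.default_rng(2)
n_in, n_out = 20, 20
W = rng.normal(0, 0.1, size=(n_out, n_in))        # auditory-map weights (illustrative)
pre = rng.random(n_in)                            # auditory input activity
post = W @ pre                                    # map response
reward = 1.0 if np.argmax(post) == 10 else -0.1   # stand-in for visually guided foveation reward
W = reward_gated_hebb(W, pre, post, reward)
```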
A Non-linear Information Maximisation Algorithm that Performs Blind Separation
A new learning algorithm is derived which performs online stochastic gradient ascent in the mutual information between outputs and inputs of a network. In the absence of a priori knowledge about the 'signal' and 'noise' components of the input, propagation of information depends on calibrating network non-linearities to the detailed higher-order moments of the input density functions. As an example application, we have achieved near-perfect separation of ten digitally mixed speech signals. Our simulations lead us to believe that our network performs better at blind separation than the Herault-Jutten network, reflecting the fact that it is derived rigorously from the mutual information objective.
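For orientation, here is a minimal sketch of an infomax-style update with logistic units on a toy two-source mixture, using the gradient form dW proportional to W^{-T} + (1 - 2y) x^T; the sources, mixing matrix, and learning rate are illustrative, and convergence on this toy problem is not guaranteed.

```python
import numpy as np

def infomax_step(W, x, lr=0.002):
    """One stochastic infomax update with logistic nonlinearities:
    gradient ascent on output entropy, dW ~ W^-T + (1 - 2y) x^T."""
    y = 1.0 / (1.0 + np.exp(-W @ x))                  # sigmoidal network outputs
    return W + lr * (np.linalg.inv(W).T + np.outer(1.0 - 2.0 * y, x))

rng = np.random.default_rng(3)
sources = rng.laplace(size=(2, 5000))                 # two toy super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                # unknown mixing matrix
X = A @ sources                                       # observed mixtures
W = np.eye(2)
for x in X.T:                                         # online stochastic gradient ascent
    W = infomax_step(W, x)
print(W @ A)                                          # ideally close to a scaled permutation when separation succeeds
```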
Spatial Representations in the Parietal Cortex May Use Basis Functions
The parietal cortex is thought to represent the egocentric positions of objects in particular coordinate systems. We propose an alternative approach to spatial perception of objects in the parietal cortex from the perspective of sensorimotor transformations. The responses of single parietal neurons can be modeled as a Gaussian function of retinal position multiplied by a sigmoid function of eye position, which form a set of basis functions. We show here how these basis functions can be used to generate receptive fields in either retinotopic or head-centered coordinates by simple linear transformations. This raises the possibility that the parietal cortex does not attempt to compute the positions of objects in a particular frame of reference but instead computes a general purpose representation of the retinal location and eye position from which any transformation can be synthesized by direct projection.
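The basis-function idea can be sketched directly: units respond as a Gaussian of retinal position times a sigmoid of eye position, and a head-centered receptive field is then obtained by a single linear readout. The center and slope distributions and the least-squares fit below are illustrative choices, not the paper's parameters.

```python
import numpy as np

def basis_response(retinal_pos, eye_pos, centers, slopes, sigma=5.0):
    """Parietal-style basis functions: a Gaussian of retinal position
    multiplied by a sigmoid of eye position, one unit per (center, slope) pair."""
    gauss = np.exp(-0.5 * ((retinal_pos - centers[:, 0]) / sigma) ** 2)
    sigm = 1.0 / (1.0 + np.exp(-slopes * (eye_pos - centers[:, 1])))
    return gauss * sigm

rng = np.random.default_rng(4)
n_units = 200
centers = rng.uniform(-40, 40, size=(n_units, 2))     # preferred retinal and eye positions
slopes = rng.choice([-0.2, 0.2], size=n_units)        # eye-position gain-field slopes

# A head-centered receptive field (a function of retinal + eye position) is
# read out by one linear combination of the basis units, fit by least squares.
R = rng.uniform(-30, 30, 1000)                        # retinal positions
E = rng.uniform(-30, 30, 1000)                        # eye positions
B = np.array([basis_response(r, e, centers, slopes) for r, e in zip(R, E)])
target = np.exp(-0.5 * ((R + E - 10.0) / 5.0) ** 2)   # head-centered field centered at +10 deg
w, *_ = np.linalg.lstsq(B, target, rcond=None)
print(np.corrcoef(B @ w, target)[0, 1])               # quality of the linear readout
```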
Learning Nonlinear Overcomplete Representations for Efficient Coding
We derive a learning algorithm for inferring an overcomplete basis by viewing it as a probabilistic model of the observed data. Overcomplete bases allow for better approximation of the underlying statistical density. Using a Laplacian prior on the basis coefficients removes redundancy and leads to representations that are sparse and are a nonlinear function of the data. This can be viewed as a generalization of the technique of independent component analysis and provides a method for blind source separation of fewer mixtures than sources. We demonstrate the utility of overcomplete representations on natural speech and show that, compared to the traditional Fourier basis, the inferred representations potentially have much greater coding efficiency.
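One way to see the nonlinear, sparse inference implied by a Laplacian prior is MAP estimation of the coefficients, i.e. minimizing ||x - As||^2 / 2 + lam * ||s||_1 for a fixed overcomplete basis A. The sketch below uses iterative soft thresholding as a stand-in optimizer; the basis, penalty, and step size are illustrative, not the paper's procedure.

```python
import numpy as np

def infer_sparse_coefficients(x, A, lam=0.1, lr=0.05, n_steps=400):
    """MAP inference of basis coefficients under a Laplacian prior:
    minimize ||x - A s||^2 / 2 + lam * ||s||_1 by iterative soft thresholding."""
    s = np.zeros(A.shape[1])
    for _ in range(n_steps):
        grad = A.T @ (A @ s - x)                      # gradient of the reconstruction term
        s = s - lr * grad
        s = np.sign(s) * np.maximum(np.abs(s) - lr * lam, 0.0)   # soft threshold (L1 prox)
    return s

rng = np.random.default_rng(5)
n, m = 16, 48                                         # overcomplete: 3x more basis vectors than dims
A = rng.normal(size=(n, m))
A /= np.linalg.norm(A, axis=0)                        # unit-norm basis vectors
s_true = np.zeros(m)
s_true[[3, 20, 41]] = [1.0, -0.7, 0.5]                # sparse ground-truth code
x = A @ s_true + 0.01 * rng.normal(size=n)
s_hat = infer_sparse_coefficients(x, A)
print(np.flatnonzero(np.abs(s_hat) > 0.1))            # recovered support, ideally {3, 20, 41}
```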
Large Language Models and the Reverse Turing Test
Large Language Models (LLMs) have been transformative. They are pre-trained foundational models that are self-supervised and can be adapted with fine-tuning to a wide range of natural language tasks, each of which previously would have required a separate network model. This is one step closer to the extraordinary versatility of human language. GPT-3 and more recently LaMDA can carry on dialogs with humans on many topics after minimal priming with a few examples. However, there has been a wide range of reactions and debate on whether these LLMs understand what they are saying or exhibit signs of intelligence. This high variance is exhibited in three interviews with LLMs reaching wildly different conclusions. A new possibility was uncovered that could explain this divergence. What appears to be intelligence in LLMs may in fact be a mirror that reflects the intelligence of the interviewer, a remarkable twist that could be considered a Reverse Turing Test. If so, then by studying interviews we may be learning more about the intelligence and beliefs of the interviewer than the intelligence of the LLMs. As LLMs become more capable, they may transform the way we interact with machines and how they interact with each other. Increasingly, LLMs are being coupled with sensorimotor devices. LLMs can talk the talk, but can they walk the walk? A road map for achieving artificial general autonomy is outlined with seven major improvements inspired by brain systems. LLMs could be used to uncover new insights into brain function by downloading brain data during natural behaviors.
Solving Royal Game of Ur Using Reinforcement Learning
Malhotra, Sidharth, Malik, Girik
Reinforcement Learning has recently surfaced as a very powerful tool to solve complex problems in the domain of board games, wherein an agent is generally required to learn complex strategies and moves based on its own experiences and the rewards it receives. While RL has outperformed existing state-of-the-art methods for playing simple video games and popular board games, it has yet to demonstrate its capability on ancient games. Here, we solve one such problem, training our agents with different methods, namely Monte Carlo, Q-learning and Expected Sarsa, to learn an optimal policy for playing the strategic Royal Game of Ur. The state space for our game is complex and large, but our agents show promising results at playing the game and learning important strategic moves. Although it is hard to conclude which algorithm performs best overall when trained with limited resources, Expected Sarsa shows promising results in terms of learning speed.
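As a reference point for the Expected Sarsa variant mentioned, here is a single tabular backup; the state and action space sizes are placeholders, not the paper's encoding of the Royal Game of Ur.

```python
import numpy as np

def expected_sarsa_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, eps=0.1):
    """One Expected Sarsa backup: bootstrap on the expectation of Q(s', .)
    under the epsilon-greedy behavior policy rather than on a sampled action."""
    n_actions = Q.shape[1]
    probs = np.full(n_actions, eps / n_actions)
    probs[np.argmax(Q[s_next])] += 1.0 - eps          # epsilon-greedy action probabilities
    target = r + gamma * probs @ Q[s_next]            # expected value of the next state
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Toy usage on a tabular value function (placeholder sizes).
Q = np.zeros((500, 4))
Q = expected_sarsa_update(Q, s=42, a=1, r=1.0, s_next=43)
```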
What is AI? Stephen Hanson in conversation with Terry Sejnowski
Hanson: Terry, thanks so much for joining this videocast or podvideo, I don't really know what to call it. When I started trying to conceptualize what I was getting at, I wanted to talk to people who had a clear and obvious perspective on what they thought AI is. And you're particularly unique, and special in this context, because you have been consistent since… Well, there's a great book that you have a chapter in, which I think Jim Anderson edited in 1981, called "Parallel Models of Associative Memory".
Sejnowski: It's interesting you brought that up, because I met Geoff Hinton in San Diego in 1979 at a workshop he and Jim organized that resulted in that book. It was my first neural network workshop. We were all interested in the same things. There was no neural network organization or community at that time; we were a bunch of isolated researchers working on our own.
Hanson: And probably not well appreciated, by talking about neural networks, or neural modelling.
Sejnowski: We were the outliers. But we had a great time talking with each other.
Hanson: Going back to the book, you had a chapter called "Skeleton filters in the brain", I think that was the name of it. Perhaps not the best title in the world, but still… "Skeleton filters" is a little scary, I gotta say. But it was a really incredibly easy read; I just read it the other day again. And in it, you're really going in a subtle way from biophysics, modelling a neuron and referencing everybody, you know, Cowan, and everybody who'd developed a differential equation, or anything, up to semantics and cognition. Biophysical modelling is the kind of category where neurons and circuits matter, and that's what we're modelling; that's the purpose of it. For example, I think you mentioned Hartline and Ratliff, and the Limulus crab retina. And this provided an enormous amount of data well into the 60s, where people were actually modelling, and there were predictions, and it was very tightly tied to the crab.
Sejnowski: By the way, although it's called a Horseshoe Crab, and looks like one, Limulus has eight legs, so it's an arachnid.