AITopics | Sejnowski, Terrence J.

Collaborating Authors

Sejnowski, Terrence J.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Transformers and Cortical Waves: Encoders for Pulling In Context Across Time

Muller, Lyle, Churchland, Patricia S., Sejnowski, Terrence J.

arXiv.org Artificial IntelligenceJan-25-2024

The capabilities of transformer networks such as ChatGPT and other Large Language Models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence into a long "encoding vector" - that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, "self-attention" applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity, traveling across single cortical regions or across multiple regions at the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable temporal context to be extracted from sequences of sensory inputs, the same computational principle used in transformers.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2401.14267

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (1.00)

Add feedback

Replay in Deep Learning: Current Approaches and Missing Biological Elements

Hayes, Tyler L., Krishnan, Giri P., Bazhenov, Maxim, Siegelmann, Hava T., Sejnowski, Terrence J., Kanan, Christopher

arXiv.org Artificial IntelligenceApr-1-2021

Replay is the reactivation of one or more neural patterns, which are similar to the activation patterns experienced during past waking experiences. Replay was first observed in biological neural networks during sleep, and it is now thought to play a critical role in memory formation, retrieval, and consolidation. Replay-like mechanisms have been incorporated into deep artificial neural networks that learn over time to avoid catastrophic forgetting of previous knowledge. Replay algorithms have been successfully used in a wide range of deep learning methods within supervised, unsupervised, and reinforcement learning paradigms. In this paper, we provide the first comprehensive comparison between replay in the mammalian brain and replay in artificial neural networks. We identify multiple aspects of biological replay that are missing in deep learning systems and hypothesize how they could be utilized to improve artificial neural networks.

deep learning, neural network, replay, (19 more...)

arXiv.org Artificial Intelligence

2104.04132

Country:

North America > United States > New York (0.28)
North America > United States > California > San Diego County (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Sleep (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing

Siddharth, Siddharth, Jung, Tzyy-Ping, Sejnowski, Terrence J.

arXiv.org Machine LearningMay-16-2019

In recent years, the use of bio-sensing signals such as electroencephalogram (EEG), electrocardiogram (ECG), etc. have garnered interest towards applications in affective computing. The parallel trend of deep-learning has led to a huge leap in performance towards solving various vision-based research problems such as object detection. Yet, these advances in deep-learning have not adequately translated into bio-sensing research. This work applies novel deep-learning-based methods to various bio-sensing and video data of four publicly available multi-modal emotion datasets. For each dataset, we first individually evaluate the emotion-classification performance obtained by each modality. We then evaluate the performance obtained by fusing the features from these modalities. We show that our algorithms outperform the results reported by other studies for emotion/valence/arousal/liking classification on DEAP and MAHNOB-HCI datasets and set up benchmarks for the newer AMIGOS and DREAMER datasets. We also evaluate the performance of our algorithms by combining the datasets and by using transfer learning to show that the proposed method overcomes the inconsistencies between the datasets. Hence, we do a thorough analysis of multi-modal affective data from more than 120 subjects and 2,800 trials. Finally, utilizing a convolution-deconvolution network, we propose a new technique towards identifying salient brain regions corresponding to various affective states.

dataset, deep learning, vascular disease, (25 more...)

arXiv.org Machine Learning

1905.07039

Country:

Asia (0.67)
North America > United States > California > San Diego County (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Gradient Descent for Spiking Neural Networks

Huh, Dongsung, Sejnowski, Terrence J.

Neural Information Processing SystemsDec-31-2018

Most large-scale network models use neurons with static nonlinearities that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of efficient supervised learning algorithm for spiking neural networks. Here, we present a gradient descent method for optimizing spiking network models by introducing a differentiable formulation of spiking dynamics and deriving the exact gradient calculation. For demonstration, we trained recurrent spiking networks on two dynamic tasks: one that requires optimizing fast (~ millisecond) spike-based interactions for efficient encoding of information, and a delayed-memory task over extended duration (~ second). The results show that the gradient descent approach indeed optimizes networks dynamics on the time scale of individual spikes as well as on behavioral time scales. In conclusion, our method yields a general purpose supervised learning algorithm for spiking neural networks, which can facilitate further investigations on spike-based computations.

artificial intelligence, machine learning, neuron, (14 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Gradient Descent for Spiking Neural Networks

Huh, Dongsung, Sejnowski, Terrence J.

Neural Information Processing SystemsDec-31-2018

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Gradient Descent for Spiking Neural Networks

Huh, Dongsung, Sejnowski, Terrence J.

arXiv.org Machine LearningJun-19-2017

Much of studies on neural computation are based on network models of static neurons that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of efficient supervised learning algorithm for spiking networks. Here, we present a gradient descent method for optimizing spiking network models by introducing a differentiable formulation of spiking networks and deriving the exact gradient calculation. For demonstration, we trained recurrent spiking networks on two dynamic tasks: one that requires optimizing fast (~millisecond) spike-based interactions for efficient encoding of information, and a delayed memory XOR task over extended duration (~second). The results show that our method indeed optimizes the spiking network dynamics on the time scale of individual spikes as well as behavioral time scales. In conclusion, our result offers a general purpose supervised learning algorithm for spiking neural networks, thus advancing further investigations on spike-based computation.

neural network, neurology, neuron, (18 more...)

arXiv.org Machine Learning

1706.04698

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

The Wilson Machine for Image Modeling

Saremi, Saeed, Sejnowski, Terrence J.

arXiv.org Machine LearningNov-11-2015

Learning the distribution of natural images is one of the hardest and most important problems in machine learning. The problem remains open, because the enormous complexity of the structures in natural images spans all length scales. We break down the complexity of the problem and show that the hierarchy of structures in natural images fuels a new class of learning algorithms based on the theory of critical phenomena and stochastic processes. We approach this problem from the perspective of the theory of critical phenomena, which was developed in condensed matter physics to address problems with infinite length-scale fluctuations, and build a framework to integrate the criticality of natural images into a learning algorithm. The problem is broken down by mapping images into a hierarchy of binary images, called bitplanes. In this representation, the top bitplane is critical, having fluctuations in structures over a vast range of scales. The bitplanes below go through a gradual stochastic heating process to disorder. We turn this representation into a directed probabilistic graphical model, transforming the learning problem into the unsupervised learning of the distribution of the critical bitplane and the supervised learning of the conditional distributions for the remaining bitplanes. We learnt the conditional distributions by logistic regression in a convolutional architecture. Conditioned on the critical binary image, this simple architecture can generate large, natural-looking images, with many shades of gray, without the use of hidden units, unprecedented in the studies of natural images. The framework presented here is a major step in bringing criticality and stochastic processes to machine learning and in studying natural image statistics.

deep learning, neural network, representation, (18 more...)

arXiv.org Machine Learning

1510.0774

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

A Bayesian Framework for Tilt Perception and Confidence

Schwartz, Odelia, Dayan, Peter, Sejnowski, Terrence J.

Neural Information Processing SystemsDec-31-2006

The misjudgement of tilt in images lies at the heart of entertaining visual illusionsand rigorous perceptual psychophysics.

bayesian inference, flanker, neurology, (21 more...)

Neural Information Processing Systems

Country: North America (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Assignment of Multiplicative Mixtures in Natural Images

Schwartz, Odelia, Sejnowski, Terrence J., Dayan, Peter

Neural Information Processing SystemsDec-31-2005

GSMs pose a critical assignment problem, working out which filter responses were generated by a common multiplicative factor.

artificial intelligence, bayesian inference, mixer variable, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.15)
Europe > Greece (0.14)
Asia > Japan (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)

Add feedback

Handling Missing Data with Variational Bayesian Learning of ICA

Chan, Kwokleung, Lee, Te-Won, Sejnowski, Terrence J.

Neural Information Processing SystemsDec-31-2003

Missing data is common in real-world datasets and is a problem for many estimation techniques. We have developed a variational Bayesian method to perform Independent Component Analysis (ICA) on high-dimensional data containing missing entries. Missing data are handled naturally in the Bayesian framework by integrating the generative density model. Modeling the distributions of the independent sources with mixture of Gaussians allows sources to be estimated with different kurtosis and skewness. The variational Bayesian method automatically determines the dimensionality of the data and yields an accurate density model for the observed data without overfitting problems. This allows direct probability estimation of missing values in the high dimensional space and avoids dimension reduction preprocessing which is not feasible with missing data.

bayesian inference, health & medicine, ica, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Diego County (0.14)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback