

The Promise & Peril of Brain Machine Interfaces, with Ricardo Chavarriaga


ANJA KASPERSEN: Today's podcast will focus on artificial intelligence (AI), neuroscience, and neurotechnologies. My guest today is Ricardo Chavarriaga. Ricardo is an electrical engineer and a doctor of computational neuroscience. He is currently the head of the Swiss office of the Confederation of Laboratories for AI Research in Europe (CLAIRE) and a senior researcher at Zurich University of Applied Sciences. Ricardo, it is an honor and a delight to share the virtual stage with you today.

RICARDO CHAVARRIAGA: I am really happy and looking forward to a nice discussion today.

ANJA KASPERSEN: Neuroscience is a vast and fast-developing field. Maybe you could start by providing our listeners with some background.

RICARDO CHAVARRIAGA: When we think about the brain, this is something that has fascinated humanity for a long time. The question of how this organ that we have inside our heads can rule our behavior and can store and develop knowledge has been indeed one of the questions for science for many, many years. Neurotechnologies, computational neuroscience, and brain-machine interfaces are tools that we have developed to approach the understanding of this fabulous organ.

When we talk about computational neuroscience it is the use of computational tools to create models of the brain. It can be mathematical models, it can be algorithms that try to reproduce our observations about the brain. It can be experiments on humans and on animals: these experiments can be behavioral, they can involve measurements of brain activity, and by looking at how the brains of organisms react and how the activity changes we will then try to apply our knowledge to create models for that.

These models can have different flavors. We can for instance have very detailed models of electrochemical processes inside a neuron, and then we are looking at just a small part of the brain. We can have large-scale models with fewer details of how different brain structures interact among themselves, or even less-detailed models that try to reproduce behavior that we observe in animals and in humans as a result of certain mental disorders. We can even test these models using probes to tap into how our brain can construct representations of the world based on images and on tactile and auditory information.

A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence - Nature Neuroscience


Extensive sampling of neural activity during rich cognitive phenomena is critical for robust understanding of brain function. Here we present the Natural Scenes Dataset (NSD), in which high-resolution functional magnetic resonance imaging responses to tens of thousands of richly annotated natural scenes were measured while participants performed a continuous recognition task. To optimize data quality, we developed and applied novel estimation and denoising techniques. Simple visual inspections of the NSD data reveal clear representational transformations along the ventral visual pathway. Further exemplifying the inferential power of the dataset, we used NSD to build and train deep neural network models that predict brain activity more accurately than state-of-the-art models from computer vision. NSD also includes substantial resting-state and diffusion data, enabling network neuroscience perspectives to constrain and enhance models of perception and memory. Given its unprecedented scale, quality and breadth, NSD opens new avenues of inquiry in cognitive neuroscience and artificial intelligence. The authors measured high-resolution fMRI activity from eight individuals who saw and memorized thousands of annotated natural images over 1 year. This massive dataset enables new paths of inquiry in cognitive neuroscience and artificial intelligence.

A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part II: Applications, Cognitive Models, and Challenges

This is Part II of the two-part comprehensive survey devoted to a computing framework most commonly known under the names Hyperdimensional Computing and Vector Symbolic Architectures (HDC/VSA). Both names refer to a family of computational models that use high-dimensional distributed representations and rely on the algebraic properties of their key operations to incorporate the advantages of structured symbolic representations and vector distributed representations. Holographic Reduced Representations is an influential HDC/VSA model that is well-known in the machine learning domain and often used to refer to the whole family. However, for the sake of consistency, we use HDC/VSA to refer to the area. Part I of this survey covered foundational aspects of the area, such as historical context leading to the development of HDC/VSA, key elements of any HDC/VSA model, known HDC/VSA models, and transforming input data of various types into high-dimensional vectors suitable for HDC/VSA. This second part surveys existing applications, the role of HDC/VSA in cognitive computing and architectures, as well as directions for future work. Most of the applications lie within the machine learning/artificial intelligence domain; however, we also cover other applications to provide a thorough picture. The survey is written to be useful for both newcomers and practitioners.
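To make the "key operations" mentioned in the abstract concrete, here is a minimal sketch of binding and bundling over bipolar hypervectors. It is an illustration only, not the survey's own code: the choice of bipolar vectors, the dimensionality, the tie-breaking rule, and all names (`random_hv`, `bind`, `bundle`, `similarity`) are assumptions made for this example.

```python
import random

DIM = 10_000  # hypervector dimensionality (illustrative choice)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise multiplication (self-inverse for bipolar vectors)."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Bundling: elementwise sign of the sum; ties are mapped to +1."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    """Normalized dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
color, shape = random_hv(rng), random_hv(rng)  # role vectors (hypothetical)
red, circle = random_hv(rng), random_hv(rng)   # filler vectors (hypothetical)

# Encode the record {color: red, shape: circle} as one hypervector.
record = bundle(bind(color, red), bind(shape, circle))

# Unbinding with a role vector yields a noisy copy of its filler.
recovered = bind(record, color)
sim_red = similarity(recovered, red)        # clearly above chance
sim_circle = similarity(recovered, circle)  # close to zero
```

Because random high-dimensional vectors are nearly orthogonal, the recovered vector stays measurably similar to `red` and essentially unrelated to `circle`, which is what lets a single vector hold a structured record.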

Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning

In meta-learning, networks are trained with external algorithms to learn tasks that require acquiring, storing and exploiting unpredictable information for each new instance of the task. However, animals are able to pick up such cognitive tasks automatically, as a result of their evolved neural architecture and synaptic plasticity mechanisms. Here we evolve neural networks, endowed with plastic connections, over a sizeable set of simple meta-learning tasks based on a framework from computational neuroscience. The resulting evolved network can automatically acquire ...

In one method, the "inner loop" stores information in the time-varying activities of a recurrent network, which is slowly optimized in the "outer loop" over many episodes [Hochreiter et al., 2001, Wang et al., 2016, Duan et al., 2016]. A biological interpretation of this method is that the inner loop represents the within-episode self-sustaining activity of cerebral cortex, while the outer loop represents lifetime sculpting of neural connections by value-based neural plasticity (this interpretation is explored in detail by Wang et al. [2018]).

Towards Understanding Human Functional Brain Development with Explainable Artificial Intelligence: Challenges and Perspectives

The last decades have seen significant advancements in non-invasive neuroimaging technologies that have been increasingly adopted to examine human brain development. However, these improvements have not necessarily been followed by more sophisticated data analysis measures that are able to explain the mechanisms underlying functional brain development. For example, the shift from univariate (single area in the brain) to multivariate (multiple areas in the brain) analysis paradigms is of significance as it allows investigations into the interactions between different brain regions. However, despite the potential of multivariate analysis to shed light on the interactions between developing brain regions, the artificial intelligence (AI) techniques applied render the analysis non-explainable. The purpose of this paper is to understand the extent to which current state-of-the-art AI techniques can inform functional brain development. In addition, we review which AI techniques are more likely to explain their learning based on the processes of brain development as defined by developmental cognitive neuroscience (DCN) frameworks. This work also proposes that eXplainable AI (XAI) may provide viable methods to investigate functional brain development as hypothesised by DCN frameworks.

Confidence-Aware Subject-to-Subject Transfer Learning for Brain-Computer Interface

The inter/intra-subject variability of electroencephalography (EEG) makes the practical use of the brain-computer interface (BCI) difficult. In general, the BCI system requires a calibration procedure to tune the model every time the system is used. This problem is recognized as a major obstacle to BCI, and to overcome it, approaches based on transfer learning (TL) have recently emerged. However, many BCI paradigms are limited in that they show labels first and then measure "imagery", and the negative effects of source subjects whose data do not contain control signals have been ignored in many cases of the subject-to-subject TL process. The main purpose of this paper is to propose a method of excluding subjects that are expected to have a negative impact on subject-to-subject TL training, which generally uses data from as many subjects as possible. In this paper, we propose a BCI framework using only high-confidence subjects for TL training. In our framework, a deep neural network selects useful subjects for the TL process and excludes noisy subjects, using a co-teaching algorithm based on the small-loss trick. We experimented with leave-one-subject-out validation on two public datasets (2020 international BCI competition track 4 and OpenBMI dataset). Our experimental results showed that confidence-aware TL, which selects subjects with small loss instances, improves the generalization performance of BCI.
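The selection idea behind the small-loss trick can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' framework: it ranks source subjects by mean training loss and keeps the lowest-loss fraction, whereas the paper's co-teaching setup is more involved. The function name, the `keep_ratio` parameter, and the subject ids and loss values are all hypothetical.

```python
def select_confident_subjects(losses_by_subject, keep_ratio):
    """Keep the fraction of source subjects with the smallest mean loss.

    losses_by_subject: dict mapping a subject id to a list of per-trial losses.
    keep_ratio: fraction of subjects to keep, in (0, 1].
    Returns the kept subject ids, ordered from lowest to highest mean loss.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    mean_loss = {
        subject: sum(losses) / len(losses)
        for subject, losses in losses_by_subject.items()
    }
    n_keep = max(1, round(keep_ratio * len(mean_loss)))
    ranked = sorted(mean_loss, key=mean_loss.get)
    return ranked[:n_keep]


# Made-up per-trial losses for four hypothetical source subjects.
example_losses = {
    "s1": [0.20, 0.30, 0.25],
    "s2": [1.10, 0.90, 1.30],  # high loss: likely lacks usable control signals
    "s3": [0.40, 0.35, 0.45],
    "s4": [0.95, 1.05, 1.00],
}
selected = select_confident_subjects(example_losses, keep_ratio=0.5)
```

The underlying assumption is that a network fits clean data before noisy data, so subjects whose trials still show high loss early in training are the ones most likely to hurt transfer.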

Long-range and hierarchical language predictions in brains and algorithms

Deep learning has recently made remarkable progress in natural language processing. Yet, the resulting algorithms remain far from competing with the language abilities of the human brain. Predictive coding theory offers a potential explanation for this discrepancy: while deep language algorithms are optimized to predict adjacent words, the human brain would be tuned to make long-range and hierarchical predictions. To test this hypothesis, we analyze the fMRI brain signals of 304 subjects each listening to 70 min of short stories. After confirming that the activations of deep language algorithms linearly map onto those of the brain, we show that enhancing these models with long-range forecast representations improves their brain-mapping. The results further reveal a hierarchy of predictions in the brain, whereby the fronto-parietal cortices forecast more abstract and more distant representations than the temporal cortices. Overall, this study strengthens predictive coding theory and suggests a critical role of long-range and hierarchical predictions in natural language processing.

From internal models toward metacognitive AI

In several papers published in Biological Cybernetics in the 1980s and 1990s, Kawato and colleagues proposed computational models explaining how internal models are acquired in the cerebellum. These models were later supported by neurophysiological experiments using monkeys and neuroimaging experiments involving humans. These early studies influenced neuroscience from basic, sensory-motor control to higher cognitive functions. One of the most perplexing enigmas related to internal models is to understand the neural mechanisms that enable animals to learn large-dimensional problems with so few trials. Consciousness and metacognition, the ability to monitor one's own thoughts, may be part of the solution to this enigma. Based on literature reviews of the past 20 years, here we propose a computational neuroscience model of metacognition. The model comprises a modular hierarchical reinforcement-learning architecture of parallel and layered, generative-inverse model pairs. In the prefrontal cortex, a distributed executive network called the "cognitive reality monitoring network" (CRMN) orchestrates conscious involvement of generative-inverse model pairs in perception and action. Based on mismatches between computations by generative and inverse models, as well as reward prediction errors, CRMN computes a "responsibility signal" that gates selection and learning of pairs in perception, action, and reinforcement learning. A high responsibility signal is given to the pairs that best capture the external world, that are competent in movements (small mismatch), and that are capable of reinforcement learning (small reward prediction error). CRMN selects pairs with higher responsibility signals as objects of metacognition, and consciousness is determined by the entropy of responsibility signals across all pairs.
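One possible reading of the responsibility signal and its entropy is sketched below. The abstract does not give the functional form, so the softmax over negative errors, the additive combination of mismatch and reward prediction error, the `temperature` parameter, and the example values are all assumptions made purely for illustration.

```python
import math


def responsibility(mismatches, reward_pes, temperature=1.0):
    """Softmax over negative combined error, one value per model pair.

    Pairs with a small mismatch and a small reward prediction error
    receive a high responsibility (assumed form, not from the paper).
    """
    scores = [
        -(m + abs(r)) / temperature
        for m, r in zip(mismatches, reward_pes)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probabilities):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Three hypothetical generative-inverse model pairs.
resp = responsibility(mismatches=[0.1, 1.0, 2.0], reward_pes=[0.0, 0.5, -1.0])
resp_entropy = entropy(resp)
```

Under this reading, low entropy means that one pair clearly dominates, while entropy near `math.log(n)` means that no pair stands out.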

Predictive Coding: a Theoretical and Experimental Review

Predictive coding offers a potentially unifying account of cortical function, postulating that the core function of the brain is to minimize prediction errors with respect to a generative model of the world. The theory is closely related to the Bayesian brain framework and, over the last two decades, has gained substantial influence in the fields of theoretical and cognitive neuroscience. A large body of research has arisen, both in empirically testing improved and extended theoretical and mathematical models of predictive coding and in evaluating their potential biological plausibility for implementation in the brain and the concrete neurophysiological and psychological predictions made by the theory. Despite this enduring popularity, however, no comprehensive review of predictive coding theory, and especially of recent developments in this field, exists. Here, we provide a comprehensive review of the core mathematical structure and logic of predictive coding, thus complementing recent tutorials in the literature. We also review a wide range of classic and recent work within the framework, ranging from the neurobiologically realistic microcircuits that could implement predictive coding, to the close relationship between predictive coding and the widely-used backpropagation of error algorithm, as well as surveying the close relationships between predictive coding and modern machine learning techniques.
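The idea of minimizing prediction errors with respect to a generative model can be illustrated with a one-variable toy problem. This is a sketch under strong assumptions, not a model from the review: a single latent cause, a linear generative mapping, Gaussian noise, and plain gradient descent. The function name and all parameters are hypothetical.

```python
def infer_latent(obs, weight, prior_mean, var_obs=1.0, var_prior=1.0,
                 lr=0.1, steps=100):
    """Infer a scalar latent cause by descending on squared prediction errors.

    Assumed generative model: obs is about weight * mu, mu is about prior_mean.
    Energy: (obs - weight*mu)^2 / (2*var_obs) + (mu - prior_mean)^2 / (2*var_prior)
    """
    mu = prior_mean
    for _ in range(steps):
        eps_obs = (obs - weight * mu) / var_obs    # sensory prediction error
        eps_prior = (mu - prior_mean) / var_prior  # prior prediction error
        mu += lr * (weight * eps_obs - eps_prior)  # negative energy gradient
    return mu


mu_hat = infer_latent(obs=4.0, weight=2.0, prior_mean=0.0)
```

For this linear Gaussian case the fixed point is the exact posterior mean, (weight*obs/var_obs + prior_mean/var_prior) / (weight**2/var_obs + 1/var_prior), which equals 1.6 for the values above. The estimate settles where the sensory and prior errors balance.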

Applications of the Free Energy Principle to Machine Learning and Neuroscience

In this thesis, we explore and apply methods inspired by the free energy principle to two important areas in machine learning and neuroscience. The free energy principle is a general mathematical theory of the necessary information-theoretic behaviours of systems which maintain a separation from their environment. A core postulate of the theory is that complex systems can be seen as performing variational Bayesian inference and minimizing an information-theoretic quantity called the variational free energy. The free energy principle originated in, and has been extremely influential in, theoretical neuroscience, having spawned a number of neurophysiologically realistic process theories, and maintaining close links with Bayesian Brain viewpoints. The thesis is split into three main parts where we apply methods and insights from the free energy principle to understand questions first in perception, then action, and finally learning. Specifically, in the first section, we focus on the theory of predictive coding, a neurobiologically plausible process theory derived from the free energy principle under certain assumptions, which argues that the primary function of the brain is to minimize prediction errors. We focus on scaling up predictive coding architectures and simulate large-scale predictive coding networks for perception on machine learning benchmarks; we investigate predictive coding's relationship to other classical filtering algorithms; and we demonstrate that many biologically implausible aspects of current models of predictive coding can be relaxed without unduly harming the performance of predictive coding models, which allows for a potentially more literal translation of predictive coding theory into cortical microcircuits. In the second part of the thesis, we focus on the application of methods deriving from the free energy principle to action. We study the extension of methods of 'active inference', a neurobiologically grounded account of action through variational message passing, to utilize deep artificial neural networks, allowing these methods to 'scale up' to be competitive with state-of-the-art deep reinforcement learning methods.
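For reference, the variational free energy named in the abstract is usually written as follows. The notation is an assumption chosen for this note, not taken from the thesis: o denotes observations, z latent states, q(z) the approximate posterior, and p(o, z) the generative model.

```latex
\mathcal{F}[q]
  = \mathbb{E}_{q(z)}\big[\ln q(z) - \ln p(o, z)\big]
  = D_{\mathrm{KL}}\big[\,q(z)\,\|\,p(z \mid o)\,\big] - \ln p(o)
  \;\ge\; -\ln p(o)
```

Since the KL divergence is non-negative, minimizing the free energy over q both tightens an upper bound on the surprise, -ln p(o), and pulls q(z) toward the true posterior p(z | o). This is the sense in which such a system can be seen as performing variational Bayesian inference.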