neural unit
SPINT: Spatial Permutation-Invariant Neural Transformer for Consistent Intracortical Motor Decoding
Intracortical Brain-Computer Interfaces (iBCI) decode behavior from neural population activity to restore motor functions and communication abilities in individuals with motor impairments. A central challenge for long-term iBCI deployment is the nonstationarity of neural recordings, where the composition and tuning profiles of the recorded populations are unstable across recording sessions. Existing approaches attempt to address this issue by explicit alignment techniques; however, they rely on fixed neural identities and require test-time labels or parameter updates, limiting their generalization across sessions and imposing additional computational burden during deployment. In this work, we address the problem of cross-session nonstationarity in long-term iBCI systems and introduce SPINT a Spatial Permutation-Invariant Neural Transformer framework for behavioral decoding that operates directly on unordered sets of neural units. Central to our approach is a novel context-dependent positional embedding scheme that dynamically infers unit-specific identities, enabling flexible generalization across recording sessions. SPINT supports inference on variable-size populations and allows fewshot, gradient-free adaptation using a small amount of unlabeled data from the test session. We evaluate SPINT on three multi-session datasets from the FALCON Benchmark, covering continuous motor decoding tasks in human and non-human primates. SPINT demonstrates robust cross-session generalization, outperforming existing zero-shot and few-shot unsupervised baselines while eliminating the need for test-time alignment and fine-tuning. Our work contributes an initial step toward a robust and scalable neural decoding framework for long-term iBCI applications.
SPINT: Spatial Permutation-Invariant Neural Transformer for Consistent Intracortical Motor Decoding
Le, Trung, Fang, Hao, Li, Jingyuan, Nguyen, Tung, Mi, Lu, Orsborn, Amy, Sรผmbรผl, Uygar, Shlizerman, Eli
Intracortical Brain-Computer Interfaces (iBCI) aim to decode behavior from neural population activity, enabling individuals with motor impairments to regain motor functions and communication abilities. A key challenge in long-term iBCI is the nonstationarity of neural recordings, where the composition and tuning profiles of the recorded populations are unstable across recording sessions. Existing methods attempt to address this issue by explicit alignment techniques; however, they rely on fixed neural identities and require test-time labels or parameter updates, limiting their generalization across sessions and imposing additional computational burden during deployment. In this work, we introduce SPINT - a Spatial Permutation-Invariant Neural Transformer framework for behavioral decoding that operates directly on unordered sets of neural units. Central to our approach is a novel context-dependent positional embedding scheme that dynamically infers unit-specific identities, enabling flexible generalization across recording sessions. SPINT supports inference on variable-size populations and allows few-shot, gradient-free adaptation using a small amount of unlabeled data from the test session. To further promote model robustness to population variability, we introduce dynamic channel dropout, a regularization method for iBCI that simulates shifts in population composition during training. We evaluate SPINT on three multi-session datasets from the FALCON Benchmark, covering continuous motor decoding tasks in human and non-human primates. SPINT demonstrates robust cross-session generalization, outperforming existing zero-shot and few-shot unsupervised baselines while eliminating the need for test-time alignment and fine-tuning. Our work contributes an initial step toward a robust and scalable neural decoding framework for long-term iBCI applications.
Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units
Dixit, Shrey, Fakhar, Kayson, Hadaeghi, Fatemeh, Mineault, Patrick, Kording, Konrad P., Hilgetag, Claus C.
Neural networks now generate text, images, and speech with billions of parameters, producing a need to know how each neural unit contributes to these high-dimensional outputs. Existing explainable-AI methods, such as SHAP, attribute importance to inputs, but cannot quantify the contributions of neural units across thousands of output pixels, tokens, or logits. Here we close that gap with Multiperturbation Shapley-value Analysis (MSA), a model-agnostic game-theoretic framework. By systematically lesioning combinations of units, MSA yields Shapley Modes, unit-wise contribution maps that share the exact dimensionality of the model's output. We apply MSA across scales, from multi-layer perceptrons to the 56-billion-parameter Mixtral-8x7B and Generative Adversarial Networks (GAN). The approach demonstrates how regularisation concentrates computation in a few hubs, exposes language-specific experts inside the LLM, and reveals an inverted pixel-generation hierarchy in GANs. Together, these results showcase MSA as a powerful approach for interpreting, editing, and compressing deep neural networks.
Can multivariate Granger causality detect directed connectivity of a multistable and dynamic biological decision network model?
Asadpour, Abdoreza, Wong-Lin, KongFatt
Extracting causal connections can advance interpretable AI and machine learning. Granger causality (GC) is a robust statistical method for estimating directed influences (DC) between signals. While GC has been widely applied to analysing neuronal signals in biological neural networks and other domains, its application to complex, nonlinear, and multistable neural networks is less explored. In this study, we applied time-domain multi-variate Granger causality (MVGC) to the time series neural activity of all nodes in a trained multistable biologically based decision neural network model with real-time decision uncertainty monitoring. Our analysis demonstrated that challenging two-choice decisions, where input signals could be closely matched, and the appropriate application of fine-grained sliding time windows, could readily reveal the original model's DC. Furthermore, the identified DC varied based on whether the network had correct or error decisions. Integrating the identified DC from different decision outcomes recovered most of the original model's architecture, despite some spurious and missing connectivity. This approach could be used as an initial exploration to enhance the interpretability and transparency of dynamic multistable and nonlinear biological or AI systems by revealing causal connections throughout different phases of neural network dynamics and outcomes.
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Li, Depeng, Wang, Tianqi, Chen, Junwei, Dai, Wei, Zeng, Zhigang
Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones. In this paper, we propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL. In each training session, it introduces a supervisory mechanism to guide network expansion whose growth size is compactly commensurate with the intrinsic complexity of a newly arriving task. This constructs a near-minimal network while allowing the model to expand its capacity when cannot sufficiently hold new classes. At inference time, it automatically reactivates the required neural units to retrieve knowledge and leaves the remaining inactivated to prevent interference. We name our model AutoActivator, which is effective and scalable. To gain insights into the neural unit dynamics, we theoretically analyze the model's convergence property via a universal approximation theorem on learning sequential mappings, which is under-explored in the CIL community. Experiments show that our method achieves strong CIL performance in rehearsal-free and minimal-expansion settings with different backbones.
A multi-agent control framework for co-adaptation in brain-computer interfaces Roy Fox
In a closed-loop brain-computer interface (BCI), adaptive decoders are used to learn parameters suited to decoding the user's neural response. Feedback to the user provides information which permits the neural tuning to also adapt. We present an approach to model this process of co-adaptation between the encoding model of the neural signal and the decoding algorithm as a multi-agent formulation of the linear quadratic Gaussian (LQG) control problem. In simulation we characterize how decoding performance improves as the neural encoding and adaptive decoder optimize, qualitatively resembling experimentally demonstrated closed-loop improvement. We then propose a novel, modified decoder update rule which is aware of the fact that the encoder is also changing and show it can improve simulated co-adaptation dynamics. Our modeling approach offers promise for gaining insights into co-adaptation as well as improving user learning of BCI control in practical settings.
Artificial Neural Networks generated by Low Discrepancy Sequences
Keller, Alexander, Van keirsbilck, Matthijs
Artificial neural networks can be represented by paths. Generated as random walks on a dense network graph, we find that the resulting sparse networks allow for deterministic initialization and even weights with fixed sign. Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards. Although sparse, weights are accessed as contiguous blocks of memory. In addition, enumerating the paths using deterministic low discrepancy sequences, for example the Sobol' sequence, amounts to connecting the layers of neural units by progressive permutations, which naturally avoids bank conflicts in parallel computer hardware. We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
An effective theory of collective deep learning
Arola-Fernรกndez, Lluรญs, Lacasa, Lucas
Unraveling the emergence of collective learning in systems of coupled artificial neural networks points to broader implications for machine learning, neuroscience, and society. Here we introduce a minimal model that condenses several recent decentralized algorithms by considering a competition between two terms: the local learning dynamics in the parameters of each neural network unit, and a diffusive coupling among units that tends to homogenize the parameters of the ensemble. We derive an effective theory for linear networks to show that the coarse-grained behavior of our system is equivalent to a deformed Ginzburg-Landau model with quenched disorder. This framework predicts depth-dependent disorder-order-disorder phase transitions in the parameters' solutions that reveal a depth-delayed onset of a collective learning phase and a low-rank microscopic learning path. We validate the theory in coupled ensembles of realistic neural networks trained on the MNIST dataset under privacy constraints. Interestingly, experiments confirm that individual networks -- trained on private data -- can fully generalize to unseen data classes when the collective learning phase emerges. Our work establishes the physics of collective learning and contributes to the mechanistic interpretability of deep learning in decentralized settings.
Learning to Act through Evolution of Neural Diversity in Random Neural Networks
Pedersen, Joachim Winther, Risi, Sebastian
Biological nervous systems consist of networks of diverse, sophisticated information processors in the form of neurons of different classes. In most artificial neural networks (ANNs), neural computation is abstracted to an activation function that is usually shared between all neurons within a layer or even the whole network; training of ANNs focuses on synaptic optimization. In this paper, we propose the optimization of neuro-centric parameters to attain a set of diverse neurons that can perform complex computations. Demonstrating the promise of the approach, we show that evolving neural parameters alone allows agents to solve various reinforcement learning tasks without optimizing any synaptic weights. While not aiming to be an accurate biological model, parameterizing neurons to a larger degree than the current common practice, allows us to ask questions about the computational abilities afforded by neural diversity in random neural networks. The presented results open up interesting future research directions, such as combining evolved neural diversity with activity-dependent plasticity.
Adaptive SpikeDeep-Classifier: Self-organizing and self-supervised machine learning algorithm for online spike sorting
Saif-ur-Rehman, Muhammad, Ali, Omair, Klaes, Christian, Iossifidis, Ioannis
Objective. Research on brain-computer interfaces (BCIs) is advancing towards rehabilitating severely disabled patients in the real world. Two key factors for successful decoding of user intentions are the size of implanted microelectrode arrays and a good online spike sorting algorithm. A small but dense microelectrode array with 3072 channels was recently developed for decoding user intentions. The process of spike sorting determines the spike activity (SA) of different sources (neurons) from recorded neural data. Unfortunately, current spike sorting algorithms are unable to handle the massively increasing amount of data from dense microelectrode arrays, making spike sorting a fragile component of the online BCI decoding framework. Approach. We proposed an adaptive and self-organized algorithm for online spike sorting, named Adaptive SpikeDeep-Classifier (Ada-SpikeDeepClassifier), which uses SpikeDeeptector for channel selection, an adaptive background activity rejector (Ada-BAR) for discarding background events, and an adaptive spike classifier (Ada-Spike classifier) for classifying the SA of different neural units. Results. Our algorithm outperformed our previously published SpikeDeep-Classifier and eight other spike sorting algorithms, as evaluated on a human dataset and a publicly available simulated dataset. Significance. The proposed algorithm is the first spike sorting algorithm that automatically learns the abrupt changes in the distribution of noise and SA. It is an artificial neural network-based algorithm that is well-suited for hardware implementation on neuromorphic chips that can be used for wearable invasive BCIs.