

Hypothesis Testing the Circuit Hypothesis in LLMs

Neural Information Processing Systems

Large language models (LLMs) demonstrate surprising capabilities, but we do not understand how they are implemented. One hypothesis suggests that these capabilities are primarily executed by small subnetworks within the LLM, known as circuits. But how can we evaluate this hypothesis? In this paper, we formalize a set of criteria that a circuit is hypothesized to meet and develop a suite of hypothesis tests to evaluate how well circuits satisfy them. The criteria focus on the extent to which the LLM's behavior is preserved, the degree of localization of this behavior, and whether the circuit is minimal. We apply these tests to six circuits described in the research literature. We find that synthetic circuits -- circuits that are hard-coded in the model -- align with the idealized properties. Circuits discovered in Transformer models satisfy the criteria to varying degrees. To facilitate future empirical studies of circuits, we created the circuitry package, a wrapper around the TransformerLens library, which abstracts away lower-level manipulations of hooks and activations.
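The faithfulness and minimality criteria described in the abstract can be illustrated with a self-contained toy sketch (this is not the actual circuitry or TransformerLens API; all component names and the tolerance are illustrative): a model is treated as a set of components, and a candidate circuit is evaluated by mean-ablating everything outside it and comparing the ablated output with the full model's output.

```python
# Toy illustration of circuit faithfulness testing: components outside
# the candidate circuit are replaced by their mean activation over a
# reference distribution ("mean ablation"), and the ablated model's
# output is compared with the full model's output.

def run(components, x, circuit=None, means=None):
    """Sum component outputs; components outside `circuit` are mean-ablated."""
    total = 0.0
    for name, fn in components.items():
        if circuit is None or name in circuit:
            total += fn(x)          # component left intact
        else:
            total += means[name]    # knocked out: replaced by its mean
    return total

# A tiny "model" of three components; the task output is their sum.
components = {
    "head_a": lambda x: 2.0 * x,
    "head_b": lambda x: -1.0 * x,
    "mlp":    lambda x: 0.1,        # near-constant: contributes little
}

# Mean activations over a reference distribution (here, x in {1, 2, 3}).
xs = [1.0, 2.0, 3.0]
means = {n: sum(f(x) for x in xs) / len(xs) for n, f in components.items()}

# Faithfulness: the circuit {head_a, head_b} should preserve behavior
# when the remaining component ("mlp") is mean-ablated.
circuit = {"head_a", "head_b"}
errors = [abs(run(components, x) - run(components, x, circuit, means))
          for x in xs]
faithful = max(errors) < 0.05       # small tolerance on the behavior gap
```

In this toy, the ablated "mlp" is nearly constant, so the two-component circuit reproduces the full model almost exactly and the faithfulness check passes; a minimality test would additionally verify that dropping either remaining component breaks the behavior.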


Neural Circuit Architectural Priors for Embodied Control

Neural Information Processing Systems

Artificial neural networks for motor control usually adopt generic architectures like fully connected MLPs. While general, these tabula rasa architectures rely on large amounts of experience to learn, are not easily transferable to new bodies, and have internal dynamics that are difficult to interpret. In nature, animals are born with highly structured connectivity in their nervous systems shaped by evolution; this innate circuitry acts synergistically with learning mechanisms to provide inductive biases that enable most animals to function well soon after birth and learn efficiently. Convolutional networks inspired by visual circuitry have encoded useful biases for vision. However, the extent to which ANN architectures inspired by neural circuitry can yield useful biases in other AI domains remains unknown. In this work, we ask what advantages biologically inspired ANN architectures can provide in the domain of motor control.


a simpler model and that the brain is complex, so it is not exactly clear why simpler models would be preferred

Neural Information Processing Systems

We would like to thank all reviewers for their comments and helpful feedback. Simplicity was meant to quantify this. We agree with both of those points. Figure 1: these steps were the most minimal configuration that produced the best model, as determined by our scores. Training for more recurrent steps is possible, but at least with our current set of scores, we see no improvement.



The futuristic new tech that could bridge broken nerves and mend minds

New Scientist

Ian Burkhart was on holiday with friends in 2010 when his life changed forever. He dived into shallow water and broke his neck, leaving him paralysed from the shoulders down at the age of 19. "At that point, I was getting assistance with everything," he says, "even being able to scratch an itch on my forehead." A few years later, Burkhart got an experimental brain implant that rerouted nerve impulses around his broken spinal cord to the muscles of his arm. It took time, but eventually he was able to use his hands and arms again – and even play the video game Guitar Hero.


Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing

Luquin, J., Mackin, C., Ambrogio, S., Chen, A., Baldi, F., Miralles, G., Rasch, M. J., Büchel, J., Lalwani, M., Ponghiran, W., Solomon, P., Tsai, H., Burr, G. W., Narayanan, P.

arXiv.org Artificial Intelligence

Analog In-Memory Compute (AIMC) can improve the energy efficiency of Deep Learning by orders of magnitude. Yet analog-domain device and circuit non-idealities -- within the analog "Tiles" performing Matrix-Vector Multiply (MVM) operations -- can degrade neural-network task accuracy. We quantify the impact of low-level distortions and noise, and develop a mathematical model for Multiply-ACcumulate (MAC) operations mapped to analog tiles. Instantaneous-current IR-drop (the most significant circuit non-ideality) and ADC quantization effects are fully captured by this model, which can predict MVM tile outputs both rapidly and accurately, as compared to much slower rigorous circuit simulations. A statistical model of PCM read noise at nanosecond timescales is derived from -- and matched against -- experimental measurements. We integrate these (statistical) device and (deterministic) circuit effects into a PyTorch-based framework to assess the accuracy impact on the BERT and ALBERT Transformer networks. We show that hardware-aware fine-tuning using simple Gaussian noise provides resilience against ADC quantization and PCM read noise effects, but is less effective against IR-drop. This is because IR-drop -- although deterministic -- is non-linear, changes significantly during the time-integration window, and is ultimately dependent on all the excitations being introduced in parallel into the analog tile. The apparent inability of simple Gaussian noise applied during training to properly prepare a DNN for IR-drop during inference implies that more complex training approaches -- incorporating advances such as the Tile-circuit model introduced here -- will be critical for resilient deployment of large neural networks onto AIMC hardware.
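Two of the non-idealities named in the abstract -- weight read noise and ADC quantization -- can be sketched with a minimal toy model (illustrative parameter values only; this is not the paper's Tile-circuit model, and IR-drop is omitted): an analog MVM is approximated as the ideal dot product computed with Gaussian-perturbed weights, followed by uniform ADC quantization of the accumulated result.

```python
import random

def adc_quantize(x, bits=8, full_scale=4.0):
    """Uniform ADC: clip to +/- full_scale, round to 2**bits levels."""
    step = 2 * full_scale / (2 ** bits)
    x = max(-full_scale, min(full_scale, x))
    return round(x / step) * step

def noisy_mvm(weights, inputs, read_noise_std=0.01, seed=0):
    """MAC with Gaussian read noise on each weight, then ADC quantization."""
    rng = random.Random(seed)  # fixed seed for a reproducible sketch
    acc = sum((w + rng.gauss(0.0, read_noise_std)) * x
              for w, x in zip(weights, inputs))
    return adc_quantize(acc)

weights = [0.5, -0.25, 0.75, 0.1]
inputs  = [1.0, 2.0, -1.0, 0.5]
ideal   = sum(w * x for w, x in zip(weights, inputs))  # exact dot product
analog  = noisy_mvm(weights, inputs)
error   = abs(analog - ideal)
```

Because the read noise here is independent Gaussian per weight, training against it is straightforward; the abstract's point is that IR-drop does not fit this mold, being deterministic, non-linear, and dependent on all excitations applied to the tile in parallel.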


Why every arm of an octopus moves with a mind of its own

Popular Science

There are many remarkable things about octopuses--they're famously intelligent, they have three hearts, their eyeballs work like prisms, they can change color at will, and they can "see" light with their skin. One of the most striking things about these creatures, however, is the fact that each of their eight arms almost seems to have a mind of its own, allowing an octopus to multitask in a manner that humans can only dream about. At the heart of each arm is a structure known as the axial nervous cord (ANC), and a new study published January 15 in Nature Communications examines how the structure of this cord is fundamental to allowing the arms to act as they do. Cassady Olson, first author on the paper, explains to Popular Science that understanding the ANC is crucial to understanding how an octopus's arms work: "You can think of the ANC as equivalent to a spinal cord running down the center of every single arm." Olson explains that "there are many gross similarities [between the ANC and vertebrates' spinal cords]--there is a cell body region, a neuropil region, and long tracts to connect the arms and brains in each."


TEXEL: A neuromorphic processor with on-chip learning for beyond-CMOS device integration

Greatorex, Hugh, Richter, Ole, Mastella, Michele, Cotteret, Madison, Klein, Philipp, Fabre, Maxime, Rubino, Arianna, Girão, Willian Soares, Chen, Junren, Ziegler, Martin, Bégon-Lours, Laura, Indiveri, Giacomo, Chicca, Elisabetta

arXiv.org Artificial Intelligence

Recent advances in memory technologies, devices and materials have shown great potential for integration into neuromorphic electronic systems. However, a significant gap remains between the development of these materials and the realization of large-scale, fully functional systems. One key challenge is determining which devices and materials are best suited for specific functions and how they can be paired with CMOS circuitry. To address this, we introduce TEXEL, a mixed-signal neuromorphic architecture designed to explore the integration of on-chip learning circuits and novel two- and three-terminal devices. TEXEL serves as an accessible platform to bridge the gap between CMOS-based neuromorphic computation and the latest advancements in emerging devices. In this paper, we demonstrate the readiness of TEXEL for device integration through comprehensive chip measurements and simulations. TEXEL provides a practical system for testing bio-inspired learning algorithms alongside emerging devices, establishing a tangible link between brain-inspired computation and cutting-edge device research.