
 Eliasmith, Chris


Improving Rule-based Reasoning in LLMs via Neurosymbolic Representations

arXiv.org Artificial Intelligence

Large language models (LLMs) continue to face challenges in reliably solving reasoning tasks, particularly tasks that involve precise rule following, as often found in mathematical reasoning tasks. This paper introduces a novel neurosymbolic method that improves LLM reasoning by encoding hidden states into neurosymbolic vectors, allowing for problem-solving within a neurosymbolic vector space. The results are decoded and combined with the original hidden state, boosting the model's performance on numerical reasoning tasks. By offloading computation through neurosymbolic representations, this method improves efficiency, reliability, and interpretability. Our experimental results demonstrate an average of $82.86\%$ lower cross entropy loss and $24.50$ times more problems correctly solved on a suite of mathematical reasoning problems compared to chain-of-thought prompting and supervised fine-tuning (LoRA), while at the same time not hindering the performance of the LLM on other tasks.


Parallelizing Legendre Memory Unit Training

arXiv.org Artificial Intelligence

Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory component of the LMU to construct a simplified variant that can be parallelized during training (and yet executed as an RNN during inference), thus overcoming a well-known limitation of training RNNs on GPUs. First, we show that this reformulation, which can be applied generally to any deep network whose recurrent components are linear, makes training up to 200 times faster. Second, to validate its utility, we compare its performance against the original LMU and a variety of published LSTM and transformer networks on seven benchmarks, ranging from psMNIST to sentiment analysis to machine translation. We demonstrate that our models exhibit superior performance on all datasets, often using fewer parameters. For instance, our LMU sets a new state-of-the-art result on psMNIST, and uses half the parameters while outperforming DistilBERT and LSTM models on IMDB sentiment analysis.
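The equivalence this abstract relies on (a linear recurrence can be unrolled into a convolution with its impulse response) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `run_recurrent` and `run_parallel`, the discrete-time form m_t = A m_{t-1} + B u_t, and the random test matrices are all assumptions made for the example.

```python
import numpy as np

def run_recurrent(A, B, u):
    """Sequential evaluation of m_t = A m_{t-1} + B u_t (RNN-style, one step at a time)."""
    m = np.zeros(A.shape[0])
    states = []
    for u_t in u:
        m = A @ m + B * u_t
        states.append(m)
    return np.stack(states)

def run_parallel(A, B, u):
    """Same states computed from the impulse response H_k = A^k B.

    The impulse response depends only on (A, B), so it can be precomputed;
    the input is then processed with a single matrix product.
    """
    n = len(u)
    H = np.empty((n, A.shape[0]))
    h = B.copy()
    for k in range(n):  # loop over filter taps, independent of the input values
        H[k] = h
        h = A @ h
    # U[t, k] = u[t - k] for t >= k, else 0 (lower-triangular Toeplitz matrix)
    idx = np.arange(n)[:, None] - np.arange(n)[None, :]
    U = np.where(idx >= 0, u[np.clip(idx, 0, n - 1)], 0.0)
    return U @ H  # m_t = sum_k u_{t-k} A^k B

# Illustrative data: a random linear system rescaled to spectral radius 0.9
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
B = rng.normal(size=4)
u = rng.normal(size=50)
print(np.allclose(run_recurrent(A, B, u), run_parallel(A, B, u)))
```

The dense Toeplitz product is used here only for clarity; it costs O(n^2) memory, and a practical implementation would more likely use an FFT-based convolution.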


Hardware Aware Training for Efficient Keyword Spotting on General Purpose and Specialized Hardware

arXiv.org Machine Learning

Keyword spotting (KWS) provides a critical user interface for many mobile and edge applications, including phones, wearables, and cars. As KWS systems are typically 'always on', maximizing both accuracy and power efficiency is central to their utility. In this work we use hardware aware training (HAT) to build new KWS neural networks based on the Legendre Memory Unit (LMU) that achieve state-of-the-art (SotA) accuracy and low parameter counts. This allows the neural network to run efficiently on standard hardware (212$\mu$W). We also characterize the power requirements of custom-designed accelerator hardware that achieves SotA power efficiency of 8.79$\mu$W, beating general purpose low power hardware (a microcontroller) by 24x and special purpose ASICs by 16x.


Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks

Neural Information Processing Systems

We propose a novel memory cell for recurrent neural networks that dynamically maintains information across long windows of time using relatively few resources. The Legendre Memory Unit (LMU) is mathematically derived to orthogonalize its continuous-time history -- doing so by solving $d$ coupled ordinary differential equations (ODEs), whose phase space linearly maps onto sliding windows of time via the Legendre polynomials up to degree $d - 1$. Backpropagation across LMUs outperforms equivalently-sized LSTMs on a chaotic time-series prediction task, improves memory capacity by two orders of magnitude, and significantly reduces training and inference times. LMUs can efficiently handle temporal dependencies spanning $100\text{,}000$ time-steps, converge rapidly, and use few internal state-variables to learn complex functions spanning long windows of time -- exceeding state-of-the-art performance among RNNs on permuted sequential MNIST. These results are due to the network's disposition to learn scale-invariant features independently of step size.
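The $d$ coupled ODEs mentioned in this abstract take the linear form $\theta\,\dot{m}(t) = A\,m(t) + B\,u(t)$, where $\theta$ is the window length. A minimal sketch of the state-space matrices follows. The closed-form entries are taken from the LMU paper's published definition rather than from the abstract itself, and the function name `lmu_matrices` is hypothetical, so treat this as an illustration to check against the paper.

```python
import numpy as np

def lmu_matrices(d, theta):
    """Continuous-time LMU memory matrices for order d and window length theta.

    Assumed form (from the LMU paper, not stated in the abstract):
        a_ij = (2i + 1) * (-1            if i < j
                           (-1)^(i-j+1)  otherwise)
        b_i  = (2i + 1) * (-1)^i
    Both are divided by theta so that dm/dt = A m + B u.
    """
    A = np.zeros((d, d))
    B = np.zeros(d)
    for i in range(d):
        B[i] = (2 * i + 1) * (-1.0) ** i
        for j in range(d):
            sign = -1.0 if i < j else (-1.0) ** (i - j + 1)
            A[i, j] = (2 * i + 1) * sign
    return A / theta, B / theta

A, B = lmu_matrices(d=2, theta=1.0)
print(A)  # 2x2 system matrix
print(B)  # input vector
```

For $d = 1$ the system reduces to $\theta\,\dot{m} = -m + u$, a first-order low-pass filter, which is a quick sanity check on the sign conventions.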


Passive nonlinear dendritic interactions as a general computational resource in functional spiking neural networks

arXiv.org Artificial Intelligence

Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks tend to assume linear, current-based superposition of post-synaptic currents. We extend the theory underlying the Neural Engineering Framework to systematically exploit nonlinear interactions between the local membrane potential and conductance-based synaptic channels as a computational resource. In particular, we demonstrate that even a single passive distal dendritic compartment with AMPA and GABA-A synapses connected to a leaky integrate-and-fire neuron supports the computation of a wide variety of multivariate, bandlimited functions, including the Euclidean norm, controlled shunting, and non-negative multiplication. Our results demonstrate that, for certain operations, the accuracy of dendritic computation is on a par with or even surpasses the accuracy of an additional layer of neurons in the network. These findings allow modelers to construct large-scale models of neurobiological systems that more closely approximate network topologies and computational resources available in biology. Our results may inform neuromorphic hardware design and could lead to a better utilization of resources on existing neuromorphic hardware platforms.


Benchmarking Keyword Spotting Efficiency on Neuromorphic Hardware

arXiv.org Machine Learning

Using Intel's Loihi neuromorphic research chip and ABR's Nengo Deep Learning toolkit, we analyze the inference speed, dynamic power consumption, and energy cost per inference of a two-layer neural network keyword spotter trained to recognize a single phrase. We perform comparative analyses of this keyword spotter running on more conventional hardware devices including a CPU, a GPU, Nvidia's Jetson TX1, and the Movidius Neural Compute Stick. Our results indicate that for this inference application, Loihi outperforms all of these alternatives on an energy cost per inference basis while maintaining near-equivalent inference accuracy. Furthermore, an analysis of tradeoffs between network size, inference speed, and energy cost indicates that Loihi's comparative advantage over other low-power computing devices improves for larger networks.


Continuous and Parallel: Challenges for a Standard Model of the Mind

AAAI Conferences

We believe that a Standard Model of the Mind should take into account continuous state representations, continuous timing, continuous actions, continuous learning, and parallel control loops. For each of these, we describe initial models that we have made exploring these directions. While we have demonstrated that it is possible to construct high-level cognitive models with these features (which are uncommon in most cognitive modeling approaches), there are many theoretical challenges still to be faced to allow these features to interact in useful ways and to characterize what may be gained by including these features.


A Brain-Machine Interface Operating with a Real-Time Spiking Neural Network Control Algorithm

Neural Information Processing Systems

Motor prostheses aim to restore function to disabled patients. Despite compelling proof of concept systems, barriers to clinical translation remain. One challenge is to develop a low-power, fully-implantable system that dissipates only minimal power so as not to damage tissue. To this end, we implemented a Kalman-filter-based decoder via a spiking neural network (SNN) and tested it in brain-machine interface (BMI) experiments with a rhesus monkey. The Kalman filter was trained to predict the arm's velocity and mapped onto the SNN using the Neural Engineering Framework (NEF). A 2,000-neuron embedded Matlab SNN implementation runs in real-time and its closed-loop performance is comparable to that of the standard Kalman filter. The success of this closed-loop decoder holds promise for hardware SNN implementations of statistical signal processing algorithms on neuromorphic chips, which may offer power savings necessary to overcome a major obstacle to the successful clinical translation of neural motor prostheses.


Reports on the 2004 AAAI Fall Symposia

AI Magazine

The Association for the Advancement of Artificial Intelligence presented its 2004 Fall Symposium Series Friday through Sunday, October 22-24 at the Hyatt Regency Crystal City in Arlington, Virginia, adjacent to Washington, DC. The symposium series was preceded by a one-day AI funding seminar. The topics of the eight symposia in the 2004 Fall Symposia Series were: (1) Achieving Human-Level Intelligence through Integrated Systems and Research; (2) Artificial Multiagent Learning; (3) Compositional Connectionism in Cognitive Science; (4) Dialogue Systems for Health Communications; (5) The Intersection of Cognitive Science and Robotics: From Interfaces to Intelligence; (6) Making Pen-Based Interaction Intelligent and Natural; (7) Real-Life Reinforcement Learning; and (8) Style and Meaning in Language, Art, Music, and Design.


Reports on the 2004 AAAI Fall Symposia

AI Magazine

[...] Learning) are also available as AAAI Technical Reports. [...] through Sunday, October 22-24 at the Hyatt Regency Crystal City in Arlington, Virginia, adjacent to Washington, DC. The symposium series was preceded on Thursday, October 21 by a one-day AI funding seminar, which was open to all registered attendees. [...] an opportunity for new and junior researchers--as well as students and postdoctoral fellows--to get an inside look at what funding agencies expect in proposals from prospective grantees. Representatives and program [...] be integrated and (2) architectures [...] There was consensus among participants that metrics in machine learning, planning, and natural language processing have driven advances in those subfields, but that those metrics have also distracted attention from how to [...] The topic is of increasing interest with the advent of peer-to-peer network services and with ad-hoc wireless [...] Domains for motivating, testing, and funding this research were proposed (some during our joint session [...] large numbers of agents, more complex agent behaviors, partially observable environments, and mutual adaptation.