simple neuron
Uncovering a Universal Abstract Algorithm for Modular Addition in Neural Networks
McCracken, Gavin, Moisescu-Pareja, Gabriela, Letourneau, Vincent, Precup, Doina, Love, Jonathan
We propose a testable universality hypothesis, asserting that seemingly disparate neural network solutions observed in the simple task of modular addition are unified under a common abstract algorithm. While prior work interpreted variations in neuron-level representations as evidence for distinct algorithms, we demonstrate - through multi-level analyses spanning neurons, neuron clusters, and entire networks - that multilayer perceptrons and transformers universally implement the abstract algorithm we call the approximate Chinese Remainder Theorem. Crucially, we introduce approximate cosets and show that neurons activate exclusively on them. Furthermore, our theory works for deep neural networks (DNNs). It predicts that universally learned solutions in DNNs with trainable embeddings or more than one hidden layer require only O(log n) features, a result we empirically confirm. This work thus provides the first theory-backed interpretation of multilayer networks solving modular addition. It advances generalizable interpretability and opens a testable universality hypothesis for group multiplication beyond modular addition.
Investigating Recurrent Neural Network Memory Structures using Neuro-Evolution
Ororbia, Alexander, Elsaid, Ahmed Ahmed, Desell, Travis
This paper presents a new algorithm, Evolutionary eXploration of Augmenting Memory Models (EXAMM), which is capable of evolving recurrent neural networks (RNNs) using a wide variety of memory structures, such as Delta-RNN, GRU, LSTM, MGU and UGRNN cells. EXAMM evolved RNNs to perform prediction of large-scale, real world time series data from the aviation and power industries. These data sets consist of very long time series (thousands of readings), each with a large number of potentially correlated and dependent parameters. Four different parameters were selected for prediction and EXAMM runs were performed using each memory cell type alone, each cell type with feed forward nodes, and with all possible memory cell types. Evolved RNN performance was measured using repeated k-fold cross validation, resulting in 1210 EXAMM runs which evolved 2,420,000 RNNs in 12,100 CPU hours on a high performance computing cluster. Generalization of the evolved RNNs was examined statistically, providing interesting findings that can help refine the RNN memory cell design as well as inform future neuro-evolution algorithms development.
Learning Features and their Transformations by Spatial and Temporal Spherical Clustering
Dutta, Jayanta K., Banerjee, Bonny
Learning features invariant to arbitrary transformations in the data is a requirement for any recognition system, biological or artificial. It is now widely accepted that simple cells in the primary visual cortex respond to features while the complex cells respond to features invariant to different transformations. We present a novel two-layered feedforward neural model that learns features in the first layer by spatial spherical clustering and invariance to transformations in the second layer by temporal spherical clustering. Learning occurs in an online and unsupervised manner following the Hebbian rule. When exposed to natural videos acquired by a camera mounted on a cat's head, the first and second layer neurons in our model develop simple and complex cell-like receptive field properties. The model can predict by learning lateral connections among the first layer neurons. A topographic map to their spatial features emerges by exponentially decaying the flow of activation with distance from one neuron to another in the first layer that fire in close temporal proximity, thereby minimizing the pooling length in an online manner simultaneously with feature learning.