DeepSITH: Efficient Learning via Decomposition of What and When Across Time Scales
[Recovered caption fragments: Figure 1 (bottom panel) illustrates that events presented close together in time gradually blend as time elapses; Table 1: parameter values used for LSTM networks; Table 2: parameter values used for LMU networks; cites "Coupled Oscillatory Recurrent Neural Network (coRNN): An Accurate and (Gradient) Stable Architecture for Learning Long Time Dependencies."]
- North America > United States > Virginia (0.06)
- North America > United States > Indiana (0.05)
Improved Cleanup and Decoding of Fractional Power Encodings
High-dimensional vectors have been proposed as a neural method for representing information in the brain using Vector Symbolic Algebras (VSAs). Cleanup methods are essential for robust computation within a VSA, and while previous work has explored decoding and cleaning up these vectors under the noise that arises during computation, existing methods for continuous-valued encodings remain limited. In this paper, we present an iterative optimization method to decode and clean up Fourier Holographic Reduced Representation (FHRR) vectors that encode continuous values. We combine composite likelihood estimation (CLE) and maximum likelihood estimation (MLE) to ensure convergence to the global optimum. We demonstrate that this method effectively decodes FHRR vectors under a range of noise conditions, and show that it outperforms existing methods.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
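The abstract does not spell out the algorithm, so the following is only a minimal sketch of the underlying decoding problem it targets: maximum-likelihood-style recovery of a scalar from a noisy FHRR fractional power encoding, using a coarse grid search (the likelihood surface is multimodal) followed by Newton refinement. The dimensionality, noise level, and search settings below are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024                               # vector dimensionality (illustrative)
theta = rng.uniform(-np.pi, np.pi, d)  # phases of the FHRR base vector

def encode(x):
    """Fractional power encoding of a scalar: v(x) = exp(i * theta * x)."""
    return np.exp(1j * theta * x)

def similarity(x, y):
    """Normalized real inner product between v(x) and a (noisy) vector y."""
    return np.real(np.conj(encode(x)) @ y) / d

def decode(y, lo=-10.0, hi=10.0, n_grid=2000, n_newton=20):
    """Coarse grid search to find the right basin, then Newton refinement."""
    grid = np.linspace(lo, hi, n_grid)
    x = grid[np.argmax([similarity(g, y) for g in grid])]
    phi, mag = np.angle(y), np.abs(y)
    for _ in range(n_newton):
        r = theta * x - phi                         # per-component phase residual
        grad = -np.sum(mag * np.sin(r) * theta)     # d/dx of the inner product
        hess = -np.sum(mag * np.cos(r) * theta**2)
        if hess >= 0:                               # not near a local maximum
            break
        x -= grad / hess                            # Newton step toward the peak
    return x

# usage: encode 3.7, corrupt it with complex Gaussian noise, decode it back
y = encode(3.7) + 0.5 * (rng.standard_normal(d) + 1j * rng.standard_normal(d))
print(decode(y))  # approximately 3.7
```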
Analogical Reasoning Within a Conceptual Hyperspace
Goldowsky, Howard, Sarathy, Vasanth
We propose an approach to analogical inference that marries the neuro-symbolic computational power of complex-sampled hyperdimensional computing (HDC) with Conceptual Spaces Theory (CST), a promising theory of semantic meaning. CST sketches, at an abstract level, approaches to analogical inference that go beyond the standard predicate-based structure-mapping theories, but it does not describe how such approaches can be operationalized. We propose a concrete HDC-based architecture that computes several types of analogy classified by CST. We present preliminary proof-of-concept experimental results within a toy domain and describe how the architecture can perform category-based and property-based analogical reasoning.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
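The architecture itself is not described in the abstract; as background, here is a hedged sketch of the basic HDC binding/unbinding mechanism that such analogical inference builds on, in an FHRR-style complex algebra, phrased as Kanerva's classic "what is the dollar of Mexico?" kind of query. Every vector and name below is illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4096

def rand_vec():
    """Random unit-magnitude complex vector: an FHRR-style atomic symbol."""
    return np.exp(1j * rng.uniform(-np.pi, np.pi, d))

def bind(a, b):    # binding: elementwise complex product
    return a * b

def unbind(a, b):  # approximate inverse of binding
    return a * np.conj(b)

def sim(a, b):     # normalized similarity
    return np.real(np.conj(a) @ b) / d

capital, currency = rand_vec(), rand_vec()                    # roles
paris, euro, mex_city, peso = (rand_vec() for _ in range(4))  # fillers

# role-filler records for two concepts (bundling = superposition)
france = bind(capital, paris) + bind(currency, euro)
mexico = bind(capital, mex_city) + bind(currency, peso)

# analogy "what is the euro of Mexico?": recover euro's role from france,
# then read that role's filler out of mexico
answer = unbind(mexico, unbind(france, euro))
for name, v in [("paris", paris), ("euro", euro),
                ("mex_city", mex_city), ("peso", peso)]:
    print(name, round(sim(answer, v), 2))  # peso scores highest
```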
VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images
Penzkofer, Anna, Shi, Lei, Bulling, Andreas
While Vector Symbolic Architectures (VSAs) are promising for modelling spatial cognition, their application is currently limited to artificially generated images and simple spatial queries. We propose VSA4VQA, a novel 4D implementation of VSAs that implements a mental representation of natural images for the challenging task of Visual Question Answering (VQA). VSA4VQA is the first model to scale a VSA to complex spatial queries. Our method is based on the Semantic Pointer Architecture (SPA) to encode objects in a hyperdimensional vector space. To encode natural images, we extend the SPA to include dimensions for an object's width and height in addition to its spatial location. To perform spatial queries, we further introduce learned spatial query masks and integrate a pre-trained vision-language model for answering attribute-related questions. We evaluate our method on the GQA benchmark dataset and show that it can effectively encode natural images, achieving performance competitive with state-of-the-art deep learning methods for zero-shot VQA.
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (2 more...)
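The abstract says the SPA encoding is extended with width and height dimensions alongside spatial location. Below is a minimal sketch of what a 4D fractional-power encoding of a bounding box could look like in an FHRR-style setting; the axis vectors, length scale, and coordinates are our assumptions, not the published model.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1024
ell = 0.05  # length scale: smaller -> sharper similarity kernel (our choice)
axes = {a: rng.uniform(-np.pi, np.pi, d) for a in "xywh"}  # one base vector per axis

def encode_box(x, y, w, h):
    """4D semantic pointer for one object: binding fractional powers of the
    four axis vectors amounts to adding their phases in FHRR."""
    phase = (axes["x"] * x + axes["y"] * y + axes["w"] * w + axes["h"] * h) / ell
    return np.exp(1j * phase)

def sim(a, b):
    return np.real(np.conj(a) @ b) / d

cat = encode_box(0.20, 0.40, 0.10, 0.15)  # normalized image coordinates
dog = encode_box(0.70, 0.35, 0.25, 0.30)
scene = cat + dog                         # superpose all objects into one vector

# query: is there an object near (0.2, 0.4) with roughly a 0.10 x 0.15 box?
print(sim(scene, encode_box(0.20, 0.40, 0.10, 0.15)))  # high (~1)
print(sim(scene, encode_box(0.50, 0.50, 0.10, 0.15)))  # low (~0)
```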
A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part II: Applications, Cognitive Models, and Challenges
Kleyko, Denis, Rachkovskij, Dmitri A., Osipov, Evgeny, Rahimi, Abbas
This is Part II of the two-part comprehensive survey devoted to a computing framework most commonly known under the names Hyperdimensional Computing and Vector Symbolic Architectures (HDC/VSA). Both names refer to a family of computational models that use high-dimensional distributed representations and rely on the algebraic properties of their key operations to incorporate the advantages of structured symbolic representations and vector distributed representations. Holographic Reduced Representations is an influential HDC/VSA model that is well-known in the machine learning domain and often used to refer to the whole family. However, for the sake of consistency, we use HDC/VSA to refer to the area. Part I of this survey covered foundational aspects of the area, such as historical context leading to the development of HDC/VSA, key elements of any HDC/VSA model, known HDC/VSA models, and transforming input data of various types into high-dimensional vectors suitable for HDC/VSA. This second part surveys existing applications, the role of HDC/VSA in cognitive computing and architectures, as well as directions for future work. Most of the applications lie within the machine learning/artificial intelligence domain; however, we also cover other applications to provide a thorough picture. The survey is written to be useful for both newcomers and practitioners.
- North America > United States > California > Alameda County > Berkeley (0.14)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- (10 more...)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- (12 more...)
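For readers new to the area, here is a toy demonstration of the "key operations" the survey refers to (binding, bundling, and similarity-based cleanup), using a MAP-style bipolar model. This is generic background, not tied to any one model covered by the survey.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 10000

def rand_hv():
    """Random bipolar hypervector (MAP-style)."""
    return rng.choice([-1, 1], d)

def bind(a, b):   # binding: elementwise product (self-inverse)
    return a * b

def bundle(*vs):  # bundling: elementwise majority vote
    return np.sign(np.sum(vs, axis=0))

def sim(a, b):    # normalized dot product
    return (a @ b) / d

color, shape, size = rand_hv(), rand_hv(), rand_hv()  # roles
red, circle, large = rand_hv(), rand_hv(), rand_hv()  # fillers

record = bundle(bind(color, red), bind(shape, circle), bind(size, large))

# unbinding with a role yields a noisy filler; similarity acts as cleanup
print(sim(bind(record, color), red))     # ~0.5: the correct filler stands out
print(sim(bind(record, color), circle))  # ~0.0: unrelated symbol
```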
Parallelizing Legendre Memory Unit Training
Chilkuri, Narsimha, Eliasmith, Chris
Recently, a new recurrent neural network (RNN) named the Legendre Memory Unit (LMU) was proposed and shown to achieve state-of-the-art performance on several benchmark datasets. Here we leverage the linear time-invariant (LTI) memory component of the LMU to construct a simplified variant that can be parallelized during training (and yet executed as an RNN during inference), thus overcoming a well-known limitation of training RNNs on GPUs. We show that this reformulation, which can be applied generally to any deep network whose recurrent components are linear, makes training up to 200 times faster. To validate its utility, we compare its performance against the original LMU and a variety of published LSTM and transformer networks on seven benchmarks, ranging from psMNIST to sentiment analysis to machine translation. We demonstrate that our models exhibit superior performance on all datasets, often using fewer parameters. For instance, our LMU sets a new state-of-the-art result on psMNIST, and uses half the parameters while outperforming DistilBERT and LSTM models on IMDB sentiment analysis.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > India (0.04)
- North America > United States > Oregon > Clackamas County > West Linn (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > Canada > Ontario (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Information Technology (0.70)
- Health & Medicine (0.49)
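The key observation behind this line of work is that a linear recurrence x_t = A x_{t-1} + B u_t can be unrolled into x_t = sum_{k<=t} A^(t-k) B u_k, a convolution of the input with the impulse response H_k = A^k B, so all states can be computed at once instead of serially. A minimal NumPy sketch of that equivalence follows (not the paper's implementation; the dense Toeplitz kernel built here is for clarity only, and an efficient version would use batched convolutions or FFTs on a GPU).

```python
import numpy as np

def states_sequential(A, B, u):
    """Plain recurrence x_t = A x_{t-1} + B u_t (inherently serial)."""
    x, xs = np.zeros(A.shape[0]), []
    for u_t in u:
        x = A @ x + B * u_t
        xs.append(x)
    return np.stack(xs)

def states_parallel(A, B, u):
    """Same states with no serial pass over the sequence: linearity gives
    x_t = sum_{k<=t} A^(t-k) B u_k, i.e. a convolution of the input with
    the impulse response H_k = A^k B, reducible to one big matmul."""
    T, n = len(u), A.shape[0]
    H = np.empty((T, n))
    H[0] = B
    for k in range(1, T):  # cheap precompute of the impulse response
        H[k] = A @ H[k - 1]
    lag = np.subtract.outer(np.arange(T), np.arange(T))  # lag[t, k] = t - k
    kernel = np.where((lag >= 0)[..., None], H[np.clip(lag, 0, T - 1)], 0.0)
    return np.einsum("tkn,k->tn", kernel, u)  # xs[t] = sum_k kernel[t, k] u[k]

rng = np.random.default_rng(4)
n, T = 8, 64
A = 0.2 * rng.standard_normal((n, n))  # scaled down to keep the demo stable
B = rng.standard_normal(n)
u = rng.standard_normal(T)
assert np.allclose(states_sequential(A, B, u), states_parallel(A, B, u))
```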
Are Better Machine Training Approaches Ahead?
We live in a time of unparalleled use of machine learning (ML), but it relies on one approach to training the models that are implemented in artificial neural networks (ANNs) -- so named because they're not neuromorphic. Other training approaches, some more biomimetic than others, are being developed; the big question remains whether any of them will become commercially viable. ML training is frequently divided into two camps -- supervised and unsupervised -- but as it turns out, the divisions are not so clear-cut: the variety of approaches that exists defies neat pigeonholing.
A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions
Voelker, Aaron R., Rasmussen, Daniel, Eliasmith, Chris
The machine learning community has become increasingly interested in the energy efficiency of neural networks. The Spiking Neural Network (SNN) is a promising approach to energy-efficient computing, since its activation levels are quantized into temporally sparse, one-bit values (i.e., "spike" events), which additionally converts the sum over weight-activity products into a simple addition of weights (one weight for each spike). However, the goal of maintaining state-of-the-art (SotA) accuracy when converting a non-spiking network into an SNN has remained an elusive challenge, primarily due to spikes having only a single bit of precision. Adopting tools from signal processing, we cast neural activation functions as quantizers with temporally-diffused error, and then train networks while smoothly interpolating between the non-spiking and spiking regimes. We apply this technique to the Legendre Memory Unit (LMU) to obtain the first known example of a hybrid SNN outperforming SotA recurrent architectures---including the LSTM, GRU, and NRU---in accuracy, while reducing activities to at most 3.74 bits on average with 1.26 significant bits multiplying each weight. We discuss how these methods can significantly improve the energy efficiency of neural networks.
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
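The abstract's core idea (casting the spiking activation as a quantizer and smoothly interpolating between the non-spiking and spiking regimes) can be illustrated with a toy activation function. This sketch omits the temporally diffused error and the training machinery described in the abstract; the `step` size and `alpha` schedule are illustrative choices, not the paper's.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def quantized_relu(x, step=1.0):
    """Spiking-style activation: rounding the firing rate down to a multiple
    of `step` mimics the quantization a spiking implementation imposes
    (e.g., an integer number of spikes per time window)."""
    return np.floor(relu(x) / step) * step

def hybrid_activation(x, alpha, step=1.0):
    """Interpolate between the non-spiking (alpha=0) and spiking (alpha=1)
    regimes; annealing alpha from 0 to 1 during training lets the network
    adapt gradually to the quantization error."""
    return (1 - alpha) * relu(x) + alpha * quantized_relu(x, step)

x = np.linspace(-1.0, 3.0, 9)
for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_activation(x, alpha))
```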