AITopics | sequence learning

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

Neural Information Processing Systems, Dec-24-2025, 03:02:27 GMT

Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias towards learning temporally compressed representations, as they allow for attention over all previously computed elements in a sequence. Having a more compressed representation of a sequence may be beneficial for generalization, as a high-level representation may be more easily re-used and re-purposed and will contain fewer irrelevant details. At the same time, excessive compression of representations comes at the cost of expressiveness. We propose a solution which divides computation into two streams. A slow stream that is recurrent in nature aims to learn a specialized and compressed representation, by forcing chunks of $K$ time steps into a single representation which is divided into multiple vectors. At the same time, a fast stream is parameterized as a Transformer to process chunks consisting of $K$ time-steps conditioned on the information in the slow-stream. In the proposed approach we hope to gain the expressiveness of the Transformer, while encouraging better compression and structuring of representations in the slow stream. We show the benefits of the proposed method in terms of improved sample efficiency and generalization performance as compared to various competitive baselines for visual perception and sequential decision making tasks.
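The two-stream control flow described in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the weight-free `attend` function, the residual additions, and the names `two_stream`, `slow_state`, and `chunk_size` (standing in for $K$) are all hypothetical, and the fast stream here has no learned parameters or causal mask within a chunk.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Weight-free dot-product attention (assumption: no learned projections).
    Each query returns a softmax-weighted average of the key/value vectors."""
    dim = len(keys_values[0])
    scale = math.sqrt(dim)
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, kv)) / scale for kv in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[d] for w, kv in zip(weights, keys_values))
                    for d in range(dim)])
    return out


def add(xs, ys):
    """Element-wise residual addition of two lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def two_stream(sequence, slow_state, chunk_size):
    """Process `sequence` in chunks of `chunk_size` steps.
    Fast stream: attention inside the chunk, conditioned on the slow state.
    Slow stream: one recurrent update per chunk into a fixed set of vectors."""
    outputs = []
    for start in range(0, len(sequence), chunk_size):
        chunk = sequence[start:start + chunk_size]
        fast = add(chunk, attend(chunk, chunk))     # mix tokens within the chunk
        fast = add(fast, attend(fast, slow_state))  # read from the slow stream
        outputs.extend(fast)
        # The chunk is compressed into the same number of slow vectors each time.
        slow_state = add(slow_state, attend(slow_state, fast))
    return outputs, slow_state
```

Because the slow state is only updated after a chunk has been processed, the outputs for earlier chunks cannot depend on later tokens, and the slow state keeps a fixed size however long the sequence grows.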

fast and slow processing mechanism, representation, temporal latent bottleneck, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Sequence to Sequence Learning with Neural Networks

Neural Information Processing Systems, Sep-30-2025, 09:18:19 GMT

Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT-14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words.
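The encoder-decoder interface in this abstract (one LSTM folds the input into a fixed-size vector, a second LSTM unrolls the output from it) can be sketched as follows. This is a minimal, untrained illustration under several assumptions: a single layer instead of the multilayered LSTM the abstract mentions, random weights, greedy decoding, one-hot inputs, and reuse of the end-of-sequence token as the start symbol. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, rng):
    """Random, untrained weights for one LSTM layer (illustration only)."""
    rows, cols = 4 * hidden_size, input_size + hidden_size
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"W": weights, "b": [0.0] * rows, "hidden": hidden_size}


def lstm_step(params, x, h, c):
    """One LSTM step: all gates are computed from the concatenation [x; h]."""
    n = params["hidden"]
    z = x + h
    pre = [sum(w * v for w, v in zip(row, z)) + b
           for row, b in zip(params["W"], params["b"])]
    i = [sigmoid(p) for p in pre[:n]]            # input gate
    f = [sigmoid(p) for p in pre[n:2 * n]]       # forget gate
    o = [sigmoid(p) for p in pre[2 * n:3 * n]]   # output gate
    g = [math.tanh(p) for p in pre[3 * n:]]      # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def encode(encoder, tokens, vocab_size):
    """Fold a token sequence of any length into a fixed-size (h, c) state."""
    n = encoder["hidden"]
    h, c = [0.0] * n, [0.0] * n
    for token in tokens:
        h, c = lstm_step(encoder, one_hot(token, vocab_size), h, c)
    return h, c


def decode(decoder, projection, state, vocab_size, eos, max_len):
    """Greedily unroll a second LSTM from the encoder state until EOS."""
    h, c = state
    token, output = eos, []  # assumption: EOS doubles as the start symbol
    for _ in range(max_len):
        h, c = lstm_step(decoder, one_hot(token, vocab_size), h, c)
        logits = [sum(w * v for w, v in zip(row, h)) for row in projection]
        token = max(range(vocab_size), key=logits.__getitem__)
        if token == eos:
            break
        output.append(token)
    return output
```

With untrained weights the decoded tokens are meaningless; the point of the sketch is that the encoder state has the same dimensionality for a two-token input as for a long one, which is the fixed-dimensional bottleneck the abstract describes.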

name change, neural network, sequence learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

Neural Information Processing Systems, May-27-2025, 01:39:05 GMT

Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias towards learning temporally compressed representations, as they allow for attention over all previously computed elements in a sequence. Having a more compressed representation of a sequence may be beneficial for generalization, as a high-level representation may be more easily re-used and re-purposed and will contain fewer irrelevant details. At the same time, excessive compression of representations comes at the cost of expressiveness. We propose a solution which divides computation into two streams.

artificial intelligence, machine learning, representation, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

Neural Information Processing Systems, Oct-10-2024, 20:43:06 GMT

Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias towards learning temporally compressed representations, as they allow for attention over all previously computed elements in a sequence. Having a more compressed representation of a sequence may be beneficial for generalization, as a high-level representation may be more easily re-used and re-purposed and will contain fewer irrelevant details. At the same time, excessive compression of representations comes at the cost of expressiveness. We propose a solution which divides computation into two streams.

fast and slow processing mechanism, representation, temporal latent bottleneck, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Reviews: Wider and Deeper, Cheaper and Faster: Tensorized LSTMs for Sequence Learning

Neural Information Processing Systems, Oct-8-2024, 00:33:52 GMT

This paper proposes Tensorized LSTMs for efficient sequence learning. It represents hidden layers as tensors, and employs cross-layer memory cell convolution for efficiency and effectiveness. The model is clearly formulated. Experimental results show the utility of the proposed method. Although the paper is well written, I still have some questions/confusion as follows.

artificial intelligence, convolution, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Summary/Review (0.38)

Industry: Energy > Oil & Gas > Upstream (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hyperdimensional Vector Tsetlin Machines with Applications to Sequence Learning and Generation

Blakely, Christian D.

arXiv.org Artificial Intelligence, Aug-29-2024

A large part of the design of any data learning agent lies in feature extraction from the underlying data, and in how those features are computed and represented. The best processes for extracting features for learning information from data typically take advantage of expert knowledge of the underlying data to expose the most relevant features, reduce noise, and extract the greatest amount of independent information in the data. For many types of datasets, this might be challenging due to factors such as incoherence, abstractedness, or the sheer amount of noise present in the data. In designing features for Tsetlin machines, one is tasked to booleanize (or binarize) the underlying data, and in the presence of noise, this can be challenging. Furthermore, for notoriously complex high-dimensional data like noisy sequences, graphs, images, signal spectra, and natural language, creating encodings that are also interpretable for human reasoning in any post-hoc process can be difficult, because the resulting logical AND expressions must both take advantage of the relevant information in the data and be accurate enough to compete with other machine learning models. In this paper, we explore using Hyperdimensional Vector Computing (HV computing, or simply HVC) as an input layer to a novel Tsetlin machine architecture and apply it to learning, classifying, predicting, and generating sequences. Here, we argue that HVC can provide a robust layer of feature extraction due to its many computational advantages. This approach was first introduced in [1] and here, we streamline the approach to focus on sequences while further leveraging other attributes of HVC such as N-Gram sequence encoding and associative memory, combined with TMs, to create a powerful hybrid methodology while remaining minimalist in the memory size of the overall model.
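To make the N-gram sequence encoding mentioned in this abstract concrete, here is a minimal sketch of one common hyperdimensional scheme: random bipolar (+1/-1) hypervectors, binding by element-wise multiplication, position marking by cyclic shift, and bundling by majority sign. This is an assumption chosen for illustration; the abstract does not say which representation or operators the paper uses, and every name below (`DIM`, `encode_sequence`, and so on) is hypothetical.

```python
import random

DIM = 2048  # hypervector dimensionality (assumed; larger values reduce noise)


def random_hv(rng):
    """A random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def permute(hv, shift):
    """Cyclic shift, used to mark a symbol's position inside an n-gram."""
    shift %= len(hv)
    return hv[-shift:] + hv[:-shift] if shift else list(hv)


def bind(a, b):
    """Binding by element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]


def bundle(hvs):
    """Bundling by majority sign; ties are resolved to +1 in this sketch."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*hvs)]


def encode_ngram(symbols, item_memory):
    """Bind the shifted symbol vectors of one n-gram; earlier symbols shift more."""
    n = len(symbols)
    out = [1] * DIM
    for pos, sym in enumerate(symbols):
        out = bind(out, permute(item_memory[sym], n - 1 - pos))
    return out


def encode_sequence(seq, item_memory, n=3):
    """Bundle the hypervectors of all n-grams in the sequence."""
    grams = [encode_ngram(seq[i:i + n], item_memory)
             for i in range(len(seq) - n + 1)]
    return bundle(grams)


def similarity(a, b):
    """Normalized dot product; 1.0 for identical bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)
```

Under this scheme, sequences that share most of their n-grams map to similar hypervectors, while a sequence with the same symbols in a different order maps to a largely unrelated one. An associative memory can then be approximated by storing one encoded vector per class and returning the class with the highest `similarity`.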

hyperdimensional vector tsetlin machine, sequence, vector, (13 more...)

arXiv.org Artificial Intelligence

2408.1662

Country: Europe > Norway (0.04)

Genre:

Research Report (0.50)
Overview (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

Chen, Yiwen, Wang, Yikai, Luo, Yihao, Wang, Zhengyi, Chen, Zilong, Zhu, Jun, Zhang, Chi, Lin, Guosheng

arXiv.org Artificial Intelligence, Aug-5-2024

We introduce MeshAnything V2, an autoregressive transformer that generates Artist-Created Meshes (AM) aligned to given shapes. It can be integrated with various 3D asset production pipelines to achieve high-quality, highly controllable AM generation. MeshAnything V2 surpasses previous methods in both efficiency and performance using models of the same size. These improvements are due to our newly proposed mesh tokenization method: Adjacent Mesh Tokenization (AMT). Different from previous methods that represent each face with three vertices, AMT uses a single vertex whenever possible. Compared to previous methods, AMT requires about half the token sequence length to represent the same mesh on average. Furthermore, the token sequences from AMT are more compact and well-structured, fundamentally benefiting AM generation. Our extensive experiments show that AMT significantly improves the efficiency and performance of AM generation. Project Page: https://buaacyw.github.io/meshanything-v2/
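The idea of spending a single vertex on a face "whenever possible" can be illustrated with a simplified, triangle-strip-like tokenizer. This is a sketch under assumptions, not the paper's algorithm: it assumes a face can be written with one new vertex only when it contains the last two vertices emitted, it uses a hypothetical `"&"` separator token to start a new run, it takes the face order as given, and it does not preserve face orientation.

```python
SEP = "&"  # hypothetical marker that starts a new run of adjacent faces


def encode_faces(faces):
    """Emit three vertices for a face that starts a run, and one vertex for a
    face that contains the last two emitted vertices (i.e. shares that edge)."""
    tokens = []
    window = None  # the last two vertices emitted in the current run
    for face in faces:
        if window is not None and set(window) <= set(face):
            new_vertex = (set(face) - set(window)).pop()
            tokens.append(new_vertex)
            window = (window[1], new_vertex)
        else:
            if tokens:
                tokens.append(SEP)
            tokens.extend(face)
            window = (face[1], face[2])
    return tokens


def decode_tokens(tokens):
    """Rebuild faces: every three consecutive vertices in a run form a face."""
    faces, run = [], []
    for tok in list(tokens) + [SEP]:
        if tok == SEP:
            faces.extend(tuple(run[i:i + 3]) for i in range(len(run) - 2))
            run = []
        else:
            run.append(tok)
    return faces
```

For the face list `[(0, 1, 2), (1, 2, 3), (2, 3, 4), (7, 8, 9)]` this produces 9 tokens instead of the 12 needed when every face is written with three vertices. The saving depends on how often consecutive faces share an edge, so the ordering of faces matters in practice.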

mesh, sequence, vertex, (12 more...)

arXiv.org Artificial Intelligence

2408.02555

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Sequence learning with hidden units in spiking neural networks

Neural Information Processing Systems, Mar-15-2024, 04:57:16 GMT

hidden unit, sequence learning, spiking neural network

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)

On learning spatial sequences with the movement of attention

Osaulenko, Viacheslav M.

arXiv.org Artificial Intelligence, Nov-12-2023

In this paper we start with a simple question: how is it possible that humans can recognize different movements over skin with only a prior visual experience of them? Or, in general, what is the representation of spatial sequences that are invariant to scale, rotation, and translation across different modalities? To answer, we rethink the mathematical representation of spatial sequences, argue against the minimum description length principle, and focus on the movements of attention. We advance the idea that spatial sequences must be represented on different levels of abstraction; this adds redundancy but is necessary for recognition and generalization. To address the open question of how these abstractions are formed, we propose two hypotheses: the first invites exploring selectionism learning instead of finding parameters in some models; the second proposes to find new data structures, not neural network architectures, to efficiently store and operate over redundant features to be further selected. Movements of attention are central to human cognition, and the lessons should be applied to new, better learning algorithms.

representation, sequence, spatial sequence, (17 more...)

arXiv.org Artificial Intelligence

2311.06856

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.49)

Sequence to Sequence (Seq2Seq) models

#artificialintelligence, Jan-3-2023, 06:35:09 GMT

When learning about time-series models, we might have come across various sequence learning problems, such as Stock Market Prediction, Story Telling, and AutoComplete, that could be learnt by traditional neural networks like the Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM). But will these Artificial Neural Networks (ANNs) work even when the problem gets complicated? Seq2Seq models are a special class of models that make minimal sequence structure assumptions and could be used to solve complex sequence problems. Let's discuss the Seq2Seq models on the following topics. Which of these tasks do you think could be solved by a single neural network?

neural network, seq2seq model, sequence, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
