Self-attention with Functional Time Representation Learning
Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan
Sequential modelling with self-attention has achieved cutting-edge performance in natural language processing. With advantages in model flexibility, computational complexity and interpretability, self-attention is gradually becoming a key component in event sequence models. However, like most other sequence models, self-attention does not account for the time spans between events and thus captures sequential signals rather than temporal patterns. Without relying on recurrent network structures, self-attention recognizes event orderings via positional encoding. To bridge the gap between modelling time-independent and time-dependent event sequences, we introduce a functional feature map that embeds time spans into a high-dimensional space. By constructing the associated translation-invariant time kernel function, we reveal the functional forms of the feature map under classical functional analysis results, namely Bochner's Theorem and Mercer's Theorem. We propose several models to learn the functional time representation and its interactions with event representations. These methods are evaluated on real-world datasets under various continuous-time event sequence prediction tasks. The experiments show that the proposed methods compare favorably to baseline models while also capturing useful time-event interactions.
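To make the Bochner construction concrete, below is a minimal sketch in PyTorch, not the authors' released code. The class name BochnerTimeEmbedding and the Gaussian initialization of the frequencies are illustrative assumptions; the key property, which the abstract describes, is that the dot product of two embeddings approximates a translation-invariant kernel of the time difference, with the frequencies learned by backpropagation.

```python
import torch
import torch.nn as nn


class BochnerTimeEmbedding(nn.Module):
    """Phi(t) = sqrt(1/d) [cos(w_1 t), ..., cos(w_d t), sin(w_1 t), ..., sin(w_d t)].

    By Bochner's theorem, Phi(t1) . Phi(t2) = (1/d) sum_i cos(w_i (t1 - t2))
    is a Monte Carlo estimate of a translation-invariant kernel K(t1 - t2).
    """

    def __init__(self, dim: int):
        super().__init__()
        assert dim % 2 == 0, "need an even dimension for cos/sin pairs"
        # illustrative choice: Gaussian-initialized learnable frequencies
        self.freqs = nn.Parameter(torch.randn(dim // 2))

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (...,) scalar time spans -> (..., dim) feature vectors
        phase = t.unsqueeze(-1) * self.freqs
        emb = torch.cat([torch.cos(phase), torch.sin(phase)], dim=-1)
        return emb / (self.freqs.numel() ** 0.5)


# usage: embed three time spans into R^64
emb = BochnerTimeEmbedding(dim=64)
print(emb(torch.tensor([0.0, 1.5, 3.2])).shape)  # torch.Size([3, 64])
```

Because the inner product of two such vectors depends only on the difference of their inputs, the embedding encodes time spans rather than absolute positions.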
Reviews: Self-attention with Functional Time Representation Learning
Originality: The application of self-attention to continuous-time event sequences is an interesting approach. The authors clearly note the shortcoming of self-attention when applied to such problems. They propose translation-invariant time kernel functions justified by classical functional analysis theorems and implement four new time embeddings that can be optimized by backpropagation and are compatible with self-attention. I believe the proposed time embeddings are novel and generalizable to other temporal tasks. Quality: Motivating the method from classical functional analysis results [12] and [14] and developing differentiable time embeddings is the key contribution of this paper.
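As a rough illustration of how such embeddings are "compatible with self-attention", the sketch below shows one plausible wiring (our assumption, not the paper's exact architecture): the functional time embedding is concatenated with the event embedding before computing queries, keys and values, so attention weights can depend on time spans rather than positions alone.

```python
import torch
import torch.nn as nn


class TimeAwareSelfAttention(nn.Module):
    """Single-head self-attention with positional encoding replaced by a
    functional time embedding: queries, keys and values are computed from
    the concatenation [event embedding || time embedding], letting the
    attention weights depend on the time spans between events."""

    def __init__(self, event_dim: int, time_dim: int, hidden_dim: int):
        super().__init__()
        in_dim = event_dim + time_dim
        self.q = nn.Linear(in_dim, hidden_dim)
        self.k = nn.Linear(in_dim, hidden_dim)
        self.v = nn.Linear(in_dim, hidden_dim)

    def forward(self, event_emb: torch.Tensor, time_emb: torch.Tensor) -> torch.Tensor:
        # event_emb: (batch, seq, event_dim); time_emb: (batch, seq, time_dim)
        x = torch.cat([event_emb, time_emb], dim=-1)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        return scores.softmax(dim=-1) @ v
```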