Reviews: Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.
–Neural Information Processing Systems
Two or three relevant citations: Transformer models should probably be mentioned in the section on "models designed specifically for use on sequences", since they are competing heavily with the referenced baselines on NLP tasks especially. I believe your numbers on the Yelp dataset compare very favorably to the "sentiment neuron" work from Radford et al https://arxiv.org/abs/1704.01444 - that could be a nice addition and add further external context to your results. Some questions about the architecture, particularly the importance of the "additive skip connection" from input to output - how crucial is this connection, since it somewhat allows the network to bypass the TFiLM layers entirely? Does using a stacked skip (with free trainable parameters) still work, or does it hurt network training / break it completely? What is the SNR of the cubic interpolation used as input for the audio experiments?
Neural Information Processing Systems
Jan-22-2025, 11:51:29 GMT
- Technology: