[D] Future of LSTM and GRU given rise of causal convolution? • r/MachineLearning
I am currently of the opinion that the unbounded receptive field of RNNs is often a curse. I have tried many models where hard truncation (resetting the memory) at a fixed or even random interval was important to get them to actually work in generation. I think a lot of what people care about in generative models is more like "medium term" dependencies (more exactly, do true "long term dependencies" exist?). At least one case in particular here is burned in my brain forever. Hierarchies are often useful, whether you get them from multiple RNNs and skip connections, HM-RNN, SampleRNN, fixed interval hidden state passing from a fast RNN to a "slow" one, WaveNet style dilated convolutions, or in more roundabout ways using trees, memory, stacks, etc. One really interesting part of these convolutional generative models was pointed out to me by Laurent Dinh, and I mention it in this review of PixelRNN/CNN: growing the dependency chain over depth makes tons of sense for a lot of problems, and is a general idea that is useful in a ton of domains.
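To make the "growing the dependency chain over depth" point concrete, here is a rough sketch (not taken from WaveNet or any of the papers above) of how the bounded receptive field of a stack of dilated causal convolutions grows with depth. The function names, the kernel size of 2, and the all-ones weights are assumptions picked purely for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Number of time steps (current step included) that one output of a
    stack of dilated causal convolutions can see."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: y[t] = sum_k weights[k] * x[t - k * dilation].
    Positions before the start of the sequence are treated as zero."""
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * x[idx]
        y.append(total)
    return y


def stack(x, kernel_size, dilations):
    """Apply one layer per dilation. The all-ones weights are hypothetical,
    chosen only so every reachable input leaves a nonzero trace."""
    for d in dilations:
        x = causal_dilated_conv(x, [1.0] * kernel_size, d)
    return x


dilations = [1, 2, 4, 8]                   # doubling per layer
rf = receptive_field(2, dilations)         # 1 + (1 + 2 + 4 + 8)
rf_undilated = receptive_field(2, [1] * 4) # same depth, no dilation

# Impulse test: a single 1.0 at t=0 should influence exactly `rf` outputs,
# after which the model has "forgotten" it by construction.
impulse = [1.0] + [0.0] * 31
response = stack(impulse, 2, dilations)
reach = sum(1 for v in response if v != 0.0)
```

With four layers the dilated stack sees 16 steps versus 5 for the undilated one, so the context window grows exponentially with depth yet stays hard-bounded, which is roughly the "medium term" behaviour that truncating an RNN's memory is trying to approximate.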
Dec-24-2017, 17:30:31 GMT