ExplainMySurprise: LearningEfficientLong-Term MemorybyPredictingUncertainOutcomes

Neural Information Processing Systems 

In many sequential tasks, a model needs to remember relevant events from the distant past to make correct predictions. Unfortunately, a straightforward application ofgradient based training requires intermediate computations tobestored for every element of a sequence. This requires to store prohibitively large intermediate data ifasequence consists ofthousands oreven millions elements, and asaresult, makeslearning ofverylong-term dependencies infeasible.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found