

 comprehension






Towards

Neural Information Processing Systems

The Goldilocks phase is reminiscent of "intelligence from starvation" in Darwinian evolution, where resource limitations drive discovery of more efficient solutions.



96671501524948bc3937b4b30d0e57b9-Paper.pdf

Neural Information Processing Systems

BERT is incapable of processing long texts due to its quadratically increasing memory and time consumption. The most natural ways to address this problem, such as slicing the text by a sliding window or simplifying transformers, suffer from insufficient long-range attentions or need customized CUDA kernels.