Information Locality as an Inductive Bias for Neural Language Models
