On Difficulties of Attention Factorization through Shared Memory