SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers