Attention Learning is Needed to Efficiently Learn Parity Function