On Predictive Information Sub-optimality of RNNs