Revisiting Feature Prediction for Learning Visual Representations from Video