Off-Policy Evaluation and Learning for the Future under Non-Stationarity