Adaptive Estimation and Optimal Control in Offline Contextual MDPs without Stationarity