A Additional Results

Neural Information Processing Systems 

We use a sliding window approach to generate sequence samples, and we produce multi-step-ahead predictions autoregressively, feeding each prediction back as input for the next step. We compare our model with a series of baselines on multi-step forecasting across different dynamics. In addition to the root mean square error (RMSE), we report the energy spectrum error (ESE) for ocean current prediction, which quantifies physical consistency. Our model learns on multiple tasks simultaneously and then adapts to new tasks through domain transfer.
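The sampling and evaluation procedure described above can be illustrated with a minimal sketch. It is not the implementation used in our experiments: the function names (`sliding_windows`, `autoregressive_forecast`, `rmse`, `persistence_model`), the scalar time series, and the persistence baseline are all assumptions chosen for illustration, and the ESE metric is omitted because its definition is not given in this section.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], input_len: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Cut one long series into overlapping (input, target) samples."""
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        split = start + input_len
        inputs = list(series[start:split])
        targets = list(series[split:split + horizon])
        samples.append((inputs, targets))
    return samples


def autoregressive_forecast(
    one_step_model: Callable[[List[float]], float],
    history: Sequence[float],
    horizon: int,
) -> List[float]:
    """Roll a one-step model forward, feeding each prediction back as input."""
    window = list(history)
    predictions = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        predictions.append(next_value)
        # Drop the oldest value and append the prediction to keep the window length fixed.
        window = window[1:] + [next_value]
    return predictions


def rmse(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_model(window: List[float]) -> float:
    """Hypothetical stand-in for a learned model: repeat the last observed value."""
    return window[-1]


# Toy usage on a short, made-up series.
series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
samples = sliding_windows(series, input_len=3, horizon=2)
inputs, targets = samples[0]
forecast = autoregressive_forecast(persistence_model, inputs, horizon=2)
error = rmse(forecast, targets)
```

In this toy setting the error grows with the horizon because the persistence baseline never updates its estimate. A learned one-step model would take the place of `persistence_model`, and for spatial fields such as ocean currents each element of the window would be an array rather than a scalar.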