Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization