Learning Hard Optimization Problems: A Data Generation Perspective