Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization