Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse