Adaptive Learning of Design Strategies over Non-Hierarchical Multi-Fidelity Models via Policy Alignment