Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback