Learning to Construct Knowledge through Sparse Reference Selection with Reinforcement Learning