CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning