Vertical Symbolic Regression via Deep Policy Gradient