A Policy Gradient for Sub-task Tree