Towards Optimal Environmental Policies: Policy Learning under Arbitrary Bipartite Network Interference