Appendices
Neural Information Processing Systems
1. When $e \notin W_\Phi$, we have $E = \mathbb{R}^d$ and $W_{\Phi,E} = W_\Phi$. By Theorem 1 in [10], we know that the projected Bellman equation (3.4) has a unique fixed point $\theta^*$. Thus, $L = \{\theta^*\}$.

2. When $e \in W_\Phi$, $\theta_e$ is the unique solution to $\Phi\theta = e$, as $\Phi$ has full column rank. We first show that the set of solutions to the projected Bellman equation (3.4) takes the form $\{\theta^* + c\theta_e \mid c \in \mathbb{R}\}$, where $\theta^*$ is any solution to (3.4). On the other hand, suppose that $\theta$ is not of the form $\theta^* + c\theta_e$.
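The linear-algebra facts used in the second case can be checked numerically. The sketch below is only an illustration under stated assumptions: the feature matrix `Phi`, its size, and the random seed are hypothetical and not taken from the paper, and the projected Bellman equation (3.4) itself is not implemented. It shows that when the all-ones vector e lies in the column span of a full-column-rank Φ, the system Φθ = e has a single solution θ_e, and that moving θ along θ_e shifts Φθ by a multiple of e.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, d = 6, 3  # hypothetical sizes, chosen only for illustration

# Hypothetical feature matrix whose first column is the all-ones vector,
# so e lies in the column span of Phi by construction.
Phi = np.column_stack([np.ones(n_states), rng.standard_normal((n_states, d - 1))])
e = np.ones(n_states)

# Full column rank means Phi @ theta = e has at most one solution.
rank = np.linalg.matrix_rank(Phi)

# Least squares recovers that solution; the residual is zero because e is in the span.
theta_e, *_ = np.linalg.lstsq(Phi, e, rcond=None)
residual = np.linalg.norm(Phi @ theta_e - e)

# Moving any theta along theta_e shifts Phi @ theta by c * e.
theta = rng.standard_normal(d)
c = 2.5
shift_gap = np.linalg.norm(Phi @ (theta + c * theta_e) - (Phi @ theta + c * e))

print(rank, residual, shift_gap)
```

Because the first column of this particular `Phi` is e itself, `theta_e` comes out as the first standard basis vector; for a general Φ with e in its span, θ_e would be whatever coefficient vector reproduces e.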
Apr-24-2026, 14:35:09 GMT