Parallel Heuristic Search as Inference for Actor-Critic Reinforcement Learning Models
