A Implementation details

We set the number of diffusion steps of both diffusion models to M = 40. We use Adam [6] as the optimizer, with the learning rate and batch size set to 1e-3 and 256, respectively. All experiments are performed on an NVIDIA RTX 3090 GPU with PyTorch 1.11.0 [5]. For the network ϕ of the BCDUnit, we set the hidden size of the Bi-LSTM to 256, and set the output dimensions of the two-layer MLP in the Gate to 128 and 1. The scene context represents the map information around the target agent.
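
For concreteness, the following is a minimal PyTorch sketch of these settings. Only the Bi-LSTM hidden size (256), the Gate MLP output dimensions (128 and 1), and the Adam hyperparameters (learning rate 1e-3, batch size 256) come from the text; the input dimensions, activations, the way the gate is applied, and the class names GateMLP and BCDUnitPhi are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class GateMLP(nn.Module):
    """Two-layer MLP of the Gate; layer output dimensions are 128 and 1 (from the text)."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128),   # first layer: output dimension 128
            nn.ReLU(),                # activation is an assumption
            nn.Linear(128, 1),        # second layer: output dimension 1
            nn.Sigmoid(),             # squash to (0, 1) as a gating weight (assumption)
        )

    def forward(self, x):
        return self.net(x)

class BCDUnitPhi(nn.Module):
    """Sketch of the network phi of the BCDUnit: Bi-LSTM (hidden size 256) plus the Gate."""
    def __init__(self, in_dim=2, hidden_size=256):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden_size,
                              batch_first=True, bidirectional=True)
        self.gate = GateMLP(in_dim=2 * hidden_size)  # Bi-LSTM outputs 2 * hidden_size features

    def forward(self, x):
        h, _ = self.bilstm(x)   # (B, T, 2 * hidden_size)
        g = self.gate(h)        # (B, T, 1) gating weights
        return h * g            # how the gate modulates the features is an assumption

# Training configuration stated in the text: Adam optimizer, lr = 1e-3, batch size = 256.
model = BCDUnitPhi()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch_size = 256

# Dummy forward pass: a batch of 2-D trajectories of length 20 (shapes are illustrative).
x = torch.randn(batch_size, 20, 2)
out = model(x)
print(out.shape)  # torch.Size([256, 20, 512])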