A Experimental Details

We dynamically batch model calls onto the GPU to increase inference speed (a sketch of this batching scheme is given below). For OBL, there are dependencies between policy and belief training: each belief model is trained on trajectories generated by a policy, and each OBL policy is in turn trained against a learned belief model. The entire inference and training infrastructure for a single policy or belief model runs on a machine with 30 CPU cores and 2 GPUs, one GPU for training and one for simulation. We use the public-LSTM architecture design from prior work, in which a 3-layer feedforward neural network encodes the entire private observation (see the second sketch below).
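The following is a minimal sketch of such a dynamic batcher, assuming a PyTorch model served to many Python-level simulation threads. The `DynamicBatcher` class, its queue-and-timeout flushing policy, and the `max_batch` / `timeout_s` values are illustrative assumptions, not the implementation used in our experiments.

```python
# Hypothetical sketch of dynamic batching for GPU inference; names and
# parameters are illustrative, not the paper's actual infrastructure.
import queue
import threading
import torch

class DynamicBatcher:
    """Collects inference requests from many simulation threads and runs
    them through the model in a single batched forward pass on the GPU."""

    def __init__(self, model, device, max_batch=64, timeout_s=0.001):
        self.model = model.to(device).eval()
        self.device = device
        self.max_batch = max_batch
        self.timeout_s = timeout_s
        self.requests = queue.Queue()

    def infer(self, obs):
        """Called by a simulation thread; blocks until its result is ready."""
        done = threading.Event()
        slot = {}
        self.requests.put((obs, done, slot))
        done.wait()
        return slot["out"]

    def serve_forever(self):
        """Run on a dedicated thread: drain the queue, batch, forward, scatter."""
        while True:
            batch = [self.requests.get()]  # block until at least one request
            # Keep collecting until the batch is full or the queue goes quiet.
            while len(batch) < self.max_batch:
                try:
                    batch.append(self.requests.get(timeout=self.timeout_s))
                except queue.Empty:
                    break
            obs = torch.stack([b[0] for b in batch]).to(self.device)
            with torch.no_grad():
                out = self.model(obs).cpu()
            # Scatter each row of the batched output back to its caller.
            for i, (_, done, slot) in enumerate(batch):
                slot["out"] = out[i]
                done.set()
```

The timeout trades latency for batch size: waiting slightly longer lets more simulation threads join the same forward pass, which is what makes running many games against one shared GPU model efficient.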
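Below is a minimal PyTorch sketch of a public-LSTM-style network under the constraint the design implies: the recurrent state is computed from the public observation only, while the entire private observation passes through the 3-layer feedforward encoder. The hidden sizes, the two-layer LSTM, and the elementwise combination of the two streams are illustrative assumptions rather than the exact configuration.

```python
# Hypothetical sketch of a public-LSTM-style network; layer sizes and the
# elementwise stream combination are assumptions, not the exact architecture.
import torch
import torch.nn as nn

class PublicLSTMNet(nn.Module):
    def __init__(self, priv_dim, publ_dim, hid_dim, num_actions):
        super().__init__()
        # 3-layer feedforward encoder for the entire private observation.
        self.priv_net = nn.Sequential(
            nn.Linear(priv_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
        )
        # Recurrent encoder over the public observation stream only, so the
        # LSTM hidden state never depends on private information.
        self.publ_net = nn.Linear(publ_dim, hid_dim)
        self.lstm = nn.LSTM(hid_dim, hid_dim, num_layers=2)
        self.head = nn.Linear(hid_dim, num_actions)

    def forward(self, priv_obs, publ_obs, hid=None):
        # priv_obs, publ_obs: [seq_len, batch, feature_dim]
        p = self.priv_net(priv_obs)
        x, hid = self.lstm(self.publ_net(publ_obs), hid)
        # Combine the public (recurrent) and private (feedforward) streams;
        # the elementwise product here is one plausible choice.
        return self.head(x * p), hid
```

Keeping the recurrent state a function of public information alone is the point of the public-LSTM design; the private stream then modulates the public features at each step without contaminating the carried-over hidden state.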