1 Supplement

1.1 Model Architectures
Figure 1: Model Architectures for Latent Integration. Using a latent vector of dimension k, our multiplicative model learns k interpretations of the observation, each modulated by one dimension of the latent vector. A skip connection allows the model to learn policies faster than it would without one. As a baseline, we use a concatenation model, in which the latent vector z is concatenated with the environment observation at each timestep. In both cases, a learned policy could completely ignore the latent vector by setting the corresponding model weights to zero, recovering a standard RL policy architecture. In practice, since k and d are small in our experiments (k = 3 and d ∈ {16, 32, 64}), the increase in computational cost is not significant.
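The caption describes the two integration schemes but not their exact layer layout. The following is a minimal PyTorch sketch of one plausible reading, assuming a single linear layer per path; the module names MultiplicativeIntegration and ConcatIntegration, and the choice of a linear skip projection, are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn


class MultiplicativeIntegration(nn.Module):
    """Sketch of the multiplicative model: k interpretations of the
    observation, each scaled by one entry of the latent z, plus a
    latent-independent skip connection."""

    def __init__(self, obs_dim: int, k: int = 3, d: int = 32):
        super().__init__()
        self.k, self.d = k, d
        # One linear map producing k interpretation vectors of width d.
        self.interpretations = nn.Linear(obs_dim, k * d)
        # Skip path: a direct projection of the observation (assumed form).
        self.skip = nn.Linear(obs_dim, d)

    def forward(self, obs: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # obs: (batch, obs_dim), z: (batch, k)
        heads = self.interpretations(obs).view(-1, self.k, self.d)
        # Modulate interpretation i by z_i and sum over the k heads.
        modulated = (z.unsqueeze(-1) * heads).sum(dim=1)
        return modulated + self.skip(obs)


class ConcatIntegration(nn.Module):
    """Concatenation baseline: z is appended to the observation."""

    def __init__(self, obs_dim: int, k: int = 3, d: int = 32):
        super().__init__()
        self.fc = nn.Linear(obs_dim + k, d)

    def forward(self, obs: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([obs, z], dim=-1))
```

As noted in the caption, either module can ignore the latent: if the weights multiplying z (or the interpretation heads) are zero, only the skip or observation path remains, which is a standard feed-forward policy layer.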