A Code
–Neural Information Processing Systems
We use the same hyperparameters as in large scale curiosity: a learning rate of 0.0001 for all models, a discount factor γ of 0.99, and 3 optimization epochs per rollout.

Here we present results on using audio in baselines, as described in the main paper ablations section. In the first baseline, the prediction space is concatenated audio and visual features: the intrinsic model takes an audio-visual feature vector as input and predicts an audio-visual feature vector as output. The results from the audio-visual prediction baseline are shown in Figure 9. In the second baseline, we add audio to random network distillation [35].
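As a rough illustration of the setup described above, the following minimal sketch collects the three stated hyperparameters and shows what a concatenated audio-visual prediction space could look like. The names `Hyperparams`, `concat_features`, and `prediction_error` are hypothetical, and using mean squared prediction error as the intrinsic signal is an assumption made for illustration; the text does not specify the loss or the feature extractors.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Hyperparams:
    """Values stated in the text; the field names are hypothetical."""
    learning_rate: float = 1e-4   # shared by all models
    gamma: float = 0.99           # discount factor
    epochs_per_rollout: int = 3   # optimization epochs per rollout


def concat_features(visual: Sequence[float], audio: Sequence[float]) -> List[float]:
    """Build the joint audio-visual feature vector by concatenation."""
    return list(visual) + list(audio)


def prediction_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predicted and actual audio-visual features.

    Assumption: the intrinsic signal is this error; the source does not say so.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if len(target) == 0:
        raise ValueError("feature vectors must be non-empty")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


if __name__ == "__main__":
    hp = Hyperparams()
    # Toy features standing in for the outputs of real encoders.
    target = concat_features(visual=[0.5, -0.5], audio=[1.0])
    predicted = [0.5, -0.5, 0.0]  # a hypothetical model output
    print(hp, prediction_error(predicted, target))
```

In this sketch both the input and the output of the intrinsic model live in the same concatenated space, which matches the first baseline's description; the second baseline (audio added to random network distillation) is not covered here.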
Feb-9-2026, 18:41:49 GMT