Supplementary Material for 3D Concept Grounding on Neural Fields
To enable communication between points at lower layers, we also add pooling and expansion layers between the ResNet blocks. The encoder is a bidirectional LSTM [1]. The decoder is a similar LSTM that generates a vector from the previous token of the output sequence. In general, the whole training process is split into three stages.
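To illustrate how pooling and expansion layers let per-point features communicate, the following is a minimal sketch in plain Python. It assumes max pooling over points and concatenation of the pooled vector back onto each point; the text does not specify either choice, and the function names `pool_points`, `expand_and_concat`, and `pool_expand_layer` are hypothetical.

```python
from typing import List

# One feature vector per point; all vectors are assumed to have equal length.
Features = List[List[float]]


def pool_points(features: Features) -> List[float]:
    """Max-pool each feature channel across all points (assumed operator)."""
    if not features:
        raise ValueError("need at least one point")
    # zip(*features) groups the values of each channel across points.
    return [max(channel) for channel in zip(*features)]


def expand_and_concat(features: Features, pooled: List[float]) -> Features:
    """Copy the pooled vector to every point and append it to that point."""
    return [point + pooled for point in features]


def pool_expand_layer(features: Features) -> Features:
    """Pooling followed by expansion, so each point sees a global summary."""
    return expand_and_concat(features, pool_points(features))


if __name__ == "__main__":
    points = [[1.0, 5.0], [3.0, 2.0]]
    print(pool_expand_layer(points))
```

After this step each point carries its own features plus a summary of all other points, which a subsequent block can mix; the output width doubles under the concatenation assumed here.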