To address this need, we contribute Holistic Evaluation of Multimodal Models (HEMM), visualized in Figure 1. As an evaluation framework, HEMM goes beyond conventional lists of datasets to emphasize holistic benchmarking at three levels.
These challenges are partly due to a lack of structure or inductive bias in the neural networks typically used to learn the policy. One form of structure commonly observed in multi-agent scenarios is symmetry.
We also introduce a novel benchmark for evaluating NNPs in molecular property prediction, Hamiltonian prediction, and conformational optimization tasks.
To address these gaps, we propose Place3D, a full-cycle pipeline that encompasses LiDAR placement optimization, data generation, and downstream evaluation.