Recent works have proposed amortizing the cost by learning generalized wave functions across different structures and compounds instead of solving each problem independently.
Overall, our method can generate high-fidelity, diverse, and multi-view consistent meshes from single-view in-the-wild images within 30 seconds, as shown in Figure 1. We conduct extensive experiments on various in-the-wild 2D images with different styles.
Our findings on the NoRa dataset reveal a prevalent vulnerability to such noise among current LLMs, with existing robust methods such as self-correction and self-consistency showing limited efficacy.