Visual Concepts Tokenization Appendix
Neural Information Processing Systems
This is quite similar to what VCT learns on the synthetic Objects-Room dataset. Because the real-world dataset is more diverse, we observe several failure cases, shown in Figure 8. We suspect these failures arise because VCT, trained with a reconstruction loss, is not good at synthesizing counterfactual samples that lie far from the data distribution.
Feb-11-2026, 23:12:38 GMT