CL-HOI: Cross-Level Human-Object Interaction Distillation from Vision Large Language Models
