Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

Neural Information Processing Systems 

Specifically, we design language prompts to describe all cases of event appearance for each video.
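
As a rough illustration of what "all cases of event appearance" could mean, the sketch below enumerates every combination of a video's labeled events and writes one prompt per combination. The function name `build_event_prompts`, the prompt wording, and the inclusion of a "no event" case are assumptions made for illustration; the excerpt above does not specify the paper's actual prompt design.

```python
from itertools import combinations


def build_event_prompts(video_events):
    """Return one prompt per possible combination of the video-level events.

    Hypothetical helper: the prompt template is an assumption,
    not the wording used in the paper.
    """
    # Deduplicate and sort so the output order is deterministic.
    events = sorted(set(video_events))

    # Case where none of the video-level events occurs in a segment.
    prompts = ["This segment contains no labeled event."]

    # Every non-empty subset of the video-level events.
    for size in range(1, len(events) + 1):
        for combo in combinations(events, size):
            prompts.append("This segment contains " + " and ".join(combo) + ".")
    return prompts


if __name__ == "__main__":
    for prompt in build_event_prompts(["speech", "dog"]):
        print(prompt)
```

For a video with n distinct weak labels this yields 2^n prompts, so such an enumeration is only practical when each video carries a small number of event labels.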
