TriBERT: Human-centric Audio-visual Representation Learning
