Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning
