Do self-supervised speech and language models extract similar representations as human brain?
