Keep what you need: extracting efficient subnetworks from large audio representation models