Exploring Federated Self-Supervised Learning for General Purpose Audio Understanding

Rehman, Yasar Abbas Ur, Lau, Kin Wai, Xie, Yuyang, Ma, Lan, Shen, Jiajun

Feb-5-2024–arXiv.org Artificial Intelligence

The integration of Federated Learning (FL) and Self-supervised Learning (SSL) offers a unique and synergistic combination for exploiting audio data for general-purpose audio understanding without compromising user data privacy. However, few efforts have been made to investigate SSL models in the FL regime for general-purpose audio understanding, especially when the training data is generated by large-scale heterogeneous audio sources. In this paper, we evaluate the performance of feature-matching and predictive audio-SSL techniques when integrated into large-scale FL settings simulated with non-independent and identically distributed (non-IID) data. We propose a novel Federated SSL (F-SSL) framework, dubbed FASSL, that enables learning intermediate feature representations from large-scale, decentralized, heterogeneous clients holding unlabelled audio data. Our study finds that audio F-SSL approaches perform on par with centralized audio-SSL approaches on the audio-retrieval task. Extensive experiments demonstrate the effectiveness and significance of FASSL, as it assists in obtaining the optimal global model for state-of-the-art FL aggregation methods.
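The abstract refers to FL aggregation methods without naming one. As a point of reference only, the following is a minimal sketch of sample-size-weighted parameter averaging in the style of FedAvg, a widely used aggregation baseline. It is not the FASSL method, and the names `fedavg`, `client_weights`, and `client_sizes` are hypothetical, chosen for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict mapping a parameter
# name to a flat list of floats. Real systems would use tensors.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_weights: Weights = {}
    for name in client_weights[0]:
        averaged = [0.0] * len(client_weights[0][name])
        for weights, size in zip(client_weights, client_sizes):
            # Clients with more local data contribute proportionally more.
            for i, value in enumerate(weights[name]):
                averaged[i] += value * size / total
        global_weights[name] = averaged
    return global_weights


# Two illustrative clients holding 1 and 3 local samples respectively.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_model = fedavg(clients, [1, 3])
print(global_model)
```

In a federated round, a server would typically send the averaged parameters back to the clients, which then continue training on their local unlabelled audio.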



Country:
- Europe > Latvia
  - Lubāna Municipality > Lubāna (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East > Israel
    - Tel Aviv District > Tel Aviv (0.04)

Genre:
- Research Report (0.64)

Industry:
- Information Technology > Security & Privacy (0.88)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)