Appendix information on the relationship between our training approach and domain adaptation

Apr-24-2026, 21:30:31 GMT–Neural Information Processing Systems

Here we note that our problem definition of pre-training is fundamentally different from domain adaptation (DA) [S1, S2, S3, S4, S5, S6], in order to prevent any confusion between this work and domain adaptation methods. DA applies a model trained on a pre-training dataset (i.e., source dataset) to a different target dataset [21, 42]. Self-supervised pre-training differs from domain adaptation in four key respects. One concerns the feature space: domain adaptation methods usually restrict the pre-training and target datasets to have the same feature space (but possibly different distributions), e.g., [S22, S18, S19, S20, S13]. In summary, to support transfer learning across different time series datasets, a pre-training approach needs the capability to capture a generalizable property of time series, one that is shared across different time series datasets regardless of the specific semantic meaning of a time series signal (e.g., ECG, EMG, acceleration, vibration), the conditions of data acquisition (e.g., variation across subjects and devices), the sampling frequency, etc. This work develops a self-supervised contrastive pre-training strategy that fulfills these requirements by injecting an appropriate inductive bias, called Time-Frequency Consistency (TF-C), into the model (Sec. …).

Further, we clarify that the term 'self-supervised' has different meanings in DA and in pre-training [S23, S24, S25, S26]. 'Self-supervised domain adaptation' [S27, S16, S21, S15] or 'unsupervised domain adaptation' [S1, S22, S28, S11, S14] means that there are no labels in the target dataset; however, labels are still required in the pre-training dataset. In contrast, 'self-supervised pre-training' [S29, S30, S31] (i.e., the problem studied here, in line with a breadth of existing literature on pre-training) indicates the setting where no labels are available in pre-training.

As of the submission of this manuscript, there are no existing contrastive augmentations in the frequency domain of time series. Two models, CoST [49] and BTSF [50], involve the frequency domain in contrastive learning; however, the proposed TF-C is fundamentally different from them in the following aspects. We take BTSF as an example; the differences also apply to CoST. The problem definitions of the two works differ. Our method is designed to produce generalizable representations that can transfer to a different time series dataset (going from a pre-training dataset to a fine-tuning dataset) for the purpose of transfer learning.
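
To make the notion of a frequency-domain augmentation concrete, the following is a minimal sketch of one possible form: perturbing the spectrum of a signal by removing and boosting a few frequency components, then mapping back to the time domain. It is an illustration under stated assumptions, not the procedure used in this work; the function name `frequency_augment`, the parameters `remove_ratio` and `add_ratio`, and the boost amplitude are all hypothetical choices.

```python
import numpy as np


def frequency_augment(x, remove_ratio=0.1, add_ratio=0.1, rng=None):
    """Hypothetical frequency-domain augmentation of a 1-D real-valued signal.

    All parameter names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)

    # One-sided spectrum of the real signal.
    spectrum = np.fft.rfft(x)
    magnitude = np.abs(spectrum)

    # Remove: zero out a random subset of frequency components.
    keep = rng.random(spectrum.shape) >= remove_ratio
    perturbed = spectrum * keep

    # Add: boost a random subset of components by a small fixed amplitude
    # (assumed here to be 10% of the largest spectral magnitude).
    boost = rng.random(spectrum.shape) < add_ratio
    perturbed = perturbed + boost * 0.1 * magnitude.max()

    # Back to the time domain, with the same length as the input.
    return np.fft.irfft(perturbed, n=len(x))


# Usage: augment a noiseless sine wave with a fixed seed.
t = np.linspace(0.0, 1.0, 128, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t)
augmented = frequency_augment(signal, rng=np.random.default_rng(0))
```

In a contrastive setup, a signal and its augmented copy would typically be treated as a positive pair. Small ratios keep the perturbation mild, so that the augmented view stays close to the original spectrum.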



