Collaborating Authors

 Chattopadhyay, Rita


A Two-Stage Weighting Framework for Multi-Source Domain Adaptation

Neural Information Processing Systems

Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often we have very few or no labeled data from the test or target distribution, but may have plenty of labeled data from multiple related sources with different distributions. The difference in distributions may be in both marginal and conditional probabilities. Most existing domain adaptation work focuses on the marginal probability distribution difference between the domains, assuming that the conditional probabilities are similar. However, in many real-world applications, conditional probability distribution differences are as commonplace as marginal probability differences. In this paper we propose a two-stage domain adaptation methodology which combines weighted data from multiple sources, based on marginal probability differences (first stage) as well as conditional probability differences (second stage), with the target domain data. The weights for minimizing the marginal probability differences are estimated independently, while the weights for minimizing conditional probability differences are computed simultaneously by exploiting the potential interaction among multiple sources. We also provide a theoretical analysis of the generalization performance of the proposed multi-source domain adaptation formulation using the weighted Rademacher complexity measure. Empirical comparisons with existing state-of-the-art domain adaptation methods using three real-world datasets demonstrate the effectiveness of the proposed approach.
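The idea behind the first stage, reweighting source instances so that their marginal distribution looks more like the target's, can be illustrated with a small sketch. This is not the estimator used in the paper, which the abstract does not specify. It assumes one-dimensional features and substitutes a plain Gaussian kernel density ratio p_target(x) / p_source(x) as the weight. The function names, the bandwidth, and the toy data are all hypothetical.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a function x -> Gaussian kernel density estimate from 1-D points."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density


def marginal_weights(source_x, target_x, bandwidth=0.5):
    """Weight each source point by an estimated p_target(x) / p_source(x).

    Assumption: a density-ratio stand-in for the paper's first-stage weights.
    Weights are rescaled so that their mean is 1.
    """
    p_src = gaussian_kde(source_x, bandwidth)
    p_tgt = gaussian_kde(target_x, bandwidth)
    # p_src(x) > 0 for every source point, because the point contributes to its own estimate.
    raw = [p_tgt(x) / p_src(x) for x in source_x]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


def weighted_mean(values, weights):
    """Mean of values under the given non-negative weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: the source covers [-1, 2], the target sits further right.
source_x = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
target_x = [1.0, 1.5, 2.0, 2.5]

weights = marginal_weights(source_x, target_x)
source_mean = sum(source_x) / len(source_x)
target_mean = sum(target_x) / len(target_x)
reweighted_mean = weighted_mean(source_x, weights)
```

Source points that fall where the target data is dense receive larger weights, so the reweighted source mean moves toward the target mean. In a multi-source setting this step would be run once per source, matching the abstract's statement that the marginal weights are estimated independently. The second-stage weights, which are computed jointly across sources, are not covered by this sketch.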


Topology Preserving Domain Adaptation for Addressing Subject Based Variability in SEMG Signal

AAAI Conferences

A subject independent computational framework is one that does not need to be calibrated with a specific subject's data before it can be used on that subject. The greatest challenge in developing such a framework is the variation in parameters across subjects, which is termed subject based variability. Spectral and amplitude variations in surface myoelectric signals (SEMG) are analyzed to determine the fatigue state of a muscle. But variations in the spectrum and magnitude of myoelectric signals across subjects cause variations in both the marginal and conditional probability distributions of the features extracted across subjects, making it difficult to model the signal for any automated signal classification. However, we observe that the manifold of the multidimensional SEMG data has an inherent similarity as the physiological state moves from the no-fatigue state to the fatigue state. In this paper we exploit this specific feature of the SEMG data and propose a domain adaptation technique that is based on the intrinsic manifold of the data preserved in a low dimensional space, thus reducing the marginal probability differences between the subjects, followed by an instance selection methodology based on similar conditional probabilities in the mapped domain. The proposed method provides significant improvement in subject independent accuracies compared to cases without any domain adaptation methods and also compared to other state-of-the-art domain adaptation methodologies.


Transfer Learning Framework for Early Detection of Fatigue Using Non-invasive Surface Electromyogram Signals (SEMG)

AAAI Conferences

Surface Electromyogram (SEMG) signals are physiological signals processed to assess the intensity of activity and the fatigue state of the muscles, non-invasively (Kumar, Pah, and Bradley 2003; Georgakis, Stergioulas, and Giakas 2003; Koumantakis et al. 2001; Gerdle, Larsson, and Karlsson 2000). However researchers observed significant difference between the data collected from different subjects though they performed the same activity under similar experimental conditions (Contessa, Adam, and Luca 2009; Gerdle, Larsson, and Karlsson 2000). Because of their highly subject specific nature the SEMG based fatigue assessment requires subject specific calibration and are hence confined to clinical environments related to training and rehabilitation.

The fundamental assumption being, any hypothesis found to approximate well over a sufficiently large set of training examples will also approximate well over other unobserved examples (Mitchell 1997), belonging to the same distribution as the training data. But if this basic assumption is violated as in the case of SEMG data over multiple subjects, direct application of traditional data mining and machine learning methods would not work. Figure 1 shows a typical distribution of SEMG data for three different subjects, collected over a fatiguing exercise at varying speed representing the four physiological phases corresponding to four classes (1) low intensity of activity and low fatigue, (2) high intensity of activity and moderate fatigue, (3) low intensity of activity and moderate fatigue and (4) high intensity of activity and high fatigue.