Goto

Collaborating Authors

 contaminated sample


Partial identification of kernel based two sample tests with mismeasured data

Nafshi, Ron, Makar, Maggie

arXiv.org Artificial Intelligence

Nonparametric two-sample tests such as the Maximum Mean Discrepancy (MMD) are often used to detect differences between two distributions in machine learning applications. However, the majority of existing literature assumes that error-free samples from the two distributions of interest are available.We relax this assumption and study the estimation of the MMD under $\epsilon$-contamination, where a possibly non-random $\epsilon$ proportion of one distribution is erroneously grouped with the other. We show that under $\epsilon$-contamination, the typical estimate of the MMD is unreliable. Instead, we study partial identification of the MMD, and characterize sharp upper and lower bounds that contain the true, unknown MMD. We propose a method to estimate these bounds, and show that it gives estimates that converge to the sharpest possible bounds on the MMD as sample size increases, with a convergence rate that is faster than alternative approaches. Using three datasets, we empirically validate that our approach is superior to the alternatives: it gives tight bounds with a low false coverage rate.


Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination

Yu, Jongmin, Oh, Hyeontaek, Kim, Minkyung, Kim, Junsik

arXiv.org Artificial Intelligence

In this paper, we propose Normality-Calibrated Autoencoder (NCAE), which can boost anomaly detection performance on the contaminated datasets without any prior information or explicit abnormal samples in the training phase. The NCAE adversarially generates high confident normal samples from a latent space having low entropy and leverages them to predict abnormal samples in a training dataset. NCAE is trained to minimise reconstruction errors in uncontaminated samples and maximise reconstruction errors in contaminated samples. The experimental results demonstrate that our method outperforms shallow, hybrid, and deep methods for unsupervised anomaly detection and achieves comparable performance compared with semi-supervised methods using labelled anomaly samples in the training phase. The source code is publicly available on https://github.com/andreYoo/


Robust MIL-Based Feature Template Learning for Object Tracking

Lan, Xiangyuan (Hong Kong Baptist University) | Yuen, Pong C. (Hong Kong Baptist University) | Chellappa, Rama (University of Maryland, College Park)

AAAI Conferences

Because of appearance variations, training samples of the tracked targets collected by the online tracker are required for updating the tracking model. However, this often leads to tracking drift problem because of potentially corrupted samples: 1) contaminated/outlier samples resulting from large variations (e.g. occlusion, illumination), and 2) misaligned samples caused by tracking inaccuracy. Therefore, in order to reduce the tracking drift while maintaining the adaptability of a visual tracker, how to alleviate these two issues via an effective model learning (updating) strategy is a key problem to be solved. To address these issues, this paper proposes a novel and optimal model learning (updating) scheme which aims to simultaneously eliminate the negative effects from these two issues mentioned above in a unified robust feature template learning framework. Particularly, the proposed feature template learning framework is capable of: 1) adaptively learning uncontaminated feature templates by separating out contaminated samples, and 2) resolving label ambiguities caused by misaligned samples via a probabilistic multiple instance learning (MIL) model. Experiments on challenging video sequences show that the proposed tracker performs favourably against several state-of-the-art trackers.