
Collaborating Authors

 Feng, Cheng


Latent Diffusion Model-Enabled Real-Time Semantic Communication Considering Semantic Ambiguities and Channel Noises

arXiv.org Artificial Intelligence

Semantic communication (SemCom) has emerged as a new paradigm for 6G communication, with deep learning (DL) models being one of the key drivers of the shift from the accuracy of bits/symbols to the semantics and pragmatics of data. Nevertheless, DL-based SemCom systems often face performance bottlenecks due to overfitting, poor generalization, and sensitivity to outliers. Furthermore, the time-varying fading gains and noise with uncertain signal-to-noise ratios (SNRs) commonly present in wireless channels usually restrict the accuracy of semantic information transmission. Consequently, this paper constructs a latent diffusion model-enabled SemCom system and proposes three improvements over existing works: i) To handle potential outliers in the source data, semantic errors obtained by projected gradient descent, based on the vulnerabilities of DL models, are utilized to update the parameters and obtain an outlier-robust encoder. ii) A lightweight single-layer latent space transformation adapter completes one-shot learning at the transmitter and is placed before the decoder at the receiver, enabling adaptation to out-of-distribution data and enhancing human-perceptual quality. iii) An end-to-end consistency distillation (EECD) strategy is used to distill the diffusion models trained in latent space, enabling deterministic single- or few-step real-time denoising in various noisy channels while maintaining high semantic quality. Extensive numerical experiments across different datasets demonstrate the superiority of the proposed SemCom system, consistently proving its robustness to outliers, its capability to transmit data with unknown distributions, and its ability to perform real-time channel denoising while preserving high human-perceptual quality, outperforming existing denoising approaches in semantic metrics.
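
As a rough illustration of the first improvement, the sketch below generates a projected-gradient-descent (PGD) perturbation that degrades an encoder's output, which could then be mixed into training to harden the encoder. The `encoder`, `loss_fn`, and step sizes are hypothetical placeholders, not the paper's actual implementation.

```python
# Minimal sketch of PGD-style semantic-error generation for outlier-robust
# encoder training, assuming a PyTorch `encoder` and a loss `loss_fn` that
# compares two encoder outputs. Names and hyperparameters are illustrative.
import torch

def pgd_semantic_error(encoder, loss_fn, x, eps=0.03, alpha=0.01, steps=10):
    """Return a perturbation delta that degrades the encoder output on x."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(encoder(x + delta), encoder(x).detach())
        loss.backward()
        with torch.no_grad():
            # Gradient ascent step, then projection onto the L-inf ball.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return delta.detach()

# Training can then mix clean and perturbed inputs so the encoder learns
# representations that stay stable under such semantic errors.
```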


A cGAN Ensemble-based Uncertainty-aware Surrogate Model for Offline Model-based Optimization in Industrial Control Problems

arXiv.org Artificial Intelligence

This study focuses on two important problems related to applying offline model-based optimization to real-world industrial control problems. The first problem is how to create a reliable probabilistic model that accurately captures the dynamics present in noisy industrial data. The second problem is how to reliably optimize control parameters without actively collecting feedback from industrial systems. Specifically, we introduce a novel cGAN ensemble-based uncertainty-aware surrogate model for reliable offline model-based optimization in industrial control problems. The effectiveness of the proposed method is demonstrated through extensive experiments conducted on two representative cases, namely a discrete control case and a continuous control case. The results of these experiments show that our method outperforms several competitive baselines in the field of offline model-based optimization for industrial control.
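
The following sketch illustrates how an ensemble of conditional generators might be turned into an uncertainty-aware surrogate score for offline optimization, with ensemble disagreement penalizing candidate control parameters. The `generators` interface and the penalty form are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch: score a candidate control setting with a cGAN ensemble by
# rewarding the mean predicted objective and penalizing ensemble disagreement.
# `g.sample` and `g.noise_dim` are hypothetical attributes of trained generators.
import numpy as np

def surrogate_score(generators, control_params, n_samples=64, beta=1.0):
    """Mean predicted objective minus an ensemble-disagreement penalty."""
    member_means = []
    for g in generators:
        noise = np.random.randn(n_samples, g.noise_dim)
        outcomes = g.sample(control_params, noise)   # predicted objective values
        member_means.append(outcomes.mean())
    member_means = np.array(member_means)
    # High std across members signals epistemic uncertainty about this candidate.
    return member_means.mean() - beta * member_means.std()
```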


Only the Curve Shape Matters: Training Foundation Models for Zero-Shot Multivariate Time Series Forecasting through Next Curve Shape Prediction

arXiv.org Artificial Intelligence

We present General Time Transformer (GTT), an encoder-only style foundation model for zero-shot multivariate time series forecasting. GTT is pretrained on a large dataset of 200M high-quality time series samples spanning diverse domains. In our proposed framework, the task of multivariate time series forecasting is formulated as a channel-wise next curve shape prediction problem, where each time series sample is represented as a sequence of non-overlapping curve shapes with a unified numerical magnitude. GTT is trained to predict the next curve shape based on a window of past curve shapes in a channel-wise manner. Experimental results demonstrate that GTT exhibits superior zero-shot multivariate forecasting capabilities on unseen time series datasets, even surpassing state-of-the-art supervised baselines. Additionally, we investigate the impact of varying GTT model parameters and training dataset scales, observing that the scaling law also holds in the context of zero-shot multivariate time series forecasting.
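
A minimal sketch of the channel-wise curve-shape formulation is given below: each channel is cut into non-overlapping patches that are rescaled to a unified magnitude so that only the shape remains. The per-window z-scoring used here is an assumed normalization, not necessarily the one used by GTT.

```python
# Illustrative preprocessing for "next curve shape prediction": split a channel
# into non-overlapping patches and normalize each patch to a unified magnitude.
import numpy as np

def to_curve_shapes(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Split a 1-D series into non-overlapping, magnitude-normalized patches."""
    n_patches = len(series) // patch_len
    patches = series[: n_patches * patch_len].reshape(n_patches, patch_len)
    mean = patches.mean(axis=1, keepdims=True)
    std = patches.std(axis=1, keepdims=True) + 1e-8
    return (patches - mean) / std  # shape: (n_patches, patch_len)

# The model is then trained, channel by channel, to predict the next
# normalized patch from a window of preceding patches.
```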


PARs: Predicate-based Association Rules for Efficient and Accurate Model-Agnostic Anomaly Explanation

arXiv.org Artificial Intelligence

Anomaly detection, which aims to identify data instances that do not conform to the expected behavior, is a classic machine learning task with numerous applications in various domains including fraud detection, intrusion detection, predictive maintenance, etc. Over the past decades, numerous methods have been proposed to tackle this challenging problem. Examples include one-class classification-based (Manevitz and Yousef 2001; Ruff et al. 2018), nearest neighbor-based (Breunig et al. 2000), clustering-based (Jiang and An 2008), isolation-based (Liu, Ting, and Zhou 2012; Hariri, Kind, and Brunner 2019), density-based (Liu, Tan, and Zhou 2022; Feng and Tian 2021) and deep anomaly detection models based on autoencoders (Zhou and Paffenroth 2017; Zong et al. 2018) and generative adversarial networks (Zenati et al. 2018; Han, Chen, and Liu 2021), to name a few.

Our user study shows that the anomaly explanation form of PARs is better understood and favoured by regular anomaly detection system users compared with existing model-agnostic anomaly explanation options. In our experiments, we demonstrate that it is significantly more efficient to find PARs than anchors (Ribeiro, Singh, and Guestrin 2018), another rule-based explanation, for identified anomaly instances. Moreover, PARs are also far more accurate than anchors for anomaly explanation, meaning that they have considerably higher precision and recall when applied as anomaly detection rules on unseen data other than the anomaly instance on which they were originally derived for explanation. Additionally, we show that PARs can also achieve higher accuracy on abnormal feature identification compared with many state-of-the-art model-agnostic explanation methods including LIME (Ribeiro, Singh, and Guestrin 2016), SHAP (Lundberg and Lee 2017), and COIN.
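
To make the rule-based explanation form concrete, the sketch below models a predicate-based rule as a conjunction of per-feature interval predicates and evaluates its precision and recall when applied as an anomaly detection rule on unseen data. The rule representation is illustrative, not the paper's exact definition of PARs.

```python
# Toy evaluation of a predicate-based rule as an anomaly detection rule.
# A rule is a conjunction of (feature_index, low, high) interval predicates;
# this representation is an assumption for illustration.
import numpy as np

def rule_matches(X: np.ndarray, predicates) -> np.ndarray:
    """Return a boolean mask of rows satisfying every predicate in the rule."""
    mask = np.ones(len(X), dtype=bool)
    for j, low, high in predicates:
        mask &= (X[:, j] >= low) & (X[:, j] < high)
    return mask

def rule_precision_recall(X, y_anomaly, predicates):
    """Precision/recall when rule matches are flagged as anomalies."""
    flagged = rule_matches(X, predicates)
    tp = np.sum(flagged & y_anomaly)
    precision = tp / max(flagged.sum(), 1)
    recall = tp / max(y_anomaly.sum(), 1)
    return precision, recall
```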


Learning Invariant Rules from Data for Interpretable Anomaly Detection

arXiv.org Artificial Intelligence

In the research area of anomaly detection, novel and promising methods are frequently developed. However, most existing studies focus exclusively on the detection task and ignore the interpretability of the underlying models as well as of their detection results. Nevertheless, anomaly interpretation, which aims to explain why specific data instances are identified as anomalies, is an equally important task in many real-world applications. In this work, we propose a novel framework which synergizes several machine learning and data mining techniques to automatically learn invariant rules that are consistently satisfied in a given dataset. The learned invariant rules can provide explicit explanations of anomaly detection results in the inference phase and are thus extremely useful for subsequent decision-making regarding reported anomalies. Furthermore, our empirical evaluation shows that the proposed method can also achieve comparable or even better performance in terms of AUC and partial AUC on public benchmark datasets across various application domains compared with state-of-the-art anomaly detection models.
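
As a hedged sketch of how learned invariant rules could be used at inference time, the snippet below scores an instance by the (weighted) number of rules it violates; the violated rules themselves then serve as the explanation. The callable-predicate representation is an assumption for illustration.

```python
# Illustrative inference-time use of invariant rules for anomaly scoring.
# `rules` is a hypothetical list of callables returning True when the
# invariant holds for the instance x.
import numpy as np

def anomaly_score(x: np.ndarray, rules, weights=None) -> float:
    """Weighted count of violated invariant rules for a single instance."""
    weights = weights if weights is not None else np.ones(len(rules))
    violations = np.array([0.0 if rule(x) else 1.0 for rule in rules])
    return float(np.dot(weights, violations))

# Reporting the violated rules alongside the score is what makes the
# detection result interpretable for subsequent decision-making.
```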


StackVAE-G: An efficient and interpretable model for time series anomaly detection

arXiv.org Artificial Intelligence

Recent studies have shown that autoencoder-based models can achieve superior performance on anomaly detection tasks due to their excellent ability to fit complex data in an unsupervised manner. In this work, we propose a novel autoencoder-based model, named StackVAE-G, that brings significant efficiency and interpretability to multivariate time series anomaly detection. Specifically, we exploit the similarities across time series channels through stacking block-wise reconstruction with a weight-sharing scheme, which reduces the size of the learned models and also relieves overfitting to unknown noise in the training data. We also leverage a graph learning module to learn a sparse adjacency matrix that explicitly captures the stable interrelation structure among multiple time series channels, enabling interpretable pattern reconstruction of interrelated channels. Combining these two modules, we introduce the stacking block-wise VAE (variational autoencoder) with GNN (graph neural network) model for multivariate time series anomaly detection. We conduct extensive experiments on three commonly used public datasets, showing that our model achieves comparable (or even better) performance to state-of-the-art models while requiring much less computation and memory. Furthermore, we demonstrate that the adjacency matrix learned by our model accurately captures the interrelations among multiple channels and can provide valuable information for failure diagnosis applications.
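
The snippet below is an illustrative sketch of the graph-learning idea: a sparse adjacency matrix among channels is derived from learnable channel embeddings via top-k cosine similarity. The sparsification rule and hyperparameters are assumptions, not StackVAE-G's exact module.

```python
# Sketch of deriving a sparse channel adjacency matrix from learnable
# channel embeddings (top-k cosine similarity). Illustrative only.
import torch
import torch.nn.functional as F

def learn_sparse_adjacency(channel_emb: torch.Tensor, k: int = 5) -> torch.Tensor:
    """channel_emb: (n_channels, d) embeddings -> (n_channels, n_channels) adjacency."""
    sim = F.cosine_similarity(channel_emb.unsqueeze(1), channel_emb.unsqueeze(0), dim=-1)
    # Keep only the k strongest neighbours per channel to enforce sparsity.
    topk_vals, topk_idx = sim.topk(k, dim=-1)
    adj = torch.zeros_like(sim).scatter_(-1, topk_idx, F.softmax(topk_vals, dim=-1))
    return adj
```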


Time Series Anomaly Detection for Cyber-physical Systems via Neural System Identification and Bayesian Filtering

arXiv.org Machine Learning

Recent advances in AIoT technologies have led to an increasing popularity of utilizing machine learning algorithms to detect operational failures in cyber-physical systems (CPS). In its basic form, an anomaly detection module monitors the sensor measurements and actuator states from the physical plant and detects anomalies in these measurements to identify abnormal operation status. Nevertheless, building effective anomaly detection models for CPS is rather challenging, as the model has to accurately detect anomalies in the presence of highly complicated system dynamics and an unknown amount of sensor noise. In this work, we propose a novel time series anomaly detection method called Neural System Identification and Bayesian Filtering (NSIBF), in which a specially crafted neural network architecture is posed for system identification, i.e., capturing the dynamics of CPS in a dynamical state-space model; a Bayesian filtering algorithm is then naturally applied on top of the "identified" state-space model for robust anomaly detection by recursively tracking the uncertainty of the hidden state of the system over time. We provide qualitative as well as quantitative experiments with the proposed method on a synthetic and three real-world CPS datasets, showing that NSIBF compares favorably to state-of-the-art methods with considerable improvements on anomaly detection in CPS.
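
A simplified, assumption-laden sketch of the filtering idea follows: hidden-state uncertainty is propagated through learned transition and measurement networks with Monte Carlo samples, and a new measurement is scored by its Mahalanobis distance to the predicted measurement distribution. NSIBF itself applies a Bayesian filter to the identified state-space model; `f`, `g`, and the noise model here are placeholders.

```python
# Monte Carlo approximation of filtering-based anomaly scoring on an
# identified state-space model. `f` (transition) and `g` (measurement) are
# hypothetical callables; the update/resampling step is omitted in this sketch.
import numpy as np

def anomaly_score(f, g, state_particles, u_t, y_t, process_noise=0.01):
    """state_particles: (n_particles, state_dim); u_t: control input; y_t: measurement."""
    # Predict step: push particles through the identified dynamics.
    pred_states = f(state_particles, u_t) + process_noise * np.random.randn(*state_particles.shape)
    pred_meas = g(pred_states)                       # (n_particles, meas_dim)
    mu = pred_meas.mean(axis=0)
    cov = np.cov(pred_meas, rowvar=False) + 1e-6 * np.eye(pred_meas.shape[1])
    resid = y_t - mu
    score = float(resid @ np.linalg.solve(cov, resid))  # squared Mahalanobis distance
    return score, pred_states
```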


Nonlinear Hawkes Processes in Time-Varying System

arXiv.org Machine Learning

Hawkes processes are a class of point processes that can model self- and mutually exciting phenomena. Although the classic Hawkes processes cover a wide range of applications, their expressive ability is limited by three key assumptions: parametric, linear and homogeneous. Recent work has attempted to address these limitations separately. This work aims to overcome all three assumptions simultaneously by proposing the flexible state-switching Hawkes processes: a flexible, nonlinear and nonhomogeneous variant in which a state process is incorporated to interact with the point processes. The proposed model empowers Hawkes processes to be applied to time-varying systems. For inference, we utilize the latent variable augmentation technique to design two efficient Bayesian inference algorithms: a Gibbs sampler and mean-field variational inference, with analytical iterative updates to estimate the posterior. In experiments, our model achieves superior performance compared to state-of-the-art competitors.
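
For intuition, the toy sketch below evaluates a state-modulated nonlinear Hawkes intensity with exponential excitation kernels and a softplus link; both the kernel and the link are illustrative assumptions rather than the paper's specification.

```python
# Toy state-modulated nonlinear Hawkes intensity:
# lambda(t) = softplus( mu * s(t) + sum_{t_k < t} alpha * exp(-beta (t - t_k)) ).
import numpy as np

def intensity(t, event_times, state_fn, mu=0.2, alpha=0.8, beta=1.0):
    """event_times: array of past event times; state_fn: state process s(t)."""
    past = event_times[event_times < t]
    excitation = np.sum(alpha * np.exp(-beta * (t - past)))
    pre_activation = mu * state_fn(t) + excitation
    return np.log1p(np.exp(pre_activation))  # softplus link keeps lambda(t) >= 0
```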


Effective Sample Pair Generation for Ultrasound Video Contrastive Representation Learning

arXiv.org Artificial Intelligence

Most deep neural network (DNN)-based ultrasound (US) medical image analysis models use backbones pretrained on natural images (e.g., ImageNet) for better model generalization. However, the domain gap between natural and medical images causes an inevitable performance bottleneck when these backbones are applied to US image analysis. Our idea is to pretrain DNNs on US images directly to avoid this bottleneck. Due to the lack of annotated large-scale datasets of US images, we first construct a new large-scale US video-based image dataset named US-4, containing over 23,000 high-resolution images from four US video sub-datasets, two of which are newly collected by our local experienced doctors. To make full use of this dataset, we then propose a novel US semi-supervised contrastive learning (USCL) method to effectively learn feature representations of US images, with a new sample pair generation (SPG) scheme to tackle the problem that US images extracted from videos have high similarities. Moreover, USCL treats the contrastive loss as a consistency regularization, which boosts the performance of pretrained backbones by combining it with the supervised loss in a mutually reinforcing way. Extensive fine-tuning experiments on downstream tasks show the superiority of our approach against ImageNet pretraining and pretraining using previous state-of-the-art semi-supervised learning approaches. In particular, our pretrained backbone achieves a fine-tuning accuracy of over 94% on the widely used POCUS dataset, 9% higher than the 85% of the ImageNet-pretrained model. The constructed US-4 dataset and source code of this work will be made public.
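
The sketch below gives a generic flavour of contrastive pair construction from ultrasound videos: a positive pair is drawn from temporally separated frames of the same video, while other videos in the batch act as negatives. This is not the paper's exact SPG scheme; the sampling rule is an assumption.

```python
# Generic positive-pair sampling from video frames for contrastive learning.
# Assumes each video has more than `min_gap` frames and that there are at
# least `batch_size` videos available.
import random

def sample_pair(video_frames, min_gap=10):
    """Return two temporally separated frames from the same video as a positive pair."""
    i = random.randrange(0, len(video_frames) - min_gap)
    j = random.randrange(i + min_gap, len(video_frames))
    return video_frames[i], video_frames[j]

def build_batch(videos, batch_size):
    """Each chosen video contributes one positive pair; other videos act as negatives."""
    chosen = random.sample(videos, batch_size)
    return [sample_pair(v) for v in chosen]
```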


RelSen: An Optimization-based Framework for Simultaneously Sensor Reliability Monitoring and Data Cleaning

arXiv.org Artificial Intelligence

Recent advances in Internet of Things (IoT) technology have led to a surge in the popularity of sensing applications. As a result, people increasingly rely on information obtained from sensors to make decisions in their daily lives. Unfortunately, in most sensing applications, sensors are known to be error-prone and their measurements can become misleading at any unexpected time. Therefore, in order to enhance the reliability of sensing applications, apart from the physical phenomena/processes of interest, we believe it is also highly important to monitor the reliability of sensors and to clean the sensor data before any analysis is conducted on them. Existing studies often regard sensor reliability monitoring and sensor data cleaning as separate problems. In this work, we propose RelSen, a novel optimization-based framework that addresses the two problems simultaneously by utilizing the mutual dependence between them. Furthermore, RelSen is not application-specific, as its implementation assumes only minimal prior knowledge of the process dynamics under monitoring. This significantly improves its generality and applicability in practice. In our experiments, we apply RelSen to an outdoor air pollution monitoring system and a condition monitoring system for a cement rotary kiln. Experimental results show that our framework can promptly identify unreliable sensors and remove sensor measurement errors caused by three of the most commonly observed types of sensor faults.
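
As an illustration of the mutual dependence the framework exploits, the sketch below alternates between estimating the underlying process value as a reliability-weighted average of redundant sensors and updating each sensor's reliability from its deviation. The update rules are assumptions, not RelSen's actual objective.

```python
# Alternating estimation of sensor reliability and a cleaned signal for a group
# of sensors observing the same process. Update rules are illustrative.
import numpy as np

def reliability_and_clean(readings: np.ndarray, n_iter=20, eps=1e-6):
    """readings: (n_sensors, n_timesteps) measurements of the same process."""
    reliability = np.ones(readings.shape[0])
    for _ in range(n_iter):
        w = reliability / reliability.sum()
        estimate = w @ readings                     # cleaned signal estimate
        errors = np.mean((readings - estimate) ** 2, axis=1)
        reliability = 1.0 / (errors + eps)          # larger error -> lower reliability
    return reliability, estimate
```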