Xu, Hongyi
Large Language Models for Extrapolative Modeling of Manufacturing Processes
Khanghah, Kiarash Naghavi, Patel, Anandkumar, Malhotra, Rajiv, Xu, Hongyi
Conventional predictive modeling of parametric relationships in manufacturing processes is limited by the subjectivity of human expertise and intuition on the one hand and by the cost and time of experimental data generation on the other hand. This work addresses this issue by establishing a new Large Language Model (LLM) framework. The novelty lies in combining automatic extraction of process-relevant knowledge embedded in the literature with iterative model refinement based on a small amount of experimental data. This approach is evaluated on three distinct manufacturing processes that are based on machining, deformation, and additive principles. The results show that for the same small experimental data budget the models derived by our framework have unexpectedly high extrapolative performance, often surpassing the capabilities of conventional Machine Learning. Further, our approach eliminates manual generation of initial models or expertise-dependent interpretation of the literature. The results also reveal the importance of the nature of the knowledge extracted from the literature and the significance of both the knowledge extraction and model refinement components.
Hypergraph-based multi-scale spatio-temporal graph convolution network for Time-Series anomaly detection
Xu, Hongyi
Multivariate time series anomaly detection technology plays an important role in many fields including aerospace, water treatment, cloud service providers, etc. Excellent anomaly detection models can greatly improve work efficiency and avoid major economic losses. However, with the development of technology, the increasing size and complexity of data, and the lack of labels for relevant abnormal data, it is becoming increasingly challenging to perform effective and accurate anomaly detection in high-dimensional and complex data sets. In this paper, we propose a hypergraph based spatiotemporal graph convolutional neural network model STGCN_Hyper, which explicitly captures high-order, multi-hop correlations between multiple variables through a hypergraph based dynamic graph structure learning module. On this basis, we further use the hypergraph based spatiotemporal graph convolutional network to utilize the learned hypergraph structure to effectively propagate and aggregate one-hop and multi-hop related node information in the convolutional network, thereby obtaining rich spatial information. Furthermore, through the multi-scale TCN dilated convolution module, the STGCN_hyper model can also capture the dependencies of features at different scales in the temporal dimension. An unsupervised anomaly detector based on PCA and GMM is also integrated into the STGCN_hyper model. Through the anomaly score of the detector, the model can detect the anomalies in an unsupervised way. Experimental results on multiple time series datasets show that our model can flexibly learn the multi-scale time series features in the data and the dependencies between features, and outperforms most existing baseline models in terms of precision, recall, F1-score on anomaly detection tasks. Our code is available on: https://git.ecdf.ed.ac.uk/msc-23-24/s2044819
X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention
Xie, You, Xu, Hongyi, Song, Guoxian, Wang, Chao, Shi, Yichun, Luo, Linjie
We propose X-Portrait, an innovative conditional diffusion model tailored for generating expressive and temporally coherent portrait animation. Specifically, given a single portrait as appearance reference, we aim to animate it with motion derived from a driving video, capturing both highly dynamic and subtle facial expressions along with wide-range head movements. As its core, we leverage the generative prior of a pre-trained diffusion model as the rendering backbone, while achieve fine-grained head pose and expression control with novel controlling signals within the framework of ControlNet. In contrast to conventional coarse explicit controls such as facial landmarks, our motion control module is learned to interpret the dynamics directly from the original driving RGB inputs. The motion accuracy is further enhanced with a patch-based local control module that effectively enhance the motion attention to small-scale nuances like eyeball positions. Notably, to mitigate the identity leakage from the driving signals, we train our motion control modules with scaling-augmented cross-identity images, ensuring maximized disentanglement from the appearance reference modules. Experimental results demonstrate the universal effectiveness of X-Portrait across a diverse range of facial portraits and expressive driving sequences, and showcase its proficiency in generating captivating portrait animations with consistently maintained identity characteristics.
An interpretable deep learning method for bearing fault diagnosis
Lu, Hao, Bray, Austin M., Hu, Chao, Zimmerman, Andrew T., Xu, Hongyi
Deep learning (DL) has gained popularity in recent years as an effective tool for classifying the current health and predicting the future of industrial equipment. However, most DL models have black-box components with an underlying structure that is too complex to be interpreted and explained to human users. This presents significant challenges when deploying these models for safety-critical maintenance tasks, where non-technical personnel often need to have complete trust in the recommendations these models give. To address these challenges, we utilize a convolutional neural network (CNN) with Gradient-weighted Class Activation Mapping (Grad-CAM) activation map visualizations to form an interpretable DL method for classifying bearing faults. After the model training process, we apply Grad-CAM to identify a training sample's feature importance and to form a library of diagnosis knowledge (or health library) containing training samples with annotated feature maps. During the model evaluation process, the proposed approach retrieves prediction basis samples from the health library according to the similarity of the feature importance. The proposed method can be easily applied to any CNN model without modifying the model architecture, and our experimental results show that this method can select prediction basis samples that are intuitively and physically meaningful, improving the model's trustworthiness for human users.