Not enough data to create a plot.
Try a different view from the menu above.
Wang, Nan
Few-shot Message-Enhanced Contrastive Learning for Graph Anomaly Detection
Xu, Fan, Wang, Nan, Wen, Xuezhi, Gao, Meiqi, Guo, Chaoqun, Zhao, Xibin
Graph anomaly detection plays a crucial role in identifying exceptional instances in graph data that deviate significantly from the majority. It has gained substantial attention in various domains of information security, including network intrusion, financial fraud, and malicious comments, et al. Existing methods are primarily developed in an unsupervised manner due to the challenge in obtaining labeled data. For lack of guidance from prior knowledge in unsupervised manner, the identified anomalies may prove to be data noise or individual data instances. In real-world scenarios, a limited batch of labeled anomalies can be captured, making it crucial to investigate the few-shot problem in graph anomaly detection. Taking advantage of this potential, we propose a novel few-shot Graph Anomaly Detection model called FMGAD (Few-shot Message-Enhanced Contrastive-based Graph Anomaly Detector). FMGAD leverages a self-supervised contrastive learning strategy within and across views to capture intrinsic and transferable structural representations. Furthermore, we propose the Deep-GNN message-enhanced reconstruction module, which extensively exploits the few-shot label information and enables long-range propagation to disseminate supervision signals to deeper unlabeled nodes. This module in turn assists in the training of self-supervised contrastive learning. Comprehensive experimental results on six real-world datasets demonstrate that FMGAD can achieve better performance than other state-of-the-art methods, regardless of artificially injected anomalies or domain-organic anomalies.
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation
Wang, Nan, Wang, Qifan, Wang, Yi-Chia, Sanjabi, Maziar, Liu, Jingzhou, Firooz, Hamed, Wang, Hongning, Nie, Shaoliang
As language models become increasingly integrated into our digital lives, Personalized Text Generation (PTG) has emerged as a pivotal component with a wide range of applications. However, the bias inherent in user written text, often used for PTG model training, can inadvertently associate different levels of linguistic quality with users' protected attributes. The model can inherit the bias and perpetuate inequality in generating text w.r.t. users' protected attributes, leading to unfair treatment when serving users. In this work, we investigate fairness of PTG in the context of personalized explanation generation for recommendations. We first discuss the biases in generated explanations and their fairness implications. To promote fairness, we introduce a general framework to achieve measure-specific counterfactual fairness in explanation generation. Extensive experiments and human evaluations demonstrate the effectiveness of our method.
BrainNPT: Pre-training of Transformer networks for brain network classification
Hu, Jinlong, Huang, Yangmin, Wang, Nan, Dong, Shoubin
Deep learning methods have advanced quickly in brain imaging analysis over the past few years, but they are usually restricted by the limited labeled data. Pre-trained model on unlabeled data has presented promising improvement in feature learning in many domains, including natural language processing and computer vision. However, this technique is under-explored in brain network analysis. In this paper, we focused on pre-training methods with Transformer networks to leverage existing unlabeled data for brain functional network classification. First, we proposed a Transformer-based neural network, named as BrainNPT, for brain functional network classification. The proposed method leveraged
Prompt Space Optimizing Few-shot Reasoning Success with Large Language Models
Shi, Fobo, Qing, Peijun, Yang, Dong, Wang, Nan, Lei, Youbo, Lu, Haonan, Lin, Xiaodong
Prompt engineering is an essential technique for enhancing the abilities of large language models (LLMs) by providing explicit and specific instructions. It enables LLMs to excel in various tasks, such as arithmetic reasoning, question answering, summarization, relation extraction, machine translation, and sentiment analysis. Researchers have been actively exploring different prompt engineering strategies, such as Chain of Thought (CoT), Zero-CoT, and In-context learning. However, an unresolved problem arises from the fact that current approaches lack a solid theoretical foundation for determining optimal prompts. To address this issue in prompt engineering, we propose a new and effective approach called Prompt Space. Our methodology utilizes text embeddings to obtain basis vectors by matrix decomposition, and then constructs a space for representing all prompts. Prompt Space significantly outperforms state-of-the-art prompt paradigms on ten public reasoning benchmarks. Notably, without the help of the CoT method and the prompt "Let's think step by step", Prompt Space shows superior performance over the few-shot method. Overall, our approach provides a robust and fundamental theoretical framework for selecting simple and effective prompts. This advancement marks a significant step towards improving prompt engineering for a wide variety of applications in LLMs.
Exploring Global and Local Information for Anomaly Detection with Normal Samples
Xu, Fan, Wang, Nan, Zhao, Xibin
Anomaly detection aims to detect data that do not conform to regular patterns, and such data is also called outliers. The anomalies to be detected are often tiny in proportion, containing crucial information, and are suitable for application scenes like intrusion detection, fraud detection, fault diagnosis, e-commerce platforms, et al. However, in many realistic scenarios, only the samples following normal behavior are observed, while we can hardly obtain any anomaly information. To address such problem, we propose an anomaly detection method GALDetector which is combined of global and local information based on observed normal samples. The proposed method can be divided into a three-stage method. Firstly, the global similar normal scores and the local sparsity scores of unlabeled samples are computed separately. Secondly, potential anomaly samples are separated from the unlabeled samples corresponding to these two scores and corresponding weights are assigned to the selected samples. Finally, a weighted anomaly detector is trained by loads of samples, then the detector is utilized to identify else anomalies. To evaluate the effectiveness of the proposed method, we conducted experiments on three categories of real-world datasets from diverse domains, and experimental results show that our method achieves better performance when compared with other state-of-the-art methods.
HySST: A Stable Sparse Rapidly-Exploring Random Trees Optimal Motion Planning Algorithm for Hybrid Dynamical Systems
Wang, Nan, Sanfelice, Ricardo G.
This paper proposes a stable sparse rapidly-exploring random trees (SST) algorithm to solve the optimal motion planning problem for hybrid systems. At each iteration, the proposed algorithm, called HySST, selects a vertex with the lowest cost among all the vertices within the neighborhood of a randomly selected sample and then extends the search tree by flow or jump, which is also chosen randomly when both regimes are possible. In addition, HySST maintains a static set of witness points such that all the vertices within the neighborhood of each witness are pruned except the vertex with the lowest cost. Through a definition of concatenation of functions defined on hybrid time domains, we show that HySST is asymptotically near optimal, namely, the probability of failing to find a motion plan such that its cost is close to the optimal cost approaches zero as the number of iterations of the algorithm increases to infinity. This property is guaranteed under mild conditions on the data defining the motion plan, which include a relaxation of the usual positive clearance assumption imposed in the literature of classical systems. The proposed algorithm is applied to an actuated bouncing ball system and a collision-resilient tensegrity multicopter system so as to highlight its generality and computational features.
TBDetector:Transformer-Based Detector for Advanced Persistent Threats with Provenance Graph
Wang, Nan, Wen, Xuezhi, Zhang, Dalin, Zhao, Xibin, Ma, Jiahui, Luo, Mengxia, Nie, Sen, Wu, Shi, Liu, Jiqiang
APT detection is difficult to detect due to the long-term latency, covert and slow multistage attack patterns of Advanced Persistent Threat (APT). To tackle these issues, we propose TBDetector, a transformer-based advanced persistent threat detection method for APT attack detection. Considering that provenance graphs provide rich historical information and have the powerful attacks historic correlation ability to identify anomalous activities, TBDetector employs provenance analysis for APT detection, which summarizes long-running system execution with space efficiency and utilizes transformer with self-attention based encoder-decoder to extract long-term contextual features of system states to detect slow-acting attacks. Furthermore, we further introduce anomaly scores to investigate the anomaly of different system states, where each state is calculated with an anomaly score corresponding to its similarity score and isolation score. To evaluate the effectiveness of the proposed method, we have conducted experiments on five public datasets, i.e., streamspot, cadets, shellshock, clearscope, and wget_baseline. Experimental results and comparisons with state-of-the-art methods have exhibited better performance of our proposed method.
A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP Algorithms on Electronic Health Records
Zhou, Sicheng, Wang, Nan, Wang, Liwei, Sun, Ju, Blaes, Anne, Liu, Hongfang, Zhang, Rui
Objective: The generalizability of clinical large language models is usually ignored during the model development process. This study evaluated the generalizability of BERT-based clinical NLP models across different clinical settings through a breast cancer phenotype extraction task. Materials and Methods: Two clinical corpora of breast cancer patients were collected from the electronic health records from the University of Minnesota and the Mayo Clinic, and annotated following the same guideline. We developed three types of NLP models (i.e., conditional random field, bi-directional long short-term memory and CancerBERT) to extract cancer phenotypes from clinical texts. The models were evaluated for their generalizability on different test sets with different learning strategies (model transfer vs. locally trained). The entity coverage score was assessed with their association with the model performances. Results: We manually annotated 200 and 161 clinical documents at UMN and MC, respectively. The corpora of the two institutes were found to have higher similarity between the target entities than the overall corpora. The CancerBERT models obtained the best performances among the independent test sets from two clinical institutes and the permutation test set. The CancerBERT model developed in one institute and further fine-tuned in another institute achieved reasonable performance compared to the model developed on local data (micro-F1: 0.925 vs 0.932). Conclusions: The results indicate the CancerBERT model has the best learning ability and generalizability among the three types of clinical NLP models. The generalizability of the models was found to be correlated with the similarity of the target entities between the corpora.
Clustered Data Sharing for Non-IID Federated Learning over Wireless Networks
Hu, Gang, Teng, Yinglei, Wang, Nan, Yu, F. Richard
Federated Learning (FL) is a novel distributed machine learning approach to leverage data from Internet of Things (IoT) devices while maintaining data privacy. However, the current FL algorithms face the challenges of non-independent and identically distributed (non-IID) data, which causes high communication costs and model accuracy declines. To address the statistical imbalances in FL, we propose a clustered data sharing framework which spares the partial data from cluster heads to credible associates through device-to-device (D2D) communication. Moreover, aiming at diluting the data skew on nodes, we formulate the joint clustering and data sharing problem based on the privacy-preserving constrained graph. To tackle the serious coupling of decisions on the graph, we devise a distribution-based adaptive clustering algorithm (DACA) basing on three deductive cluster-forming conditions, which ensures the maximum yield of data sharing. The experiments show that the proposed framework facilitates FL on non-IID datasets with better convergence and model accuracy under a limited communication environment.
GATE: Graph CCA for Temporal SElf-supervised Learning for Label-efficient fMRI Analysis
Peng, Liang, Wang, Nan, Xu, Jie, Zhu, Xiaofeng, Li, Xiaoxiao
In this work, we focus on the challenging task, neuro-disease classification, using functional magnetic resonance imaging (fMRI). In population graph-based disease analysis, graph convolutional neural networks (GCNs) have achieved remarkable success. However, these achievements are inseparable from abundant labeled data and sensitive to spurious signals. To improve fMRI representation learning and classification under a label-efficient setting, we propose a novel and theory-driven self-supervised learning (SSL) framework on GCNs, namely Graph CCA for Temporal self-supervised learning on fMRI analysis GATE. Concretely, it is demanding to design a suitable and effective SSL strategy to extract formation and robust features for fMRI. To this end, we investigate several new graph augmentation strategies from fMRI dynamic functional connectives (FC) for SSL training. Further, we leverage canonical-correlation analysis (CCA) on different temporal embeddings and present the theoretical implications. Consequently, this yields a novel two-step GCN learning procedure comprised of (i) SSL on an unlabeled fMRI population graph and (ii) fine-tuning on a small labeled fMRI dataset for a classification task. Our method is tested on two independent fMRI datasets, demonstrating superior performance on autism and dementia diagnosis.