Goto

Collaborating Authors

 Ma, Guixiang


A structure-aware framework for learning device placements on computation graphs

arXiv.org Artificial Intelligence

Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines.


Leveraging Reinforcement Learning and Large Language Models for Code Optimization

arXiv.org Artificial Intelligence

Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framework to decrease the complexity of code optimization. The proposed framework builds on large language models (LLMs) and reinforcement learning (RL) and enables LLMs to receive feedback from their environment (i.e., unit tests) during the fine-tuning process. We compare our framework with existing state-of-the-art models and show that it is more efficient with respect to speed and computational usage, as a result of the decrement in training steps and its applicability to models with fewer parameters. Additionally, our framework reduces the possibility of logical and syntactical errors. Toward evaluating our approach, we run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm. We adopt a variety of evaluation metrics with regards to optimization quality, and speedup. The evaluation results demonstrate that the proposed framework has similar results in comparison with existing models using shorter training times and smaller pre-trained models. In particular, we accomplish an increase of 5.6% and 2.2 over the baseline models concerning the %OP T and SP metrics.


Memory-Augmented Graph Neural Networks: A Brain-Inspired Review

arXiv.org Artificial Intelligence

We provide a comprehensive review of the existing literature on memory-augmented GNNs. We review these works through the lens of psychology and neuroscience, which has several established theories on how multiple memory systems and mechanisms operate in biological brains. We propose a taxonomy of memory-augmented GNNs and a set of criteria for comparing their memory mechanisms. We also provide critical discussions on the limitations of these works. Finally, we discuss the challenges and future directions for this area.


Provable Robust Saliency-based Explanations

arXiv.org Artificial Intelligence

Robust explanations of machine learning models are critical to establishing human trust in the models. The top-$k$ intersection is widely used to evaluate the robustness of explanations. However, most existing attacking and defense strategies are based on $\ell_p$ norms, thus creating a mismatch between the evaluation and optimization objectives. To this end, we define explanation thickness for measuring top-$k$ salient features ranking stability, and design the \textit{R2ET} algorithm based on a novel tractable surrogate to maximize the thickness and stabilize the top salient features efficiently. Theoretically, we prove a connection between R2ET and adversarial training; using a novel multi-objective optimization formulation and a generalization error bound, we further prove that the surrogate objective can improve both the numerical and statistical stability of the explanations. Experiments with a wide spectrum of network architectures and data modalities demonstrate that R2ET attains higher explanation robustness under stealthy attacks while retaining model accuracy.


Robust Ranking Explanations

arXiv.org Artificial Intelligence

Robust explanations of machine learning models are critical to establish human trust in the models. Due to limited cognition capability, most humans can only interpret the top few salient features. It is critical to make top salient features robust to adversarial attacks, especially those against the more vulnerable gradient-based explanations. Existing defense measures robustness using $\ell_p$-norms, which have weaker protection power. We define explanation thickness for measuring salient features ranking stability, and derive tractable surrogate bounds of the thickness to design the \textit{R2ET} algorithm to efficiently maximize the thickness and anchor top salient features. Theoretically, we prove a connection between R2ET and adversarial training. Experiments with a wide spectrum of network architectures and data modalities, including brain networks, demonstrate that R2ET attains higher explanation robustness under stealthy attacks while retaining accuracy.


Line Graph Contrastive Learning for Link Prediction

arXiv.org Artificial Intelligence

Link prediction tasks focus on predicting possible future connections. Most existing researches measure the likelihood of links by different similarity scores on node pairs and predict links between nodes. However, the similarity-based approaches have some challenges in information loss on nodes and generalization ability on similarity indexes. To address the above issues, we propose a Line Graph Contrastive Learning(LGCL) method to obtain rich information with multiple perspectives. LGCL obtains a subgraph view by h-hop subgraph sampling with target node pairs. After transforming the sampled subgraph into a line graph, the link prediction task is converted into a node classification task, which graph convolution progress can learn edge embeddings from graphs more effectively. Then we design a novel cross-scale contrastive learning framework on the line graph and the subgraph to maximize the mutual information of them, so that fuses the structure and feature information. The experimental results demonstrate that the proposed LGCL outperforms the state-of-the-art methods and has better performance on generalization and robustness.


Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model

arXiv.org Artificial Intelligence

Recently brain networks have been widely adopted to study brain dynamics, brain development and brain diseases. Graph representation learning techniques on brain functional networks can facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. However, current graph learning techniques have several issues on brain network mining. Firstly, most current graph learning models are designed for unsigned graph, which hinders the analysis of many signed network data (e.g., brain functional networks). Meanwhile, the insufficiency of brain network data limits the model performance on clinical phenotypes predictions. Moreover, few of current graph learning model is interpretable, which may not be capable to provide biological insights for model outcomes. Here, we propose an interpretable hierarchical signed graph representation learning model to extract graph-level representations from brain functional networks, which can be used for different prediction tasks. In order to further improve the model performance, we also propose a new strategy to augment functional brain network data for contrastive learning. We evaluate this framework on different classification and regression tasks using the data from HCP and OASIS. Our results from extensive experiments demonstrate the superiority of the proposed model compared to several state-of-the-art techniques. Additionally, we use graph saliency maps, derived from these prediction tasks, to demonstrate detection and interpretation of phenotypic biomarkers.


CommPOOL: An Interpretable Graph Pooling Framework for Hierarchical Graph Representation Learning

arXiv.org Artificial Intelligence

Recent years have witnessed the emergence and flourishing of hierarchical graph pooling neural networks (HGPNNs) which are effective graph representation learning approaches for graph level tasks such as graph classification. However, current HGPNNs do not take full advantage of the graph's intrinsic structures (e.g., community structure). Moreover, the pooling operations in existing HGPNNs are difficult to be interpreted. In this paper, we propose a new interpretable graph pooling framework - CommPOOL, that can capture and preserve the hierarchical community structure of graphs in the graph representation learning process. Specifically, the proposed community pooling mechanism in CommPOOL utilizes an unsupervised approach for capturing the inherent community structure of graphs in an interpretable manner. CommPOOL is a general and flexible framework for hierarchical graph representation learning that can further facilitate various graph-level tasks. Evaluations on five public benchmark datasets and one synthetic dataset demonstrate the superior performance of CommPOOL in graph representation learning for graph classification compared to the state-of-the-art baseline methods, and its effectiveness in capturing and preserving the community structure of graphs.


Adversarial Attack on Hierarchical Graph Pooling Neural Networks

arXiv.org Machine Learning

Recent years have witnessed the emergence and development of graph neural networks (GNNs), which have been shown as a powerful approach for graph representation learning in many tasks, such as node classification and graph classification. The research on the robustness of these models has also started to attract attentions in the machine learning field. However, most of the existing work in this area focus on the GNNs for node-level tasks, while little work has been done to study the robustness of the GNNs for the graph classification task. In this paper, we aim to explore the vulnerability of the Hierarchical Graph Pooling (HGP) Neural Networks, which are advanced GNNs that perform very well in the graph classification in terms of prediction accuracy. We propose an adversarial attack framework for this task. Specifically, we design a surrogate model that consists of convolutional and pooling operators to generate adversarial samples to fool the hierarchical GNN-based graph classification models. We set the preserved nodes by the pooling operator as our attack targets, and then we perturb the attack targets slightly to fool the pooling operator in hierarchical GNNs so that they will select the wrong nodes to preserve. We show the adversarial samples generated from multiple datasets by our surrogate model have enough transferability to attack current state-of-art graph classification models. Furthermore, we conduct the robust train on the target models and demonstrate that the retrained graph classification models are able to better defend against the attack from the adversarial samples. To the best of our knowledge, this is the first work on the adversarial attack against hierarchical GNN-based graph classification models.


Securing Behavior-based Opinion Spam Detection

arXiv.org Machine Learning

Reviews spams are prevalent in e-commerce to manipulate product ranking and customers decisions maliciously. While spams generated based on simple spamming strategy can be detected effectively, hardened spammers can evade regular detectors via more advanced spamming strategies. Previous work gave more attention to evasion against text and graph-based detectors, but evasions against behavior-based detectors are largely ignored, leading to vulnerabilities in spam detection systems. Since real evasion data are scarce, we first propose EMERAL (Evasion via Maximum Entropy and Rating sAmpLing) to generate evasive spams to certain existing detectors. EMERAL can simulate spammers with different goals and levels of knowledge about the detectors, targeting at different stages of the life cycle of target products. We show that in the evasion-defense dynamic, only a few evasion types are meaningful to the spammers, and any spammer will not be able to evade too many detection signals at the same time. We reveal that some evasions are quite insidious and can fail all detection signals. We then propose DETER (Defense via Evasion generaTion using EmeRal), based on model re-training on diverse evasive samples generated by EMERAL. Experiments confirm that DETER is more accurate in detecting both suspicious time window and individual spamming reviews. In terms of security, DETER is versatile enough to be vaccinated against diverse and unexpected evasions, is agnostic about evasion strategy and can be released without privacy concern.