subgraph
Interpretable Prototype-based Graph Information Bottleneck
The success of Graph Neural Networks (GNNs) has led to a need for understanding their decision-making process and providing explanations for their predictions, which has given rise to explainable AI (XAI) that offers transparent explanations for black-box models. Recently, the use of prototypes has successfully improved the explainability of models by learning prototypes to imply training graphs that affect the prediction. However, these approaches tend to provide prototypes with excessive information from the entire graph, leading to the exclusion of key substructures or the inclusion of irrelevant substructures, which can limit both the interpretability and the performance of the model in downstream tasks. In this work, we propose a novel framework of explainable GNNs, called interpretable Prototype-based Graph Information Bottleneck (PGIB), that incorporates prototype learning within the information bottleneck framework to provide prototypes with the key subgraph from the input graph that is important for the model prediction. This is the first work that incorporates prototype learning into the process of identifying the key subgraphs that have a critical impact on the prediction performance. Extensive experiments, including qualitative analysis, demonstrate that PGIB outperforms state-of-the-art methods in terms of both prediction performance and explainability.
AutoGO: Automated Computation Graph Optimization for Neural Network Evolution
Optimizing Deep Neural Networks (DNNs) to obtain high-quality models for efficient real-world deployment has posed multi-faceted challenges to machine learning engineers. Existing methods either search for neural architectures in heuristic design spaces or apply low-level adjustments to computation primitives to improve inference efficiency on hardware. We present Automated Graph Optimization (AutoGO), a framework to evolve neural networks in a low-level Computation Graph (CG) of primitive operations to improve both its performance and hardware friendliness. Through a tokenization scheme, AutoGO performs variable-sized segment mutations, making both primitive changes and larger-grained changes to CGs. We introduce our segmentation and mutation algorithms, efficient frequent segment mining technique, as well as a pretrained context-aware predictor to estimate the impact of segment replacements. Extensive experimental results show that AutoGO can automatically evolve several typical large convolutional networks to achieve significant task performance improvement and FLOPs reduction on a range of CV tasks, ranging from Classification, Semantic Segmentation, Human Pose Estimation, to Super Resolution, yet without introducing any newer primitive operations. We also demonstrate the lightweight deployment results of AutoGOoptimized super-resolution and denoising U-Nets on a cycle simulator for a Neural Processing Unit (NPU), achieving PSNR improvement and latency/power reduction simultaneously.
Evaluating Post-hoc Explanations for Graph Neural Networks via Robustness Analysis
This work studies the evaluation of explaining graph neural networks (GNNs), which is crucial to the credibility of post-hoc explainability in practical usage. Conventional evaluation metrics, and even explanation methods -- which mainly follow the paradigm of feeding the explanatory subgraph to the model and measuring output difference -- mostly suffer from the notorious out-of-distribution (OOD) issue. Hence, in this work, we endeavor to confront this issue by introducing a novel evaluation metric, termed OOD-resistant Adversarial Robustness (OAR). Specifically, we draw inspiration from adversarial robustness and evaluate post-hoc explanation subgraphs by calculating their robustness under attack. On top of that, an elaborate OOD reweighting block is inserted into the pipeline to confine the evaluation process to the original data distribution. For applications involving large datasets, we further devise a Simplified version of OAR (SimOAR), which achieves a significant improvement in computational efficiency at the cost of a small amount of performance.
e21a7b668ce3ea2c9c964c52d1c9f161-Supplemental-Conference.pdf
Invariant graph representation learning aims to learn the invariance among data from different environments for out-of-distribution generalization on graphs. As the graph environment partitions are usually expensive to obtain, augmenting the environment information has become the de facto approach. However, the usefulness of the augmented environment information has never been verified. In this work, we find that it is fundamentally impossible to learn invariant graph representations via environment augmentation without additional assumptions. Therefore, we develop a set of minimal assumptions, including variation sufficiency and variation consistency, for feasible invariant graph learning.
e21a7b668ce3ea2c9c964c52d1c9f161-Paper-Conference.pdf
Invariant graph representation learning aims to learn the invariance among data from different environments for out-of-distribution generalization on graphs. As the graph environment partitions are usually expensive to obtain, augmenting the environment information has become the de facto approach. However, the usefulness of the augmented environment information has never been verified. In this work, we find that it is fundamentally impossible to learn invariant graph representations via environment augmentation without additional assumptions. Therefore, we develop a set of minimal assumptions, including variation sufficiency and variation consistency, for feasible invariant graph learning.
Faster approximate subgraph counts with privacy
One of the most common problems studied in the context of differential privacy for graph data is counting the number of non-induced embeddings of a subgraph in a given graph. These counts have very high global sensitivity. Therefore, adding noise based on powerful alternative techniques, such as smooth sensitivity and higher-order local sensitivity have been shown to give significantly better accuracy. However, all these alternatives to global sensitivity become computationally very expensive, and to date efficient polynomial time algorithms are known only for few selected subgraphs, such as triangles, k-triangles, and k-stars. In this paper, we show that good approximations to these sensitivity metrics can be still used to get private algorithms. Using this approach, we much faster algorithms for privately counting the number of triangles in real-world social networks, which can be easily parallelized. We also give a private polynomial time algorithm for counting any constant size subgraph using less noise than the global sensitivity; we show this can be improved significantly for counting paths in special classes of graphs.
Appendix ARemovable Variables
In this section, we first prove the proposed graphical representation for a removable variable in a MAGM (Theorem 1). Then, we discuss how this representation reduces to Theorem 5 of [11] in the case of DAGs. Throughout our proofs, we say a path between X and Y is blocked by a set Wif it is not m-connecting relative to W. In this case, there exists a non-collider W on the path which is a member of W, or there exists a collider W on the path such that W/2 Anc({X,Y }[ W). In both cases we say W blocks this path with respect to W, or W blocks the path in short when W is clear from the context. We say X is a descendant of Y if Y 2Anc(X), and we denote by DeM(X) the set of descendants of X in the MAGM, and De(X) whenever the graph is clear from the context. A.1 Graphical representation Theorem 1. Vertex X is removable in a MAGM over the variables V, if and only if 1. for any Y 2Adj(X) and Z 2Ch(X)[N(X)\{Y}, Y and Z are adjacent, and 2. for any collider path u =( X,V1,...,V m,Y) and Z 2 V\{X,Y,V1,...,V m} such that {X,V1,...,V m} Pa(Z), Y and Z are adjacent. Let H denote the induced subgraph of M over V\{X}. For any W V\{X,Y,Z}, (Z,X,Y) is an m-connecting path relative to W in M, as X is a non-collider and X/2W. That is, no such W can m-separate Y and Z. Since X is removable in M, by definition of removability, (Y?Z|W)M ()(Y?Z|W)H. Again for any W V\{X,Y,Z}, (Z,X,V1,...,V m,Y) is an m-connecting path relative to W in M since I) every collider on this path is a parent (and therefore an ancestor) of Z, and II) X/2W and X is the only non-collider on this path. That is, no such W can m-separate Y and Z. Since X is removable in M, Equation 8 implies that Y and Z have no m-separating sets in H. Hence, Y is adjacent to Z in H, and therefore, in M. if part: We need to prove that for any Y,Z 2V\{X} and any W V\{X,Y,Z}, (Y?Z|W)M ()(Y?Z|W)H.