Yu, Qi
Joint Optimal Transport and Embedding for Network Alignment
Yu, Qi, Zeng, Zhichen, Yan, Yuchen, Ying, Lei, Srikant, R., Tong, Hanghang
Network alignment, which aims to find node correspondence across different networks, is the cornerstone of various downstream multi-network and Web mining tasks. Most of the embedding-based methods indirectly model cross-network node relationships by contrasting positive and negative node pairs sampled from hand-crafted strategies, which are vulnerable to graph noises and lead to potential misalignment of nodes. Another line of work based on the optimal transport (OT) theory directly models cross-network node relationships and generates noise-reduced alignments. However, OT methods heavily rely on fixed, pre-defined cost functions that prohibit end-to-end training and are hard to generalize. In this paper, we aim to unify the embedding and OT-based methods in a mutually beneficial manner and propose a joint optimal transport and embedding framework for network alignment named JOENA. For one thing (OT for embedding), through a simple yet effective transformation, the noise-reduced OT mapping serves as an adaptive sampling strategy directly modeling all cross-network node pairs for robust embedding learning.For another (embedding for OT), on top of the learned embeddings, the OT cost can be gradually trained in an end-to-end fashion, which further enhances the alignment quality. With a unified objective, the mutual benefits of both methods can be achieved by an alternating optimization schema with guaranteed convergence. Extensive experiments on real-world networks validate the effectiveness and scalability of JOENA, achieving up to 16% improvement in MRR and 20x speedup compared with the state-of-the-art alignment methods.
Reinforced Compressive Neural Architecture Search for Versatile Adversarial Robustness
Wang, Dingrong, Sapkota, Hitesh, Tao, Zhiqiang, Yu, Qi
Prior neural architecture search (NAS) for adversarial robustness works have discovered that a lightweight and adversarially robust neural network architecture could exist in a non-robust large teacher network, generally disclosed by heuristic rules through statistical analysis and neural architecture search, generally disclosed by heuristic rules from neural architecture search. However, heuristic methods cannot uniformly handle different adversarial attacks and "teacher" network capacity. To solve this challenge, we propose a Reinforced Compressive Neural Architecture Search (RC-NAS) for Versatile Adversarial Robustness. Specifically, we define task settings that compose datasets, adversarial attacks, and teacher network information. Given diverse tasks, we conduct a novel dual-level training paradigm that consists of a meta-training and a fine-tuning phase to effectively expose the RL agent to diverse attack scenarios (in meta-training), and making it adapt quickly to locate a sub-network (in fine-tuning) for any previously unseen scenarios. Experiments show that our framework could achieve adaptive compression towards different initial teacher networks, datasets, and adversarial attacks, resulting in more lightweight and adversarially robust architectures.
On Model Explanations with Transferable Neural Pathways
Lin, Xinmiao, Bao, Wentao, Yu, Qi, Kong, Yu
Neural pathways as model explanations consist of a sparse set of neurons that provide the same level of prediction performance as the whole model. Existing methods primarily focus on accuracy and sparsity but the generated pathways may offer limited interpretability thus fall short in explaining the model behavior. In this paper, we suggest two interpretability criteria of neural pathways: (i) same-class neural pathways should primarily consist of class-relevant neurons; (ii) each instance's neural pathway sparsity should be optimally determined. To this end, we propose a Generative Class-relevant Neural Pathway (GEN-CNP) model that learns to predict the neural pathways from the target model's feature maps. We propose to learn class-relevant information from features of deep and shallow layers such that same-class neural pathways exhibit high similarity. We further impose a faithfulness criterion for GEN-CNP to generate pathways with instance-specific sparsity. We propose to transfer the class-relevant neural pathways to explain samples of the same class and show experimentally and qualitatively their faithfulness and interpretability.
Learn to Accumulate Evidence from All Training Samples: Theory and Practice
Pandey, Deep, Yu, Qi
Evidential deep learning, built upon belief theory and subjective logic, offers a principled and computationally efficient way to turn a deterministic neural network uncertainty-aware. The resultant evidential models can quantify fine-grained uncertainty using the learned evidence. To ensure theoretically sound evidential models, the evidence needs to be non-negative, which requires special activation functions for model training and inference. This constraint often leads to inferior predictive performance compared to standard softmax models, making it challenging to extend them to many large-scale datasets. To unveil the real cause of this undesired behavior, we theoretically investigate evidential models and identify a fundamental limitation that explains the inferior performance: existing evidential activation functions create zero evidence regions, which prevent the model to learn from training samples falling into such regions. A deeper analysis of evidential activation functions based on our theoretical underpinning inspires the design of a novel regularizer that effectively alleviates this fundamental limitation. Extensive experiments over many challenging real-world datasets and settings confirm our theoretical findings and demonstrate the effectiveness of our proposed approach.
Scaling Up Dynamic Graph Representation Learning via Spiking Neural Networks
Li, Jintang, Yu, Zhouxin, Zhu, Zulun, Chen, Liang, Yu, Qi, Zheng, Zibin, Tian, Sheng, Wu, Ruofan, Meng, Changhua
Recent years have seen a surge in research on dynamic graph representation learning, which aims to model temporal graphs that are dynamic and evolving constantly over time. However, current work typically models graph dynamics with recurrent neural networks (RNNs), making them suffer seriously from computation and memory overheads on large temporal graphs. So far, scalability of dynamic graph representation learning on large temporal graphs remains one of the major challenges. In this paper, we present a scalable framework, namely SpikeNet, to efficiently capture the temporal and structural patterns of temporal graphs. We explore a new direction in that we can capture the evolving dynamics of temporal graphs with spiking neural networks (SNNs) instead of RNNs. As a low-power alternative to RNNs, SNNs explicitly model graph dynamics as spike trains of neuron populations and enable spike-based propagation in an efficient way. Experiments on three large real-world temporal graph datasets demonstrate that SpikeNet outperforms strong baselines on the temporal node classification task with lower computational costs. Particularly, SpikeNet generalizes to a large temporal graph (2.7M nodes and 13.9M edges) with significantly fewer parameters and computation overheads.Our code is publicly available at \url{https://github.com/EdisonLeeeee/SpikeNet}.
Evidential Conditional Neural Processes
Pandey, Deep Shankar, Yu, Qi
The Conditional Neural Process (CNP) family of models offer a promising direction to tackle few-shot problems by achieving better scalability and competitive predictive performance. However, the current CNP models only capture the overall uncertainty for the prediction made on a target data point. They lack a systematic fine-grained quantification on the distinct sources of uncertainty that are essential for model training and decision-making under the few-shot setting. We propose Evidential Conditional Neural Processes (ECNP), which replace the standard Gaussian distribution used by CNP with a much richer hierarchical Bayesian structure through evidential learning to achieve epistemic-aleatoric uncertainty decomposition. The evidential hierarchical structure also leads to a theoretically justified robustness over noisy training tasks. Theoretical analysis on the proposed ECNP establishes the relationship with CNP while offering deeper insights on the roles of the evidential parameters. Extensive experiments conducted on both synthetic and real-world data demonstrate the effectiveness of our proposed model in various few-shot settings.
Uncertainty-Aware Multiple Instance Learning from Large-Scale Long Time Series Data
Zhu, Yuansheng, Shi, Weishi, Pandey, Deep Shankar, Liu, Yang, Que, Xiaofan, Krutz, Daniel E., Yu, Qi
We propose a novel framework to classify large-scale time series data with long duration. Long time seriesclassification (L-TSC) is a challenging problem because the dataoften contains a large amount of irrelevant information to theclassification target. The irrelevant period degrades the classifica-tion performance while the relevance is unknown to the system.This paper proposes an uncertainty-aware multiple instancelearning (MIL) framework to identify the most relevant periodautomatically. The predictive uncertainty enables designing anattention mechanism that forces the MIL model to learn from thepossibly discriminant period. Moreover, the predicted uncertaintyyields a principled estimator to identify whether a prediction istrustworthy or not. We further incorporate another modality toaccommodate unreliable predictions by training a separate modelbased on its availability and conduct uncertainty aware fusion toproduce the final prediction. Systematic evaluation is conductedon the Automatic Identification System (AIS) data, which is col-lected to identify and track real-world vessels. Empirical resultsdemonstrate that the proposed method can effectively detect thetypes of vessels based on the trajectory and the uncertainty-awarefusion with other available data modality (Synthetic-ApertureRadar or SAR imagery is used in our experiments) can furtherimprove the detection accuracy.
A Feasible Nonconvex Relaxation Approach to Feature Selection
Gao, Cuixia (Zhejiang University) | Wang, Naiyan (Zhejiang University) | Yu, Qi (Zhejiang University) | Zhang, Zhihua (Zhejiang University)
Variable selection problems are typically addressed under apenalized optimization framework. Nonconvex penalties such as the minimax concave plus (MCP) and smoothly clipped absolute deviation(SCAD), have been demonstrated to have the properties of sparsity practically and theoretically. In this paper we propose a new nonconvex penalty that we call exponential-type penalty. The exponential-type penalty is characterized by a positive parameter,which establishes a connection with the ell 0 and ell 1 penalties.We apply this new penalty to sparse supervised learning problems. To solve to resulting optimization problem, we resort to a reweighted ell 1 minimization method. Moreover, we devise an efficient method for the adaptive update of the tuning parameter. Our experimental results are encouraging. They show that the exponential-type penalty is competitive with MCP and SCAD.