Lee, Ivan
FairGP: A Scalable and Fair Graph Transformer Using Graph Partitioning
Luo, Renqiang, Huang, Huafei, Lee, Ivan, Xu, Chengpei, Qi, Jianzhong, Xia, Feng
Recent studies have highlighted significant fairness issues in Graph Transformer (GT) models, particularly against subgroups defined by sensitive features. Additionally, GTs are computationally intensive and memory-demanding, limiting their application to large-scale graphs. Our experiments demonstrate that graph partitioning can enhance the fairness of GT models while reducing computational complexity. To understand this improvement, we conducted a theoretical investigation into the root causes of fairness issues in GT models. We found that the sensitive features of higher-order nodes disproportionately influence lower-order nodes, resulting in sensitive feature bias. We propose Fairness-aware scalable GT based on Graph Partitioning (FairGP), which partitions the graph to minimize the negative impact of higher-order nodes. By optimizing attention mechanisms, FairGP mitigates the bias introduced by global attention, thereby enhancing fairness. Extensive empirical evaluations on six real-world datasets validate the superior performance of FairGP in achieving fairness compared to state-of-the-art methods. The codes are available at https://github.com/LuoRenqiang/FairGP.
Collaborative Team Recognition: A Core Plus Extension Structure
Yu, Shuo, Alqahtani, Fayez, Tolba, Amr, Lee, Ivan, Jia, Tao, Xia, Feng
Scientific collaboration is a significant behavior in knowledge creation and idea exchange. To tackle large and complex research questions, a trend of team formation has been observed in recent decades. In this study, we focus on recognizing collaborative teams and exploring inner patterns using scholarly big graph data. We propose a collaborative team recognition (CORE) model with a "core + extension" team structure to recognize collaborative teams in large academic networks. In CORE, we combine an effective evaluation index called the collaboration intensity index with a series of structural features to recognize collaborative teams in which members are in close collaboration relationships. Then, CORE is used to guide the core team members to their extension members. CORE can also serve as the foundation for team-based research. The simulation results indicate that CORE reveals inner patterns of scientific collaboration: senior scholars have broad collaborative relationships and fixed collaboration patterns, which are the underlying mechanisms of team assembly. The experimental results demonstrate that CORE is promising compared with state-of-the-art methods.
Exploring the Relationship Between Model Architecture and In-Context Learning Ability
Lee, Ivan, Jiang, Nan, Berg-Kirkpatrick, Taylor
What is the relationship between model architecture and the ability to perform in-context learning? In this empirical study, we take the first steps toward answering this question. We evaluate twelve model architectures capable of causal language modeling across a suite of synthetic in-context learning tasks. These selected architectures represent a broad range of paradigms, including recurrent and convolution-based neural networks, transformers, state-space model inspired, and other emerging attention alternatives. We discover that all the considered architectures can perform in-context learning under a wider range of conditions than previously documented. Additionally, we observe stark differences in statistical efficiency and consistency by varying context length and task difficulty. We also measure each architecture's predisposition towards in-context learning when presented with alternative routes for task resolution. Finally, and somewhat surprisingly, we find that several attention alternatives are more robust in-context learners than transformers. Given that such approaches have constant-sized memory footprints at inference time, this result opens the possibility of scaling up in-context learning to accommodate vastly larger numbers of in-context examples. In-context learning (ICL) refers to the ability to learn new tasks at inference time, using only inputoutput pair exemplars as guidance. Radford et al. (2019) demonstrate early signs of this ability in GPT-2, a causal transformer (Vaswani et al., 2017). ICL was further popularized by GPT-3 (Brown et al., 2020), a large language model with the same architectural foundation but augmented with greater capacity and trained on large-scale data. By simply adjusting a natural language prompt, it was shown that GPT-3 could adapt to new tasks, such as translation and question answering, without updating any of its parameters.
Explainable Knowledge Distillation for On-device Chest X-Ray Classification
Termritthikun, Chakkrit, Umer, Ayaz, Suwanwimolkul, Suwichaya, Xia, Feng, Lee, Ivan
Automated multi-label chest X-rays (CXR) image classification has achieved substantial progress in clinical diagnosis via utilizing sophisticated deep learning approaches. However, most deep models have high computational demands, which makes them less feasible for compact devices with low computational requirements. To overcome this problem, we propose a knowledge distillation (KD) strategy to create the compact deep learning model for the real-time multi-label CXR image classification. We study different alternatives of CNNs and Transforms as the teacher to distill the knowledge to a smaller student. Then, we employed explainable artificial intelligence (XAI) to provide the visual explanation for the model decision improved by the KD. Our results on three benchmark CXR datasets show that our KD strategy provides the improved performance on the compact student model, thus being the feasible choice for many limited hardware platforms. For instance, when using DenseNet161 as the teacher network, EEEA-Net-C2 achieved an AUC of 83.7%, 87.1%, and 88.7% on the ChestX-ray14, CheXpert, and PadChest datasets, respectively, with fewer parameters of 4.7 million and computational cost of 0.3 billion FLOPS.
Heterogeneous Graph Learning for Explainable Recommendation over Academic Networks
Chen, Xiangtai, Tang, Tao, Ren, Jing, Lee, Ivan, Chen, Honglong, Xia, Feng
With the explosive growth of new graduates with research degrees every year, unprecedented challenges arise for early-career researchers to find a job at a suitable institution. This study aims to understand the behavior of academic job transition and hence recommend suitable institutions for PhD graduates. Specifically, we design a deep learning model to predict the career move of early-career researchers and provide suggestions. The design is built on top of scholarly/academic networks, which contains abundant information about scientific collaboration among scholars and institutions. We construct a heterogeneous scholarly network to facilitate the exploring of the behavior of career moves and the recommendation of institutions for scholars. We devise an unsupervised learning model called HAI (Heterogeneous graph Attention InfoMax) which aggregates attention mechanism and mutual information for institution recommendation. Moreover, we propose scholar attention and meta-path attention to discover the hidden relationships between several meta-paths. With these mechanisms, HAI provides ordered recommendations with explainability. We evaluate HAI upon a real-world dataset against baseline methods. Experimental results verify the effectiveness and efficiency of our approach.
BRBA: A Blocking-Based Association Rule Hiding Method
Cheng, Peng (Southwest University and Harbin Institute of Technology) | Lee, Ivan (University of South Australia) | Li, Li (Southwest University) | Tseng, Kuo-Kun (Harbin Institute of Technology) | Pan, Jeng-Shyang (Harbin Institute of Technology)
Privacy preserving in association mining is an important research topic in the database security field. This paper has proposed a blocking-based method to solve the association rule hiding problem for data sharing. It aims at reducing undesirable side effects and increasing desirable side effects, while ensuring to conceal all sensitive rules. The candidate transactions are selected for sanitization based on their relations with border rules. Comparative experiments on real datasets demonstrate that the proposed method can achieve its goals.