Kang, Jian
Understanding and Rectifying Safety Perception Distortion in VLMs
Zou, Xiaohan, Kang, Jian, Kesidis, George, Lin, Lu
Recent studies reveal that vision-language models (VLMs) become more susceptible to harmful requests and jailbreak attacks after integrating the vision modality, exhibiting greater vulnerability than their text-only LLM backbones. To uncover the root cause of this phenomenon, we conduct an in-depth analysis and identify a key issue: multimodal inputs introduce a modality-induced activation shift toward a "safer" direction compared to their text-only counterparts, leading VLMs to systematically overestimate the safety of harmful inputs. We refer to this issue as safety perception distortion. To mitigate such distortion, we propose Activation Shift Disentanglement and Calibration (ShiftDC), a training-free method that decomposes and calibrates the modality-induced activation shift to reduce the impact of modality on safety. By isolating and removing the safety-relevant component, ShiftDC restores the inherent safety alignment of the LLM backbone while preserving the vision-language capabilities of VLMs. Empirical results demonstrate that ShiftDC significantly enhances alignment performance on safety benchmarks without impairing model utility.
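The idea of isolating and removing the safety-relevant component of an activation shift can be illustrated with a small vector projection. The sketch below is not the authors' implementation: the function name, the use of a single hidden-state vector, and the assumption that a "safety direction" vector is already available are all hypothetical choices made for illustration.

```python
import numpy as np

def remove_safety_component(h_text, h_multi, safety_dir):
    """Remove the part of the modality-induced shift that lies along safety_dir.

    h_text     : activation for the text-only input (hypothetical, 1-D array)
    h_multi    : activation for the matching multimodal input (1-D array)
    safety_dir : assumed safety-relevant direction, same shape, nonzero
    """
    h_text = np.asarray(h_text, dtype=float)
    h_multi = np.asarray(h_multi, dtype=float)
    d = np.asarray(safety_dir, dtype=float)
    d = d / np.linalg.norm(d)              # unit vector along the assumed direction
    shift = h_multi - h_text               # modality-induced activation shift
    safety_part = np.dot(shift, d) * d     # projection of the shift onto d
    return h_multi - safety_part           # keep only the orthogonal part of the shift
```

After calibration, the multimodal activation has the same projection onto the assumed safety direction as the text-only activation, while the rest of the shift is left untouched.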
PageRank Bandits for Link Prediction
Ban, Yikun, Zou, Jiaru, Li, Zihao, Qi, Yunzhe, Fu, Dongqi, Kang, Jian, Tong, Hanghang, He, Jingrui
Link prediction is a critical problem in graph learning with broad applications such as recommender systems and knowledge graph completion. Numerous research efforts have been directed at solving this problem, including approaches based on similarity metrics and Graph Neural Networks (GNN). However, most existing solutions are still rooted in conventional supervised learning, which makes it challenging to adapt over time to changing customer interests and to address the inherent dilemma of exploitation versus exploration in link prediction. To tackle these challenges, this paper reformulates link prediction as a sequential decision-making process, where each link prediction interaction occurs sequentially. We propose a novel fusion algorithm, PRB (PageRank Bandits), which is the first to combine contextual bandits with PageRank for collaborative exploitation and exploration. We also introduce a new reward formulation and provide a theoretical performance guarantee for PRB. Finally, we extensively evaluate PRB in both online and offline settings, comparing it with bandit-based and graph-based methods. The empirical success of PRB demonstrates the value of the proposed fusion approach. Our code is released at https://github.com/jiaruzouu/PRB.
Adaptive Bayesian Multivariate Spline Knot Inference with Prior Specifications on Model Complexity
He, Junhui, Yang, Ying, Kang, Jian
In multivariate spline regression, the number and locations of knots significantly influence performance and interpretability. However, due to non-differentiability and varying dimensions, there is no desirable frequentist method to make inference on knots. In this article, we propose a fully Bayesian approach for knot inference in multivariate spline regression. Existing Bayesian methods often use BIC to calculate the posterior, but BIC is too liberal and heavily overestimates the number of knots when the candidate model space is large. We specify a new prior on the number of knots to take into account the complexity of the model space and derive an analytic formula in the normal model. In the non-normal cases, we utilize the extended Bayesian information criterion to approximate the posterior density. The samples are simulated in the space with differing dimensions via reversible jump Markov chain Monte Carlo. We apply the proposed method to knot inference and manifold denoising. Experiments demonstrate the strong performance of the algorithm, especially in fitting functions with jump discontinuities.
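For reference, the extended Bayesian information criterion in its commonly cited form adds a term that grows with the number of candidate models of a given size, which is what makes it stricter than BIC on large model spaces. The sketch below assumes that form; the function name and the mapping of `p` and `k` onto candidate and selected knots are illustrative assumptions, not details taken from the paper.

```python
import math

def ebic(log_lik, k, n, p, gamma=1.0):
    """Extended BIC: -2 log L + k log n + 2 * gamma * log C(p, k).

    log_lik : maximized log-likelihood of the candidate model
    k       : number of selected components (assumed here: knots)
    n       : sample size
    p       : number of candidate components (assumed here: candidate knots)
    gamma   : weight of the model-space term; gamma = 0 gives ordinary BIC
    """
    model_space_term = 2.0 * gamma * math.log(math.comb(p, k))
    return -2.0 * log_lik + k * math.log(n) + model_space_term
```

With `gamma=0.0` the function reduces to BIC, so the two criteria can be compared on the same fitted models.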
Theoretical and Empirical Insights into the Origins of Degree Bias in Graph Neural Networks
Subramonian, Arjun, Kang, Jian, Sun, Yizhou
Graph Neural Networks (GNNs) often perform better for high-degree nodes than low-degree nodes on node classification tasks. This degree bias can reinforce social marginalization by, e.g., sidelining authors of lowly-cited papers when predicting paper topics in citation networks. While researchers have proposed numerous hypotheses for why GNN degree bias occurs, we find via a survey of 38 degree bias papers that these hypotheses are often not rigorously validated, and can even be contradictory. Thus, we provide an analysis of the origins of degree bias in message-passing GNNs with different graph filters. We prove that high-degree test nodes tend to have a lower probability of misclassification regardless of how GNNs are trained. Moreover, we show that degree bias arises from a variety of factors that are associated with a node's degree (e.g., homophily of neighbors, diversity of neighbors). Furthermore, we show that during training, some GNNs may adjust their loss on low-degree nodes more slowly than on high-degree nodes; however, with sufficiently many epochs of training, message-passing GNNs can achieve their maximum possible training accuracy, which is not significantly limited by their expressive power. Throughout our analysis, we connect our findings to previously-proposed hypotheses for the origins of degree bias, supporting and unifying some while drawing doubt to others. We validate our theoretical findings on 8 common real-world networks, and based on our theoretical and empirical insights, describe a roadmap to alleviate degree bias.
On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method
Cong, Weilin, Kang, Jian, Tong, Hanghang, Mahdavi, Mehrdad
Temporal graph learning (TGL) has emerged as an important machine learning problem and is widely used in a number of real-world applications, such as traffic prediction [Yuan and Li, 2021, Zhang et al., 2021], knowledge graphs [Cai et al., 2022, Leblay and Chekol, 2018], and recommender systems [Kumar et al., 2019, Rossi et al., 2020, Xu et al., 2020a]. A typical downstream task of temporal graph learning is link prediction, which focuses on predicting future interactions among nodes. For example, in an online video recommender system, the user-video clicks can be modeled as a temporal graph whose nodes represent users and videos, and links are associated with timestamps indicating when users click videos. Link prediction between nodes can be used to predict if and when a user is interested in a video. Therefore, designing graph learning models that can capture node evolutionary patterns and accurately predict future links is important. TGL is generally more challenging than static graph learning, thereby requiring more sophisticated algorithms to model the temporal evolutionary patterns [Huang et al., 2023]. In recent years, many TGL algorithms [Kumar et al., 2019, Xu et al., 2020a, Rossi et al., 2020, Sankar et al., 2020, Wang et al., 2021e] have been proposed that leverage memory blocks, self-attention, time-encoding functions, recurrent neural networks (RNNs), temporal walks, and message passing to better capture the meaningful structural or temporal patterns.
For instance, JODIE [Kumar et al., 2019] maintains a memory block for each node and utilizes an RNN to update the memory blocks upon the occurrence of each interaction; TGAT [Xu et al., 2020a] utilizes self-attention message passing to aggregate neighbor information on the temporal graph; TGN [Rossi et al., 2020] combines memory blocks with message passing to allow each node in the temporal graph to have a receptive field that is not limited by the number of message-passing layers; DySAT [Sankar et al., 2020] uses self-attention to capture structural information and uses an RNN to capture temporal dependencies; CAW [Wang et al., 2021e] captures temporal dependencies between nodes by performing multiple temporal walks from the root
Deceptive Fairness Attacks on Graphs via Meta Learning
Kang, Jian, Xia, Yinglong, Maciejewski, Ross, Luo, Jiebo, Tong, Hanghang
We study deceptive fairness attacks on graphs to answer the following question: How can we achieve poisoning attacks on a graph learning model to exacerbate the bias deceptively? We answer this question via a bi-level optimization problem and propose a meta learning-based framework named FATE. FATE is broadly applicable with respect to various fairness definitions and graph learning models, as well as arbitrary choices of manipulation operations. We further instantiate FATE to attack statistical parity and individual fairness on graph neural networks. We conduct extensive experimental evaluations on real-world datasets in the task of semi-supervised node classification. The experimental results demonstrate that FATE could amplify the bias of graph neural networks with or without fairness consideration while maintaining the utility on the downstream task. We hope this paper provides insights into the adversarial robustness of fair graph learning and can shed light on designing robust and fair graph learning in future studies.
Penalized Deep Partially Linear Cox Models with Application to CT Scans of Lung Cancer Patients
Sun, Yuming, Kang, Jian, Haridas, Chinmay, Mayne, Nicholas R., Potter, Alexandra L., Yang, Chi-Fu Jeffrey, Christiani, David C., Li, Yi
Lung cancer is a leading cause of cancer mortality globally, highlighting the importance of understanding its mortality risks to design effective patient-centered therapies. The National Lung Screening Trial (NLST) employed computed tomography texture analysis, which provides objective measurements of texture patterns on CT scans, to quantify the mortality risks of lung cancer patients. Partially linear Cox models have gained popularity for survival analysis by dissecting the hazard function into parametric and nonparametric components, allowing for the effective incorporation of both well-established risk factors (such as age and clinical variables) and emerging risk factors (e.g., image features) within a unified framework. However, when the dimension of parametric components exceeds the sample size, the task of model fitting becomes formidable, while nonparametric modeling grapples with the curse of dimensionality. We propose a novel Penalized Deep Partially Linear Cox Model (Penalized DPLC), which incorporates the SCAD penalty to select important texture features and employs a deep neural network to estimate the nonparametric component of the model. We prove the convergence and asymptotic properties of the estimator and compare it to other methods through extensive simulation studies, evaluating its performance in risk prediction and feature selection. The proposed method is applied to the NLST study dataset to uncover the effects of key clinical and imaging risk factors on patients' survival. Our findings provide valuable insights into the relationship between these factors and survival outcomes.
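The SCAD penalty mentioned above has a standard piecewise form: linear near zero, quadratic in a transition region, and constant for large coefficients, so large effects are not shrunk further. The sketch below implements that standard form with the conventional default a = 3.7; the function name is illustrative, and how the penalty is combined with the partial likelihood and the neural network is not shown here.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """Elementwise SCAD penalty for coefficient(s) theta, with lam > 0 and a > 2."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                               # |theta| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                                # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))
```

The three pieces agree at the breakpoints |theta| = lam and |theta| = a * lam, so the penalty is continuous.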
Ensuring User-side Fairness in Dynamic Recommender Systems
Yoo, Hyunsik, Zeng, Zhichen, Kang, Jian, Liu, Zhining, Zhou, David, Wang, Fei, Chan, Eunice, Tong, Hanghang
User-side group fairness is crucial for modern recommender systems, as it aims to alleviate performance disparity between groups of users defined by sensitive attributes such as gender, race, or age. We find that the disparity tends to persist or even increase over time. This calls for effective ways to address user-side fairness in a dynamic environment, which has been infrequently explored in the literature. However, fairness-constrained re-ranking, a typical method to ensure user-side fairness (i.e., reducing performance disparity), faces two fundamental challenges in the dynamic setting: (1) non-differentiability of the ranking-based fairness constraint, which hinders the end-to-end training paradigm, and (2) time-inefficiency, which impedes quick adaptation to changes in user preferences. In this paper, we propose FAir Dynamic rEcommender (FADE), an end-to-end framework with a fine-tuning strategy to dynamically alleviate performance disparity. To tackle the above challenges, FADE uses a novel fairness loss designed to be differentiable and lightweight to fine-tune model parameters to ensure both user-side fairness and high-quality recommendations. Via extensive experiments on the real-world dataset, we empirically demonstrate that FADE effectively and efficiently reduces performance disparity, and furthermore, FADE improves overall recommendation quality over time compared to not using any new data.
BeMap: Balanced Message Passing for Fair Graph Neural Network
Lin, Xiao, Kang, Jian, Cong, Weilin, Tong, Hanghang
Graph neural networks (GNNs) have shown strong empirical performance in many downstream tasks by iteratively aggregating information from the local neighborhood of each node, i.e., message passing. However, concrete evidence has revealed that a graph neural network could be biased against certain demographic groups, which calls for the consideration of algorithmic fairness. Despite the increasing efforts in ensuring algorithmic fairness on graph neural networks, existing methods often do not explicitly consider the bias induced by message passing in GNNs during training. In this paper, we first investigate the problem of bias amplification in message passing. We empirically and theoretically demonstrate that message passing could amplify the bias when the 1-hop neighbors from different demographic groups are unbalanced. Guided by such analyses, we propose BeMap, a fair message passing method that leverages a balance-aware sampling strategy to balance the number of 1-hop neighbors of each node among different demographic groups. Extensive experiments on node classification demonstrate the efficacy of our proposed BeMap method in mitigating bias while maintaining classification accuracy.
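One simple way to picture balancing 1-hop neighbors across demographic groups is to subsample each node's neighborhood so that every group present contributes the same number of neighbors. The sketch below is a hypothetical illustration of that idea only: the abstract does not specify the sampling rule, so the function name, the "smallest group sets the quota" rule, and the handling of single-group neighborhoods are assumptions.

```python
import random
from collections import defaultdict

def balanced_neighbor_sample(neighbors, group_of, rng=None):
    """Sample the same number of 1-hop neighbors from each group that is present.

    neighbors : list of neighbor node ids for one node
    group_of  : dict mapping node id -> demographic group label
    rng       : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    by_group = defaultdict(list)
    for v in neighbors:
        by_group[group_of[v]].append(v)
    if not by_group:
        return []
    # Assumption: the smallest group present sets the per-group quota.
    k = min(len(members) for members in by_group.values())
    sampled = []
    for members in by_group.values():
        sampled.extend(rng.sample(members, k))
    return sampled
```

Under this rule, a node whose neighbors all belong to one group keeps its full neighborhood; a real method would need an explicit policy for that case.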
Sequential Best-Arm Identification with Application to Brain-Computer Interface
Zhou, Xin, Hao, Botao, Kang, Jian, Lattimore, Tor, Li, Lexin
A brain-computer interface (BCI) is a technology that enables direct communication between the brain and an external device or computer system. It allows individuals to interact with the device using only their thoughts, and holds immense potential for a wide range of applications in medicine, rehabilitation, and human augmentation. An electroencephalogram (EEG) and event-related potential (ERP)-based speller system is a type of BCI that allows users to spell words without using a physical keyboard, but instead by recording and interpreting brain signals under different stimulus presentation paradigms. Conventional non-adaptive paradigms treat each word selection independently, leading to a lengthy learning process. To improve the sampling efficiency, we cast the problem as a sequence of best-arm identification tasks in multi-armed bandits. Leveraging pre-trained large language models (LLMs), we utilize the prior knowledge learned from previous tasks to inform and facilitate subsequent tasks. To do so in a coherent way, we propose a sequential top-two Thompson sampling (STTS) algorithm under the fixed-confidence setting and the fixed-budget setting. We study the theoretical property of the proposed algorithm, and demonstrate its substantial empirical improvement through both synthetic data analysis and a P300 BCI speller simulator example.
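The selection step of generic top-two Thompson sampling, on which STTS builds, can be sketched as follows. This is the standard single-task rule with independent Gaussian posteriors, not the sequential algorithm proposed in the paper: the function name, the Gaussian posterior assumption, the `max_resample` cap, and the fallback rule are all illustrative assumptions, and the use of language-model priors across tasks is not modeled.

```python
import numpy as np

def top_two_thompson_select(post_mean, post_std, beta=0.5, rng=None, max_resample=100):
    """Pick an arm with the top-two Thompson sampling rule (needs at least 2 arms).

    post_mean, post_std : per-arm Gaussian posterior means and standard deviations
    beta                : probability of playing the sampled leader
    max_resample        : cap on resampling attempts (an assumption of this sketch)
    """
    rng = np.random.default_rng() if rng is None else rng
    post_mean = np.asarray(post_mean, dtype=float)
    post_std = np.asarray(post_std, dtype=float)

    sample = rng.normal(post_mean, post_std)
    leader = int(np.argmax(sample))
    if rng.random() < beta:
        return leader

    # Resample until a different arm comes out on top: the challenger.
    for _ in range(max_resample):
        sample = rng.normal(post_mean, post_std)
        challenger = int(np.argmax(sample))
        if challenger != leader:
            return challenger

    # Fallback: best arm of the last sample once the leader is excluded.
    masked = sample.copy()
    masked[leader] = -np.inf
    return int(np.argmax(masked))
```

Playing the challenger with probability 1 - beta forces samples onto the closest competitor of the current best arm, which is what makes the rule suited to best-arm identification rather than regret minimization.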