Chen, Ling
A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions
Na, Hongbin, Hua, Yining, Wang, Zimu, Shen, Tao, Yu, Beibei, Wang, Lilin, Wang, Wei, Torous, John, Chen, Ling
Mental health remains a critical global challenge, with increasing demand for accessible, effective interventions. Large language models (LLMs) offer promising solutions in psychotherapy by enhancing the assessment, diagnosis, and treatment of mental health conditions through dynamic, context-aware interactions. This survey provides a comprehensive overview of the current landscape of LLM applications in psychotherapy, highlighting the roles of LLMs in symptom detection, severity estimation, cognitive assessment, and therapeutic interventions. We present a novel conceptual taxonomy to organize the psychotherapy process into three core components: assessment, diagnosis, and treatment, and examine the challenges and advancements in each area. The survey also addresses key research gaps, including linguistic biases, limited disorder coverage, and underrepresented therapeutic models. Finally, we discuss future directions to integrate LLMs into a holistic, end-to-end psychotherapy framework, addressing the evolving nature of mental health conditions and fostering more inclusive, personalized care.
AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection
Qiao, Hezhe, Niu, Chaoxi, Chen, Ling, Pang, Guansong
Graph anomaly detection (GAD) aims to identify abnormal nodes that differ from the majority of the nodes in a graph, a task that has attracted significant attention in recent years. Existing generalist graph models have achieved remarkable success in different graph tasks but struggle to generalize to the GAD task. This limitation arises from their difficulty in learning generalized knowledge for capturing the inherently infrequent, irregular and heterogeneous abnormality patterns in graphs from different domains. To address this challenge, we propose AnomalyGFM, a GAD-oriented graph foundation model that supports zero-shot inference and few-shot prompt tuning for GAD in diverse graph datasets. One key insight is that graph-agnostic representations for normal and abnormal classes are required to support effective zero/few-shot GAD across different graphs. Motivated by this, AnomalyGFM is pre-trained to align data-independent, learnable normal and abnormal class prototypes with node representation residuals (i.e., representation deviation of a node from its neighbors). The residual features essentially project the node information into a unified feature space where we can effectively measure the abnormality of nodes from different graphs in a consistent way. This provides a driving force for the learning of graph-agnostic, discriminative prototypes for the normal and abnormal classes, which can be used to enable zero-shot GAD on new graphs, including very large-scale graphs. If few-shot labeled normal nodes are available in the new graphs, AnomalyGFM can further support prompt tuning to leverage these nodes for better adaptation. Comprehensive experiments on 11 widely used GAD datasets with real anomalies demonstrate that AnomalyGFM significantly outperforms state-of-the-art competing methods under both zero- and few-shot GAD settings.
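The "node representation residual" mentioned in the AnomalyGFM abstract above (the deviation of a node from its neighbors) can be illustrated with a minimal sketch. It assumes raw feature vectors stand in for learned representations and that the neighbor summary is a plain mean; the function name `neighbor_residuals` and the toy graph are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def neighbor_residuals(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """Return each node's vector minus the mean of its neighbors' vectors.

    Assumption: a plain mean over neighbors; the paper may aggregate differently.
    """
    residuals = {}
    for node, vec in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Isolated node: no neighborhood to deviate from.
            residuals[node] = [0.0] * len(vec)
            continue
        mean = [
            sum(features[n][d] for n in nbrs) / len(nbrs)
            for d in range(len(vec))
        ]
        residuals[node] = [v - m for v, m in zip(vec, mean)]
    return residuals


# Toy graph (hypothetical): node "d" looks unlike its only neighbor "a".
features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 0.0], "d": [5.0, 4.0]}
neighbors = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
res = neighbor_residuals(features, neighbors)
```

In this toy graph the residual of "d" has a larger magnitude than that of "b", which matches the intuition that a residual exposes how far a node sits from its neighborhood regardless of which graph it comes from.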
MTPChat: A Multimodal Time-Aware Persona Dataset for Conversational Agents
Yang, Wanqi, Li, Yanda, Fang, Meng, Chen, Ling
Understanding temporal dynamics is critical for conversational agents, enabling effective content analysis and informed decision-making. However, time-aware datasets, particularly for persona-grounded conversations, are still limited, which narrows their scope and diminishes their complexity. To address this gap, we introduce MTPChat, a multimodal, time-aware persona dialogue dataset that integrates linguistic, visual, and temporal elements within dialogue and persona memory. Leveraging MTPChat, we propose two time-sensitive tasks: Temporal Next Response Prediction (TNRP) and Temporal Grounding Memory Prediction (TGMP), both designed to assess a model's ability to understand implicit temporal cues and dynamic interactions. Additionally, we present an innovative framework featuring an adaptive temporal module to effectively integrate multimodal streams and capture temporal dependencies. Experimental results validate the challenges posed by MTPChat and demonstrate the effectiveness of our framework in multimodal time-sensitive scenarios.
AdaF^2M^2: Comprehensive Learning and Responsive Leveraging Features in Recommendation System
Zhu, Yongchun, Chen, Jingwu, Chen, Ling, Li, Yitan, Zhang, Feng, Yang, Xiao, Liu, Zuotao
Feature modeling, which involves feature representation learning and leveraging, plays an essential role in industrial recommendation systems. However, the data distribution in real-world applications usually follows a highly skewed long-tail pattern due to popularity bias, which easily leads to over-reliance on ID-based features, such as user/item IDs and ID sequences of interactions. Such over-reliance makes it hard for models to learn features comprehensively, especially non-ID meta features, e.g., user/item characteristics. Further, it limits the models' ability to leverage features, making them less generalizable and more susceptible to data noise. Previous studies on feature modeling focus on feature extraction and interaction, and hardly notice the problems brought about by the long-tail data distribution. To achieve better feature representation learning and leveraging on real-world data, we propose a model-agnostic framework, AdaF^2M^2, short for Adaptive Feature Modeling with Feature Mask. The feature-mask mechanism helps comprehensive feature learning via multi-forward training with augmented samples, while the adapter applies adaptive weights on features responsive to different user/item states. By arming base models with AdaF^2M^2, we conduct online A/B tests on multiple recommendation scenarios, obtaining +1.37% and +1.89% cumulative improvements on user active days and app duration, respectively. Extended offline experiments on different models show improvements as well. AdaF^2M^2 has been widely deployed on both retrieval and ranking tasks in multiple applications of Douyin Group, indicating its superior effectiveness and universality.
Generalized Grade-of-Membership Estimation for High-dimensional Locally Dependent Data
Chen, Ling, Huang, Chengzhu, Gu, Yuqi
This work focuses on the mixed membership models for multivariate categorical data widely used for analyzing survey responses and population genetics data. These grade of membership (GoM) models offer rich modeling power but present significant estimation challenges for high-dimensional polytomous data. Popular existing approaches, such as Bayesian MCMC inference, are not scalable and lack theoretical guarantees in high-dimensional settings. To address this, we first observe that data from this model can be reformulated as a three-way (quasi-)tensor, with many subjects responding to many items with varying numbers of categories. We introduce a novel and simple approach that flattens the three-way quasi-tensor into a "fat" matrix and then performs a singular value decomposition of it to estimate parameters by exploiting the singular subspace geometry. Our fast spectral method can accommodate a broad range of data distributions with arbitrarily locally dependent noise, which we formalize as the generalized-GoM models. We establish finite-sample entrywise error bounds for the generalized-GoM model parameters. This is supported by a new sharp two-to-infinity singular subspace perturbation theory for locally dependent and flexibly distributed noise, a contribution of independent interest. Simulations and applications to data in political surveys, population genetics, and single-cell sequencing demonstrate our method's superior performance.
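The flattening step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: responses are coded as integers 0..C_j-1, and each item is expanded into a full one-hot block of width C_j (the paper may instead drop a baseline category). The function name `flatten_responses` and the toy data are hypothetical. The singular value decomposition that follows in the method is not shown; any linear algebra library could be applied to the resulting matrix.

```python
from typing import List


def flatten_responses(
    responses: List[List[int]],
    num_categories: List[int],
) -> List[List[int]]:
    """One-hot encode every item response and concatenate the blocks per subject.

    Output shape: (number of subjects) x (sum of category counts), i.e. a wide
    ("fat") binary matrix when there are many items.
    """
    rows = []
    for subject in responses:
        row: List[int] = []
        for answer, n_cat in zip(subject, num_categories):
            block = [0] * n_cat
            block[answer] = 1  # mark the chosen category for this item
            row.extend(block)
        rows.append(row)
    return rows


# Three subjects, two items with 2 and 3 categories (toy data).
responses = [[0, 2], [1, 0], [1, 1]]
num_categories = [2, 3]
fat = flatten_responses(responses, num_categories)
```

Each row contains exactly one nonzero entry per item, so the varying numbers of categories across items are absorbed into the column dimension.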
Semi-supervised Credit Card Fraud Detection via Attribute-Driven Graph Representation
Xiang, Sheng, Zhu, Mingzhi, Cheng, Dawei, Li, Enxia, Zhao, Ruihui, Ouyang, Yi, Chen, Ling, Zheng, Yefeng
Credit card fraud incurs a considerable cost for both cardholders and issuing banks. Contemporary methods apply machine learning-based classifiers to detect fraudulent behavior from labeled transaction records. However, labeled data usually make up only a small proportion of billions of real transactions because labeling is expensive, so these methods do not fully exploit the many natural features of unlabeled data. Therefore, we propose a semi-supervised graph neural network for fraud detection. Specifically, we leverage transaction records to construct a temporal transaction graph, which is composed of temporal transactions (nodes) and interactions (edges) among them. Then we pass messages among the nodes through a Gated Temporal Attention Network (GTAN) to learn the transaction representation. We further model the fraud patterns through risk propagation among transactions. Extensive experiments are conducted on a real-world transaction dataset and two publicly available fraud detection datasets. The results show that our proposed method, namely GTAN, outperforms other state-of-the-art baselines on all three fraud detection datasets. Semi-supervised experiments demonstrate the excellent fraud detection performance of our model with only a tiny proportion of labeled data.
MineAgent: Towards Remote-Sensing Mineral Exploration with Multimodal Large Language Models
Yu, Beibei, Shen, Tao, Na, Hongbin, Chen, Ling, Li, Denqi
Remote-sensing mineral exploration is critical for identifying economically viable mineral deposits, yet it poses significant challenges for multimodal large language models (MLLMs). These include limitations in domain-specific geological knowledge and difficulties in reasoning across multiple remote-sensing images, which further exacerbate long-context issues. To address these, we present MineAgent, a modular framework leveraging hierarchical judging and decision-making modules to improve multi-image reasoning and spatial-spectral integration. Complementing this, we propose MineBench, a benchmark designed specifically for evaluating MLLMs on domain-specific mineral exploration tasks using geological and hyperspectral data. Extensive experiments demonstrate the effectiveness of MineAgent, highlighting its potential to advance MLLMs in remote-sensing mineral exploration.
DECRL: A Deep Evolutionary Clustering Jointed Temporal Knowledge Graph Representation Learning Approach
Chen, Qian, Chen, Ling
Temporal Knowledge Graph (TKG) representation learning aims to map temporally evolving entities and relations to embedded representations in a continuous low-dimensional vector space. However, existing approaches cannot capture the temporal evolution of high-order correlations in TKGs. To this end, we propose a Deep Evolutionary Clustering jointed temporal knowledge graph Representation Learning approach (DECRL). Specifically, a deep evolutionary clustering module is proposed to capture the temporal evolution of high-order correlations among entities. Furthermore, a cluster-aware unsupervised alignment mechanism is introduced to ensure the precise one-to-one alignment of soft overlapping clusters across timestamps, thereby maintaining the temporal smoothness of clusters. In addition, an implicit correlation encoder is introduced to capture latent correlations between any pair of clusters under the guidance of a global graph. Extensive experiments on seven real-world datasets demonstrate that DECRL achieves state-of-the-art performance, outperforming the best baseline by an average of 9.53%, 12.98%, 10.42%, and 14.68% in MRR, Hits@1, Hits@3, and Hits@10, respectively.
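For readers unfamiliar with the metrics reported in the DECRL abstract above, MRR and Hits@k are standard link-prediction measures computed from the rank of the correct entity for each test query. The sketch below uses their usual definitions; the rank values are made-up toy numbers, not results from the paper, and the function names are illustrative.

```python
from typing import List


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1/rank over all queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: List[int], k: int) -> float:
    """Fraction of queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy ranks of the correct entity for four hypothetical test queries.
ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(ranks)   # (1 + 1/2 + 1/4 + 1/10) / 4
h1 = hits_at_k(ranks, 1)
h3 = hits_at_k(ranks, 3)
h10 = hits_at_k(ranks, 10)
```

Both metrics lie in [0, 1] and higher is better; MRR rewards placing the correct entity near the top, while Hits@k only checks whether it falls within the first k positions.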
Detecting Conversational Mental Manipulation with Intent-Aware Prompting
Ma, Jiayuan, Na, Hongbin, Wang, Zimu, Hua, Yining, Liu, Yue, Wang, Wei, Chen, Ling
Mental manipulation severely undermines mental wellness by covertly and negatively distorting decision-making. While there is increasing interest in mental health care within the natural language processing community, progress in tackling manipulation remains limited due to the complexity of detecting subtle, covert tactics in conversations. In this paper, we propose Intent-Aware Prompting (IAP), a novel approach for detecting mental manipulation using large language models (LLMs), providing a deeper understanding of manipulative tactics by capturing the underlying intents of participants. Experimental results on the MentalManip dataset demonstrate the superior effectiveness of IAP over other advanced prompting strategies. Notably, our approach substantially reduces false negatives, helping detect more instances of mental manipulation with minimal misjudgment of positive cases. The code of this paper is available at https://github.com/Anton-Jiayuan-MA/Manip-IAP.
Risk Management with Feature-Enriched Generative Adversarial Networks (FE-GAN)
Chen, Ling
This paper investigates the application of Feature-Enriched Generative Adversarial Networks (FE-GAN) in financial risk management, with a focus on improving the estimation of Value at Risk (VaR) and Expected Shortfall (ES). FE-GAN enhances existing GAN architectures by incorporating an additional input sequence derived from preceding data to improve model performance. Two specialized GAN models, the Wasserstein Generative Adversarial Network (WGAN) and the Tail Generative Adversarial Network (Tail-GAN), were evaluated under the FE-GAN framework. The results demonstrate that FE-GAN significantly outperforms traditional architectures in both VaR and ES estimation. Tail-GAN, leveraging its task-specific loss function, consistently outperforms WGAN in ES estimation, while both models exhibit similar performance in VaR estimation. Despite these promising results, the study acknowledges limitations, including reliance on highly correlated temporal data and restricted applicability to other domains. Future research directions include exploring alternative input generation methods, dynamic forecasting models, and advanced neural network architectures to further enhance GAN-based financial risk estimation.
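The two risk measures named in the FE-GAN abstract above can be illustrated with a simple empirical estimator. This is a minimal sketch under assumptions that are not from the paper: inputs are losses (larger is worse), VaR is taken as the empirical alpha-quantile of the sorted losses, and ES is the mean of the losses at or beyond that quantile. Other conventions (return-based, lower tail, interpolated quantiles) exist and give slightly different numbers. The function name and toy sample are hypothetical.

```python
import math
from typing import List, Tuple


def empirical_var_es(losses: List[float], alpha: float) -> Tuple[float, float]:
    """Return (VaR, ES) at confidence level alpha from a sample of losses."""
    ordered = sorted(losses)
    n = len(ordered)
    # Index of the empirical alpha-quantile, clamped to a valid position.
    idx = min(n - 1, max(0, math.ceil(alpha * n) - 1))
    var = ordered[idx]
    # Expected Shortfall: average loss in the tail at or beyond VaR.
    tail = [x for x in ordered if x >= var]
    es = sum(tail) / len(tail)
    return var, es


# Toy loss sample (hypothetical), e.g. drawn from simulated scenarios.
losses = [3.0, 1.0, 10.0, 2.0, 7.0, 5.0, 4.0, 9.0, 6.0, 8.0]
var_90, es_90 = empirical_var_es(losses, 0.9)
```

ES is never smaller than VaR under this convention because it averages only the losses in the tail, which is why it is often preferred when tail severity matters.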