Goto

Collaborating Authors

 personality detection


Ask, Answer, and Detect: Role-Playing LLMs for Personality Detection with Question-Conditioned Mixture-of-Experts

Lyu, Yifan, Zhang, Liang

arXiv.org Artificial Intelligence

Understanding human personality is crucial for web applications such as personalized recommendation and mental health assessment. Existing studies on personality detection predominantly adopt a "posts -> user vector -> labels" modeling paradigm, which encodes social media posts into user representations for predicting personality labels (e.g., MBTI labels). While recent advances in large language models (LLMs) have improved text encoding capacities, these approaches remain constrained by limited supervision signals due to label scarcity, and under-specified semantic mappings between user language and abstract psychological constructs. We address these challenges by proposing ROME, a novel framework that explicitly injects psychological knowledge into personality detection. Inspired by standardized self-assessment tests, ROME leverages LLMs' role-play capability to simulate user responses to validated psychometric questionnaires. These generated question-level answers transform free-form user posts into interpretable, questionnaire-grounded evidence linking linguistic cues to personality labels, thereby providing rich intermediate supervision to mitigate label scarcity while offering a semantic reasoning chain that guides and simplifies the text-to-personality mapping learning. A question-conditioned Mixture-of-Experts module then jointly routes over post and question representations, learning to answer questionnaire items under explicit supervision. The predicted answers are summarized into an interpretable answer vector and fused with the user representation for final prediction within a multi-task learning framework, where question answering serves as a powerful auxiliary task for personality detection. Extensive experiments on two real-world datasets demonstrate that ROME consistently outperforms state-of-the-art baselines, achieving improvements (15.41% on Kaggle dataset).


HIPPD: Brain-Inspired Hierarchical Information Processing for Personality Detection

Chen, Guanming, Shen, Lingzhi, Cai, Xiaohao, Razzak, Imran, Jameel, Shoaib

arXiv.org Artificial Intelligence

Personality detection from text aims to infer an individual's personality traits based on linguistic patterns. However, existing machine learning approaches often struggle to capture contextual information spanning multiple posts and tend to fall short in extracting representative and robust features in semantically sparse environments. This paper presents HIPPD, a brain-inspired framework for personality detection that emulates the hierarchical information processing of the human brain. HIPPD utilises a large language model to simulate the cerebral cortex, enabling global semantic reasoning and deep feature abstraction. A dynamic memory module, modelled after the prefrontal cortex, performs adaptive gating and selective retention of critical features, with all adjustments driven by dopaminergic prediction error feedback. Subsequently, a set of specialised lightweight models, emulating the basal ganglia, are dynamically routed via a strict winner-takes-all mechanism to capture the personality-related patterns they are most proficient at recognising. Extensive experiments on the Kaggle and Pandora datasets demonstrate that HIPPD consistently outperforms state-of-the-art baselines.


On the Interplay between Musical Preferences and Personality through the Lens of Language

Shem-Tov, Eliran, Rabinovich, Ella

arXiv.org Artificial Intelligence

Music serves as a powerful reflection of individual identity, often aligning with deeper psychological traits. Prior research has established correlations between musical preferences and personality, while separate studies have demonstrated that personality is detectable through linguistic analysis. Our study bridges these two research domains by investigating whether individuals' musical preferences leave traces in their spontaneous language through the lens of the Big Five personality traits (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism). Using a carefully curated dataset of over 500,000 text samples from nearly 5,000 authors with reliably identified musical preferences, we build advanced models to assess personality characteristics. Our results reveal significant personality differences across fans of five musical genres. We release resources for future research at the intersection of computational linguistics, music psychology and personality analysis.


EmoPerso: Enhancing Personality Detection with Self-Supervised Emotion-Aware Modelling

Shen, Lingzhi, Cai, Xiaohao, Long, Yunfei, Razzak, Imran, Chen, Guanming, Jameel, Shoaib

arXiv.org Artificial Intelligence

Personality detection from text is commonly performed by analysing users' social media posts. However, existing methods heavily rely on large-scale annotated datasets, making it challenging to obtain high-quality personality labels. Moreover, most studies treat emotion and personality as independent variables, overlooking their interactions. In this paper, we propose a novel self-supervised framework, EmoPerso, which improves personality detection through emotion-aware modelling. EmoPerso first leverages generative mechanisms for synthetic data augmentation and rich representation learning. It then extracts pseudo-labeled emotion features and jointly optimizes them with personality prediction via multi-task learning. A cross-attention module is employed to capture fine-grained interactions between personality traits and the inferred emotional representations. To further refine relational reasoning, EmoPerso adopts a self-taught strategy to enhance the model's reasoning capabilities iteratively. Extensive experiments on two benchmark datasets demonstrate that EmoPerso surpasses state-of-the-art models. The source code is available at https://github.com/slz0925/EmoPerso.


Less but Better: Parameter-Efficient Fine-Tuning of Large Language Models for Personality Detection

Shen, Lingzhi, Long, Yunfei, Cai, Xiaohao, Chen, Guanming, Razzak, Imran, Jameel, Shoaib

arXiv.org Artificial Intelligence

Shoaib Jameel University of Southampton Southampton, United Kingdom M.S.Jameel@southampton.ac.uk Abstract --Personality detection automatically identifies an individual's personality from various data sources, such as social media texts. However, as the parameter scale of language models continues to grow, the computational cost becomes increasingly difficult to manage. Fine-tuning also grows more complex, making it harder to justify the effort and reliably predict outcomes. We introduce a novel parameter-efficient fine-tuning framework, PersLLM, to address these challenges. By storing the features in the memory layer, we eliminate the need for repeated complex computations by the LLM. Meanwhile, the lightweight output network serves as a proxy for evaluating the overall effectiveness of the framework, improving the predictability of results. Experimental results on key benchmark datasets like Kaggle and Pandora show that PersLLM significantly reduces computational cost while maintaining competitive performance and strong adaptability. I NTRODUCTION Personality refers to the stable traits in an individual's emotions, thoughts, and behaviours that shape how they perceive, interpret, and interact with the world [1], [2]. It is a complex and multifaceted construct that combines various characteristics to form a person's unique identity.


Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits

Li, Bohan, Guan, Jiannan, Dou, Longxu, Feng, Yunlong, Wang, Dingzirui, Xu, Yang, Wang, Enbo, Chen, Qiguang, Wang, Bichen, Xu, Xiao, Zhang, Yimeng, Qin, Libo, Zhao, Yanyan, Zhu, Qingfu, Che, Wanxiang

arXiv.org Artificial Intelligence

The Myers-Briggs Type Indicator (MBTI) is one of the most influential personality theories reflecting individual differences in thinking, feeling, and behaving. MBTI personality detection has garnered considerable research interest and has evolved significantly over the years. However, this task tends to be overly optimistic, as it currently does not align well with the natural distribution of population personality traits. Specifically, (1) the self-reported labels in existing datasets result in incorrect labeling issues, and (2) the hard labels fail to capture the full range of population personality distributions. In this paper, we optimize the task by constructing MBTIBench, the first manually annotated high-quality MBTI personality detection dataset with soft labels, under the guidance of psychologists. As for the first challenge, MBTIBench effectively solves the incorrect labeling issues, which account for 29.58% of the data. As for the second challenge, we estimate soft labels by deriving the polarity tendency of samples. The obtained soft labels confirm that there are more people with non-extreme personality traits. Experimental results not only highlight the polarized predictions and biases in LLMs as key directions for future research, but also confirm that soft labels can provide more benefits to other psychological tasks than hard labels. The code and data are available at https://github.com/Personality-NLP/MbtiBench.


Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection

Zhu, Haohao, Zhang, Xiaokun, Lu, Junyu, Yang, Liang, Lin, Hongfei

arXiv.org Artificial Intelligence

Textual personality detection aims to identify personality traits by analyzing user-generated content. To achieve this effectively, it is essential to thoroughly examine user-generated content from various perspectives. However, previous studies have struggled with automatically extracting and effectively integrating information from multiple perspectives, thereby limiting their performance on personality detection. To address these challenges, we propose the Multi-view Mixture-of-Experts Model for Textual Personality Detection (MvP). MvP introduces a Multi-view Mixture-of-Experts (MoE) network to automatically analyze user posts from various perspectives. Additionally, it employs User Consistency Regularization to mitigate conflicts among different perspectives and learn a multi-view generic user representation. The model's training is optimized via a multi-task joint learning strategy that balances supervised personality detection with self-supervised user consistency constraints. Experimental results on two widely-used personality detection datasets demonstrate the effectiveness of the MvP model and the benefits of automatically analyzing user posts from diverse perspectives for textual personality detection.


EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection

Li, Zheng, Zhu, Dawei, Ma, Qilong, Xiong, Weimin, Li, Sujian

arXiv.org Artificial Intelligence

Personality is a fundamental construct in psychology, reflecting an individual's behavior, thinking, and emotional patterns. Previous researches have made some progress in personality detection, primarily by utilizing the whole text to predict personality. However, these studies generally tend to overlook psychological knowledge: they rarely apply the well-established correlations between emotion regulation and personality. Based on this, we propose a new personality detection method called EERPD. This method introduces the use of emotion regulation, a psychological concept highly correlated with personality, for personality prediction. By combining this feature with emotion features, it retrieves few-shot examples and provides process CoTs for inferring labels from text. This approach enhances the understanding of LLM for personality within text and improves the performance in personality detection. Experimental results demonstrate that EERPD significantly enhances the accuracy and robustness of personality detection, outperforming previous SOTA by 15.05/4.29 in average F1 on the two benchmark datasets.


Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives

Zhu, Haohao, Zhang, Xiaokun, Lu, Junyu, Wu, Youlin, Bai, Zewen, Min, Changrong, Yang, Liang, Xu, Bo, Zhang, Dongyu, Lin, Hongfei

arXiv.org Artificial Intelligence

Textual personality detection aims to identify personality characteristics by analyzing user-generated content toward social media platforms. Numerous psychological literature highlighted that personality encompasses both long-term stable traits and short-term dynamic states. However, existing studies often concentrate only on either long-term or short-term personality representations, without effectively combining both aspects. This limitation hinders a comprehensive understanding of individuals' personalities, as both stable traits and dynamic states are vital. To bridge this gap, we propose a Dual Enhanced Network(DEN) to jointly model users' long-term and short-term personality for textual personality detection. In DEN, a Long-term Personality Encoding is devised to effectively model long-term stable personality traits. Short-term Personality Encoding is presented to capture short-term dynamic personality states. The Bi-directional Interaction component facilitates the integration of both personality aspects, allowing for a comprehensive representation of the user's personality. Experimental results on two personality detection datasets demonstrate the effectiveness of the DEN model and the benefits of considering both the dynamic and stable nature of personality characteristics for textual personality detection.


LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model

Hu, Linmei, He, Hongyu, Wang, Duokang, Zhao, Ziwang, Shao, Yingxia, Nie, Liqiang

arXiv.org Artificial Intelligence

Personality detection aims to detect one's personality traits underlying in social media posts. One challenge of this task is the scarcity of ground-truth personality traits which are collected from self-report questionnaires. Most existing methods learn post features directly by fine-tuning the pre-trained language models under the supervision of limited personality labels. This leads to inferior quality of post features and consequently affects the performance. In addition, they treat personality traits as one-hot classification labels, overlooking the semantic information within them. In this paper, we propose a large language model (LLM) based text augmentation enhanced personality detection model, which distills the LLM's knowledge to enhance the small model for personality detection, even when the LLM fails in this task. Specifically, we enable LLM to generate post analyses (augmentations) from the aspects of semantic, sentiment, and linguistic, which are critical for personality detection. By using contrastive learning to pull them together in the embedding space, the post encoder can better capture the psycho-linguistic information within the post representations, thus improving personality detection. Furthermore, we utilize the LLM to enrich the information of personality labels for enhancing the detection performance. Experimental results on the benchmark datasets demonstrate that our model outperforms the state-of-the-art methods on personality detection.