Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering

Ong, Kenneth J. K.; Jun, Lye Jia; Nguyen, Hieu Minh "Jord"; Cho, Seong Hah; Antolín, Natalia Pérez-Campanero

Mar-16-2025, arXiv.org Artificial Intelligence

As Large Language Models (LLMs) gain autonomous capabilities, their coordination in multi-agent settings becomes increasingly important. However, they often struggle with cooperation, leading to suboptimal outcomes. Inspired by Axelrod's Iterated Prisoner's Dilemma (IPD) tournaments, we explore how personality traits influence LLM cooperation. Using representation engineering, we steer Big Five traits (e.g., Agreeableness, Conscientiousness) in LLMs and analyze their impact on IPD decision-making. Our results show that higher Agreeableness and Conscientiousness improve cooperation but increase susceptibility to exploitation, highlighting both the potential and limitations of personality-based steering for aligning AI agents.
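The trade-off described in the abstract (more cooperation, but more exposure to exploitation) can be illustrated with a minimal Iterated Prisoner's Dilemma sketch. This is not the authors' setup: the LLM agents are replaced by three fixed, hand-written strategies, and the payoff values are the conventional ones from Axelrod's tournaments (5, 3, 1, 0), which the listing itself does not state. The names `PAYOFF`, `play`, and the strategy functions are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Tuple

# Assumed payoffs (conventional Axelrod values): (my_move, their_move) -> my score.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, opponent defects
    ("D", "C"): 5,  # I defect, opponent cooperates
    ("D", "D"): 1,  # mutual defection
}

# A strategy maps (own history, opponent history) to the next move.
Strategy = Callable[[List[str], List[str]], str]


def always_cooperate(own: List[str], other: List[str]) -> str:
    """Stand-in for a maximally agreeable agent (an assumption)."""
    return "C"


def always_defect(own: List[str], other: List[str]) -> str:
    """Stand-in for an exploitative opponent."""
    return "D"


def tit_for_tat(own: List[str], other: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return other[-1] if other else "C"


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play a fixed number of rounds and return the cumulative scores."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(always_cooperate, always_cooperate))  # (30, 30)
    print(play(always_cooperate, always_defect))     # (0, 50)
    print(play(tit_for_tat, always_defect))          # (9, 14)
```

In this toy setting, two unconditional cooperators reach the best joint outcome, while the same cooperator scores nothing against a defector. A conditional strategy such as tit-for-tat limits that loss, which is the kind of exploitation risk the abstract attributes to highly agreeable agents.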

artificial intelligence, large language model, natural language

Country:
- Asia > Singapore (0.04)
- North America > Mexico
  - Mexico City > Mexico City (0.04)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning > Agents (1.00)
