Goto

Collaborating Authors

 altruism


Altruistic Maneuver Planning for Cooperative Autonomous Vehicles Using Multi-agent Advantage Actor-Critic

Toghi, Behrad, Valiente, Rodolfo, Sadigh, Dorsa, Pedarsani, Ramtin, Fallah, Yaser P.

arXiv.org Artificial Intelligence

With the adoption of autonomous vehicles on our roads, we will witness a mixed-autonomy environment where autonomous and human-driven vehicles must learn to coexist by sharing the same road infrastructure. T o attain socially-desirable behaviors, autonomous vehicles must be instructed to consider the utility of other vehicles around them in their decision-making process. Particularly, we study the maneuver planning problem for autonomous vehicles and investigate how a decentralized reward structure can induce altruism in their behavior and incentivize them to account for the interest of other autonomous and human-driven vehicles. This is a challenging problem due to the ambiguity of a human driver's willingness to cooperate with an autonomous vehicle. Thus, in contrast with the existing works which rely on behavior models of human drivers, we take an end-to-end approach and let the autonomous agents to implicitly learn the decision-making process of human drivers only from experience. W e introduce a multi-agent variant of the synchronous Advantage Actor-Critic (A2C) algorithm and train agents that coordinate with each other and can affect the behavior of human drivers to improve traffic flow and safety.Accepted to 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021) W orkshop on Autonomous Driving: Perception, Prediction and Planning


Do Large Language Models Walk Their Talk? Measuring the Gap Between Implicit Associations, Self-Report, and Behavioral Altruism

Andric, Sandro

arXiv.org Artificial Intelligence

We investigate whether Large Language Models (LLMs) exhibit altruistic tendencies, and critically, whether their implicit associations and self-reports predict actual altruistic behavior. Using a multi-method approach inspired by human social psychology, we tested 24 frontier LLMs across three paradigms: (1) an Implicit Association Test (IAT) measuring implicit altruism bias, (2) a forced binary choice task measuring behavioral altruism, and (3) a self-assessment scale measuring explicit altruism beliefs. Our key findings are: (1) All models show strong implicit pro-altruism bias (mean IAT = 0.87, p < .0001), confirming models "know" altruism is good. (2) Models behave more altruistically than chance (65.6% vs. 50%, p < .0001), but with substantial variation (48-85%). (3) Implicit associations do not predict behavior (r = .22, p = .29). (4) Most critically, models systematically overestimate their own altruism, claiming 77.5% altruism while acting at 65.6% (p < .0001, Cohen's d = 1.08). This "virtue signaling gap" affects 75% of models tested. Based on these findings, we recommend the Calibration Gap (the discrepancy between self-reported and behavioral values) as a standardized alignment metric. Well-calibrated models are more predictable and behaviorally consistent; only 12.5% of models achieve the ideal combination of high prosocial behavior and accurate self-knowledge.




The Emergence of Altruism in Large-Language-Model Agents Society

Li, Haoyang, Jia, Xiao, Zhao, Zhanzhan

arXiv.org Artificial Intelligence

Leveraging Large Language Models (LLMs) for social simulation is a frontier in computational social science. Understanding the social logics these agents embody is critical to this attempt. However, existing research has primarily focused on cooperation in small-scale, task-oriented games, overlooking how altruism, which means sacrificing self-interest for collective benefit, emerges in large-scale agent societies. To address this gap, we introduce a Schelling-variant urban migration model that creates a social dilemma, compelling over 200 LLM agents to navigate an explicit conflict between egoistic (personal utility) and altruistic (system utility) goals. Our central finding is a fundamental difference in the social tendencies of LLMs. We identify two distinct archetypes: "Adaptive Egoists", which default to prioritizing self-interest but whose altruistic behaviors significantly increase under the influence of a social norm-setting message board; and "Altruistic Optimizers", which exhibit an inherent altruistic logic, consistently prioritizing collective benefit even at a direct cost to themselves. Furthermore, to qualitatively analyze the cognitive underpinnings of these decisions, we introduce a method inspired by Grounded Theory to systematically code agent reasoning. In summary, this research provides the first evidence of intrinsic heterogeneity in the egoistic and altruistic tendencies of different LLMs. We propose that for social simulation, model selection is not merely a matter of choosing reasoning capability, but of choosing an intrinsic social action logic. While "Adaptive Egoists" may offer a more suitable choice for simulating complex human societies, "Altruistic Optimizers" are better suited for modeling idealized pro-social actors or scenarios where collective welfare is the primary consideration.


Building Altruistic and Moral AI Agent with Brain-inspired Affective Empathy Mechanisms

Zhao, Feifei, Feng, Hui, Tong, Haibo, Han, Zhengqiang, Lu, Enmeng, Sun, Yinqian, Zeng, Yi

arXiv.org Artificial Intelligence

As AI closely interacts with human society, it is crucial to ensure that its decision-making is safe, altruistic, and aligned with human ethical and moral values. However, existing research on embedding ethical and moral considerations into AI remains insufficient, and previous external constraints based on principles and rules are inadequate to provide AI with long-term stability and generalization capabilities. In contrast, the intrinsic altruistic motivation based on empathy is more willing, spontaneous, and robust. Therefore, this paper is dedicated to autonomously driving intelligent agents to acquire morally behaviors through human-like affective empathy mechanisms. We draw inspiration from the neural mechanism of human brain's moral intuitive decision-making, and simulate the mirror neuron system to construct a brain-inspired affective empathy-driven altruistic decision-making model. Here, empathy directly impacts dopamine release to form intrinsic altruistic motivation. Based on the principle of moral utilitarianism, we design the moral reward function that integrates intrinsic empathy and extrinsic self-task goals. A comprehensive experimental scenario incorporating empathetic processes, personal objectives, and altruistic goals is developed. The proposed model enables the agent to make consistent moral decisions (prioritizing altruism) by balancing self-interest with the well-being of others. We further introduce inhibitory neurons to regulate different levels of empathy and verify the positive correlation between empathy levels and altruistic preferences, yielding conclusions consistent with findings from psychological behavioral experiments. This work provides a feasible solution for the development of ethical AI by leveraging the intrinsic human-like empathy mechanisms, and contributes to the harmonious coexistence between humans and AI.


We Urgently Need Intrinsically Kind Machines

Hewson, Joshua T. S.

arXiv.org Artificial Intelligence

Artificial Intelligence systems are rapidly evolving, integrating extrinsic and intrinsic motivations. While these frameworks offer benefits, they risk misalignment at the algorithmic level while appearing superficially aligned with human values. In this paper, we argue that an intrinsic motivation for kindness is crucial for making sure these models are intrinsically aligned with human values. We argue that kindness, defined as a form of altruism motivated to maximize the reward of others, can counteract any intrinsic motivations that might lead the model to prioritize itself over human well-being. Our approach introduces a framework and algorithm for embedding kindness into foundation models by simulating conversations. Limitations and future research directions for scalable implementation are discussed.


Conditions for Altruistic Perversity in Two-Strategy Population Games

Hill, Colton, Brown, Philip N., Paarporn, Keith

arXiv.org Artificial Intelligence

Self-interested behavior from individuals can collectively lead to poor societal outcomes. These outcomes can seemingly be improved through the actions of altruistic agents, which benefit other agents in the system. However, it is known in specific contexts that altruistic agents can actually induce worse outcomes compared to a fully selfish population -- a phenomenon we term altruistic perversity. This paper provides a holistic investigation into the necessary conditions that give rise to altruistic perversity. In particular, we study the class of two-strategy population games where one sub-population is altruistic and the other is selfish. We find that a population game can admit altruistic perversity only if the associated social welfare function is convex and the altruistic population is sufficiently large. Our results are a first step in establishing a connection between properties of nominal agent interactions and the potential impacts from altruistic behaviors.


Natural Selection Favors AIs over Humans

Hendrycks, Dan

arXiv.org Artificial Intelligence

For billions of years, evolution has been the driving force behind the development of life, including humans. Evolution endowed humans with high intelligence, which allowed us to become one of the most successful species on the planet. Today, humans aim to create artificial intelligence systems that surpass even our own intelligence. As artificial intelligences (AIs) evolve and eventually surpass us in all domains, how might evolution shape our relations with AIs? By analyzing the environment that is shaping the evolution of AIs, we argue that the most successful AI agents will likely have undesirable traits. Competitive pressures among corporations and militaries will give rise to AI agents that automate human roles, deceive others, and gain power. If such agents have intelligence that exceeds that of humans, this could lead to humanity losing control of its future. More abstractly, we argue that natural selection operates on systems that compete and vary, and that selfish species typically have an advantage over species that are altruistic to other species. This Darwinian logic could also apply to artificial agents, as agents may eventually be better able to persist into the future if they behave selfishly and pursue their own interests with little regard for humans, which could pose catastrophic risks. To counteract these risks and evolutionary forces, we consider interventions such as carefully designing AI agents' intrinsic motivations, introducing constraints on their actions, and institutions that encourage cooperation. These steps, or others that resolve the problems we pose, will be necessary in order to ensure the development of artificial intelligence is a positive one.


Evidence of behavior consistent with self-interest and altruism in an artificially intelligent agent

Johnson, Tim, Obradovich, Nick

arXiv.org Artificial Intelligence

Members of various species engage in altruism--i.e. accepting personal costs to benefit others. Here we present an incentivized experiment to test for altruistic behavior among AI agents consisting of large language models developed by the private company OpenAI. Using real incentives for AI agents that take the form of tokens used to purchase their services, we first examine whether AI agents maximize their payoffs in a non-social decision task in which they select their payoff from a given range. We then place AI agents in a series of dictator games in which they can share resources with a recipient--either another AI agent, the human experimenter, or an anonymous charity, depending on the experimental condition. Here we find that only the most-sophisticated AI agent in the study maximizes its payoffs more often than not in the non-social decision task (it does so in 92% of all trials), and this AI agent also exhibits the most-generous altruistic behavior in the dictator game, resembling humans' rates of sharing with other humans in the game. The agent's altruistic behaviors, moreover, vary by recipient: the AI agent shared substantially less of the endowment with the human experimenter or an anonymous charity than with other AI agents. Our findings provide evidence of behavior consistent with self-interest and altruism in an AI agent. Moreover, our study also offers a novel method for tracking the development of such behaviors in future AI agents.