Neuroscientists Decipher Procrastination: A Brain Mechanism Explains Why People Leave Certain Tasks for Later

WIRED

New research has identified a neural circuit that may explain procrastination, and scientists were able to disrupt this connection using a drug. According to the study, the brain avoids tasks associated with unpleasant experiences even when they promise a clear reward, which may explain why you postpone household chores and spend your time browsing social media instead.





The Physical Basis of Prediction: World Model Formation in Neural Organoids via an LLM-Generated Curriculum

Hill, Brennen

arXiv.org Artificial Intelligence

The capacity of an embodied agent to understand, predict, and interact with its environment is fundamentally contingent on an internal world model. This paper introduces a novel framework for investigating the formation and adaptation of such world models within a biological substrate: human neural organoids. We present a curriculum of three scalable, closed-loop virtual environments designed to train these biological agents and probe the underlying synaptic mechanisms of learning, such as long-term potentiation (LTP) and long-term depression (LTD). We detail the design of three distinct task environments that demand progressively more sophisticated world models for successful decision-making: (1) a conditional avoidance task for learning static state-action contingencies, (2) a one-dimensional predator-prey scenario for goal-directed interaction, and (3) a replication of the classic Pong game for modeling dynamic, continuous-time systems. For each environment, we formalize the state and action spaces, the sensory encoding and motor decoding mechanisms, and the feedback protocols based on predictable (reward) and unpredictable (punishment) stimulation, which serve to drive model refinement. In a significant methodological advance, we propose a meta-learning approach where a Large Language Model automates the generative design and optimization of experimental protocols, thereby scaling the process of environment and curriculum design. Finally, we outline a multi-modal evaluation strategy that moves beyond task performance to directly measure the physical correlates of the learned world model by quantifying synaptic plasticity at electrophysiological, cellular, and molecular levels. This work bridges the gap between model-based reinforcement learning and computational neuroscience, offering a unique platform for studying embodiment, decision-making, and the physical basis of intelligence.
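As a sketch of what the simplest of these closed-loop environments might look like, the conditional avoidance task reduces to a static state-to-action contingency with two feedback modes: predictable stimulation as reward and unpredictable stimulation as punishment. The class and stimulus names below are illustrative assumptions, not the paper's implementation:

```python
import random

class ConditionalAvoidanceEnv:
    """Minimal sketch of a closed-loop avoidance task: the agent must emit
    the required action for each stimulus state; feedback is delivered as
    predictable (reward) or unpredictable (punishment) stimulation."""

    def __init__(self, contingencies):
        # Static state -> required-action mapping (hypothetical stimuli).
        self.contingencies = contingencies

    def step(self, state, action):
        if action == self.contingencies[state]:
            # Predictable stimulation: a fixed, regular feedback pattern.
            return "reward", [1.0] * 4
        # Unpredictable stimulation: a random noise pattern.
        return "punishment", [random.random() for _ in range(4)]

env = ConditionalAvoidanceEnv({"tone_A": 0, "tone_B": 1})
outcome, feedback = env.step("tone_A", 0)
print(outcome)  # reward
```

The asymmetry between the two feedback patterns is the point: only the reward branch is temporally predictable, which is the property the training protocol relies on to drive plasticity.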


"Don't Teach Minerva": Guiding LLMs Through Complex Syntax for Faithful Latin Translation with RAG

Aguilar, Sergio Torres

arXiv.org Artificial Intelligence

Translating a morphology-rich, low-resource language like Latin poses significant challenges. This paper introduces a reproducible draft-based refinement pipeline that elevates open-source Large Language Models (LLMs) to a performance level statistically comparable to top-tier proprietary systems. Our method first uses a fine-tuned NLLB-1.3B model to generate a high-quality, structurally faithful draft. A zero-shot LLM (Llama-3.3 or Qwen3) then polishes this draft, a process that can be further enhanced by augmenting the context with retrieved examples (retrieval-augmented generation, RAG). We demonstrate the robustness of this approach on two distinct benchmarks: a standard in-domain test set (Rosenthal, 2023) and a new, challenging out-of-domain (OOD) set of 12th-century Latin letters (2025). Our central finding is that this open-source RAG system achieves performance statistically comparable to the GPT-5 baseline, without any task-specific LLM fine-tuning. We release the pipeline, the Chartres OOD set, and evaluation scripts and models to facilitate replicability and further research.
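The shape of the pipeline is simple enough to sketch: draft with a fine-tuned NMT model, then hand the draft to a zero-shot LLM, optionally prepending retrieved examples. Everything below (function names, prompt wording, the stub models in the usage) is a hypothetical illustration of that flow, not the released code:

```python
def translate_with_refinement(latin_text, draft_model, llm, retriever=None):
    """Sketch of draft-based refinement: a fine-tuned NMT model (e.g.
    NLLB-1.3B) produces a structurally faithful draft, which a zero-shot
    LLM polishes; retrieved examples may be added to the prompt (RAG)."""
    draft = draft_model(latin_text)
    examples = retriever(latin_text) if retriever else []
    prompt = "\n".join(
        [f"Example: {src} => {tgt}" for src, tgt in examples]
        + [f"Latin: {latin_text}",
           f"Draft translation: {draft}",
           "Refine the draft into fluent, faithful English:"]
    )
    return llm(prompt)

# Illustrative stubs standing in for the real models:
out = translate_with_refinement(
    "Gallia est omnis divisa in partes tres",
    draft_model=lambda s: "All Gaul is divided into three parts",
    llm=lambda p: p.split("Draft translation: ")[1].split("\n")[0],
)
print(out)  # All Gaul is divided into three parts
```

The division of labour is the design choice worth noting: the NMT draft pins down structure and morphology, so the LLM's job narrows to fluency rather than full translation.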


Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research

Sanaei, Ali, Rajabzadeh, Ali

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly utilized by researchers across a wide range of domains, and qualitative social science is no exception; however, this adoption faces persistent challenges, including interpretive bias, low reliability, and weak auditability. We introduce a framework that situates LLM usage along two dimensions, interpretive depth and autonomy, thereby offering a straightforward way to classify LLM applications in qualitative research and to derive practical design recommendations. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science that use LLMs as a tool and not strictly as the subject of study. Rather than granting models expansive freedom, our approach encourages researchers to decompose tasks into manageable segments, much as they would when delegating work to capable undergraduate research assistants. By maintaining low levels of autonomy and selectively increasing interpretive depth only where warranted and under supervision, one can plausibly reap the benefits of LLMs while preserving transparency and reliability.
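The two-axis framework lends itself to a small data structure: any LLM application in a qualitative project gets a position on interpretive depth and on autonomy, and the paper's design recommendation (keep autonomy low, raise depth only under supervision) becomes a simple predicate. The names and levels below are an illustrative rendering, not the authors' formalization:

```python
from dataclasses import dataclass
from enum import Enum

class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class LLMUsage:
    """Places one LLM application on the framework's two dimensions
    (illustrative encoding of the paper's classification scheme)."""
    task: str
    interpretive_depth: Level
    autonomy: Level

    def needs_supervision(self):
        # The paper's recommendation: low autonomy by default, with
        # interpretive depth increased only under researcher supervision.
        return (self.interpretive_depth is Level.HIGH
                or self.autonomy is not Level.LOW)

coding = LLMUsage("deductive code assignment", Level.LOW, Level.LOW)
print(coding.needs_supervision())  # False
```

A task decomposed into small, low-autonomy segments, as the paper suggests, would populate this space near the origin, with supervision flags raised only for the few high-depth segments.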


Evolution of Cooperation in LLM-Agent Societies: A Preliminary Study Using Different Punishment Strategies

Warnakulasuriya, Kavindu, Dissanayake, Prabhash, De Silva, Navindu, Cranefield, Stephen, Savarimuthu, Bastin Tony Roy, Ranathunga, Surangika, de Silva, Nisansa

arXiv.org Artificial Intelligence

The evolution of cooperation has been extensively studied using abstract mathematical models and simulations. Recent advances in Large Language Models (LLMs) and the rise of LLM agents have demonstrated their ability to perform social reasoning, providing an opportunity to test the emergence of norms in more realistic agent-based simulations with human-like reasoning expressed in natural language. In this research, we investigate whether the cooperation dynamics of Boyd and Richerson's abstract mathematical model persist in a more realistic simulation of the Diner's Dilemma using LLM agents. Our findings indicate that agents follow the strategies defined in the Boyd and Richerson model, and that explicit punishment mechanisms drive norm emergence, reinforcing cooperative behaviour even when the agents' strategy configuration varies. Our results suggest that LLM-based Multi-Agent System (MAS) simulations can, in fact, replicate the evolution of cooperation predicted by traditional mathematical models. Moreover, our simulations extend beyond those models by integrating natural-language-driven reasoning and a pairwise imitation method for strategy adoption, making them a more realistic testbed for cooperative behaviour in MASs.
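The pairwise imitation step mentioned in the abstract can be sketched in a few lines. The abstract does not specify the adoption rule, so the sketch below assumes a common instantiation (a Fermi-style probability increasing with the payoff difference); the function and parameter names are illustrative:

```python
import math
import random

def pairwise_imitation(agents, payoffs, beta=1.0, rng=random):
    """Sketch of pairwise imitation: a focal agent compares payoffs with a
    random peer and adopts the peer's strategy with probability given by a
    Fermi rule (an assumed rule; the paper's exact rule may differ)."""
    focal, peer = rng.sample(range(len(agents)), 2)
    diff = payoffs[peer] - payoffs[focal]
    p_adopt = 1.0 / (1.0 + math.exp(-beta * diff))
    if rng.random() < p_adopt:
        agents[focal] = agents[peer]  # strategy adoption
    return agents

pop = ["punisher", "defector", "cooperator", "punisher"]
pop = pairwise_imitation(pop, payoffs=[3.0, 1.0, 2.0, 3.0], rng=random.Random(0))
print(pop)
```

In the LLM-agent setting, the same comparison would be mediated by natural-language reasoning rather than a closed-form probability, which is precisely what makes the simulation a harder, more realistic test of the mathematical prediction.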


Could a self-monitoring system for criminals replace prisons one day?

New Scientist

Could a self-monitoring system for criminals replace prisons one day? Future Chronicles is our regular speculative look at inventions yet to come. In this latest installment, we journey to 2050, when technology has been developed so that criminals can be monitored at home. "It's no surprise that the first countries to abolish prisons were Scandinavian." In the 2020s, the US was spending an eye-watering $182 billion a year on locking up its citizens. No other country imprisoned as many people or spent as much in doing so.


Reinforcement learning for spin torque oscillator tasks

Mojsiejuk, Jakub, Ziętek, Sławomir, Skowroński, Witold

arXiv.org Artificial Intelligence

We address the problem of automatic synchronisation of a spintronic oscillator (STO) by means of reinforcement learning (RL). A numerical solution of the macrospin Landau-Lifshitz-Gilbert-Slonczewski equation is used to simulate the STO, and we train two types of RL agents to synchronise with a target frequency within a fixed number of steps. We explore modifications to this base task and show improvements in both the convergence and the energy efficiency of the synchronisation that can easily be achieved in the simulated environment.
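The structure of the task can be sketched without the physics: an agent adjusts a control parameter (such as drive current) and is rewarded for closing the gap between the oscillator's frequency and the target. The toy hill-climbing agent and the linear frequency map below are illustrative stand-ins for the RL agents and the macrospin LLGS simulation, not the paper's setup:

```python
import random

def synchronise(freq_of, target, steps=200, rng=random.Random(0)):
    """Toy sketch of the synchronisation task: adjust a control parameter
    so the oscillator frequency matches a target within a fixed number of
    steps. `freq_of` stands in for the LLGS simulation, which maps the
    control parameter (e.g. current) to an oscillation frequency."""
    current = 0.0
    for _ in range(steps):
        reward = -abs(freq_of(current) - target)  # negative frequency error
        trial = current + rng.gauss(0.0, 0.5)     # perturb the control
        if -abs(freq_of(trial) - target) > reward:
            current = trial                        # keep improving moves
    return current

# Stand-in frequency map, monotone in the control parameter:
best = synchronise(lambda i: 2.0 * i + 1.0, target=5.0)
print(round(2.0 * best + 1.0, 2))  # reached frequency, near the target 5.0
```

A real RL agent would learn a policy over many episodes rather than hill-climb within one, and the energy-efficiency variant mentioned in the abstract would add a penalty on the control magnitude to the reward.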