How LLMs are Shaping the Future of Virtual Reality

Özkaya, Süeda, Berrezueta-Guzman, Santiago, Wagner, Stefan

arXiv.org Artificial Intelligence

The integration of Large Language Models (LLMs) into Virtual Reality (VR) games marks a paradigm shift in the design of immersive, adaptive, and intelligent digital experiences. This paper presents a comprehensive review of recent research at the intersection of LLMs and VR, examining how these models are transforming narrative generation, non-player character (NPC) interactions, accessibility, personalization, and game mastering. Drawing from an analysis of 62 peer-reviewed studies published between 2018 and 2025, we identify key application domains ranging from emotionally intelligent NPCs and procedurally generated storytelling to AI-driven adaptive systems and inclusive gameplay interfaces. We also address the major challenges facing this convergence, including real-time performance constraints, memory limitations, ethical risks, and scalability barriers. Our findings highlight that while LLMs significantly enhance realism, creativity, and user engagement in VR environments, their effective deployment requires robust design strategies that integrate multimodal interaction, hybrid AI architectures, and ethical safeguards. The paper concludes by outlining future research directions in multimodal AI, affective computing, reinforcement learning, and open-source development, aiming to guide the responsible advancement of intelligent and inclusive VR systems.



A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity

Neural Information Processing Systems

Deep learning (DL) algorithms rely on massive amounts of labeled data. Semi-supervised learning (SSL) and active learning (AL) aim to reduce this label complexity by leveraging unlabeled data or carefully acquiring labels, respectively.




Fixed-Persona SLMs with Modular Memory: Scalable NPC Dialogue on Consumer Hardware

Braas, Martin, Esterle, Lukas

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text, yet their applicability to dialogue systems in computer games remains limited. This limitation arises from their substantial hardware requirements, latency constraints, and the necessity to maintain clearly defined knowledge boundaries within a game setting. In this paper, we propose a modular NPC dialogue system that leverages Small Language Models (SLMs), fine-tuned to encode specific NPC personas and integrated with runtime-swappable memory modules. These memory modules preserve character-specific conversational context and world knowledge, enabling expressive interactions and long-term memory without retraining or model reloading during gameplay. We comprehensively evaluate our system using three open-source SLMs: DistilGPT-2, TinyLlama-1.1B-Chat, and Mistral-7B-Instruct, trained on synthetic persona-aligned data and benchmarked on consumer-grade hardware. While our approach is motivated by applications in gaming, its modular design and persona-driven memory architecture hold significant potential for broader adoption in domains requiring expressive, scalable, and memory-rich conversational agents, such as virtual assistants, customer support bots, or interactive educational systems.
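The core idea of the abstract above, one fixed-persona model paired with runtime-swappable, character-specific memory, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation; the class names, prompt layout, and the `slm` callable standing in for a fine-tuned SLM are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryModule:
    """Character-specific conversational context and world knowledge.

    Hypothetical structure illustrating a memory module that can be
    swapped at runtime without reloading the underlying model."""
    npc_id: str
    world_facts: list = field(default_factory=list)
    history: list = field(default_factory=list)

class NPCDialogueEngine:
    """Sketch: one fixed persona, many swappable memory modules."""

    def __init__(self, persona_prompt: str):
        # In the paper the persona is encoded by fine-tuning; here it is
        # approximated as a system-prompt prefix.
        self.persona_prompt = persona_prompt
        self.memory = None

    def swap_memory(self, module: MemoryModule):
        # Swapping memory is cheap; the model weights stay loaded.
        self.memory = module

    def build_prompt(self, player_utterance: str) -> str:
        facts = "\n".join(self.memory.world_facts)
        history = "\n".join(self.memory.history[-4:])  # bounded context window
        return (f"{self.persona_prompt}\nFacts:\n{facts}\n"
                f"Dialogue:\n{history}\nPlayer: {player_utterance}\nNPC:")

    def respond(self, player_utterance: str, slm=lambda p: "...") -> str:
        # `slm` stands in for the fine-tuned small language model.
        reply = slm(self.build_prompt(player_utterance))
        self.memory.history += [f"Player: {player_utterance}", f"NPC: {reply}"]
        return reply
```

Because only the `MemoryModule` changes between characters, long-term memory persists per NPC without retraining or model reloading, which is the property the paper benchmarks on consumer hardware.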


Symbolically Scaffolded Play: Designing Role-Sensitive Prompts for Generative NPC Dialogue

Figueiredo, Vanessa, Elumeze, David

arXiv.org Artificial Intelligence

Large Language Models (LLMs) promise to transform interactive games by enabling non-player characters (NPCs) to sustain unscripted dialogue. Yet it remains unclear whether constrained prompts actually improve player experience. We investigate this question through The Interview, a voice-based detective game powered by GPT-4o. A within-subjects usability study (N = 10) compared high-constraint (HCP) and low-constraint (LCP) prompts, revealing no reliable experiential differences beyond sensitivity to technical breakdowns. Guided by these findings, we redesigned the HCP into a hybrid JSON+RAG scaffold and conducted a synthetic evaluation with an LLM judge, positioned as an early-stage complement to usability testing. Results uncovered a novel pattern: scaffolding effects were role-dependent. The Interviewer (quest-giver NPC) gained stability, while suspect NPCs lost improvisational believability. These findings overturn the assumption that tighter constraints inherently enhance play. Extending fuzzy-symbolic scaffolding, we introduce Symbolically Scaffolded Play, a framework in which symbolic structures are expressed as fuzzy, numerical boundaries that stabilize coherence where needed while preserving improvisation where surprise sustains engagement.
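A role-sensitive scaffold of the kind described, symbolic constraints expressed as fuzzy numeric boundaries and combined with retrieved context, might look like the sketch below. The scaffold contents, constraint values, and field names are hypothetical illustrations, not the paper's actual prompts.

```python
import json

# Hypothetical role-sensitive scaffolds: `constraint` is a fuzzy boundary
# (0 = free improvisation, 1 = fully constrained). A quest-giver gets a
# tight scaffold; suspects keep room to improvise.
SCAFFOLDS = {
    "interviewer": {"constraint": 0.9, "must_cover": ["motive", "alibi"]},
    "suspect":     {"constraint": 0.3, "must_cover": []},
}

def build_role_prompt(role: str, retrieved_facts: list) -> str:
    """Combine a JSON scaffold with RAG-retrieved facts into one prompt."""
    scaffold = SCAFFOLDS[role]
    return (
        f"ROLE CONSTRAINTS (JSON): {json.dumps(scaffold)}\n"
        "RETRIEVED CONTEXT:\n" + "\n".join(retrieved_facts) + "\n"
        "Adhere to the constraint level: high values mean stick to the "
        "scaffold; low values permit in-character improvisation."
    )
```

The role-dependent result in the abstract corresponds here to choosing different `constraint` values per role rather than one global setting.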


Aligning Large Language Models with Procedural Rules: An Autoregressive State-Tracking Prompting for In-Game Trading

Kim, Minkyung, Kim, Junsik, Yang, Woongcheol, Park, Sangdon, Bae, Sohee

arXiv.org Artificial Intelligence

Large Language Models (LLMs) enable dynamic game interactions but fail to follow essential procedural flows in rule-governed trading systems, eroding player trust. This work resolves the core tension between the creative flexibility of LLMs and the procedural demands of in-game trading (browse-offer-review-confirm). To this end, Autoregressive State-Tracking Prompting (ASTP) is introduced, a methodology centered on a strategically orchestrated prompt that compels an LLM to make its state-tracking process explicit and verifiable. Instead of relying on implicit contextual understanding, ASTP tasks the LLM with identifying and reporting a predefined state label from the previous turn. To ensure transactional integrity, this is complemented by a state-specific placeholder post-processing method for accurate price calculations. Evaluation across 300 trading dialogues demonstrates >99% state compliance and 99.3% calculation precision. Notably, ASTP with placeholder post-processing on smaller models (Gemini-2.5-Flash) matches larger models' (Gemini-2.5-Pro) performance while reducing response time from 21.2s to 2.4s, establishing a practical foundation that satisfies both real-time requirements and resource constraints of commercial games.
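The two mechanisms the abstract names, an explicit, verifiable state label reported each turn and state-specific placeholder post-processing for prices, can be sketched as below. The label format, transition table, and `{TOTAL}` placeholder are illustrative assumptions, not the paper's exact protocol.

```python
# Procedural trading flow: browse -> offer -> review -> confirm.
TRANSITIONS = {
    "browse": {"offer"},
    "offer": {"review"},
    "review": {"confirm", "offer"},  # player may renegotiate
    "confirm": set(),
}

PRICES = {"iron sword": 120, "health potion": 30}

def parse_reply(reply: str):
    """Extract the explicit state label the model is required to report,
    e.g. '[STATE: offer] That will be ...'."""
    label = reply.split("]")[0].removeprefix("[STATE: ")
    text = reply.split("]", 1)[1].strip()
    return label, text

def validate_transition(prev: str, new: str) -> bool:
    # Verifiable state tracking: the reported label must be a legal move.
    return new == prev or new in TRANSITIONS[prev]

def fill_placeholders(text: str, item: str, qty: int) -> str:
    # Placeholder post-processing: the LLM writes `{TOTAL}` and the game
    # computes the price, avoiding LLM arithmetic errors.
    return text.replace("{TOTAL}", str(PRICES[item] * qty))
```

Keeping arithmetic out of the model is what lets a smaller, faster model match a larger one on calculation precision, as the abstract reports.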


Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs

Buakhaw, Pasin, Kerdthaisong, Kun, Phenhiran, Phuree, Khlaisamniang, Pitikorn, Vorathammathorn, Supasate, Ittichaiwong, Piyalitt, Yongsatianchot, Nutchanon

arXiv.org Artificial Intelligence

The emergence of large language models (LLMs) has opened new opportunities for creating dynamic non-player characters (NPCs) in gaming environments, enabling both functional task execution and persona-consistent dialogue generation. In this paper, we (Tu_Character_lab) report our participation in the Commonsense Persona-Grounded Dialogue Challenge (CPDC) 2025 Round 2, which evaluates agents across three tracks: task-oriented dialogue, context-aware dialogue, and their integration. Our approach combines two complementary strategies: (i) lightweight prompting techniques in the API track, including a Deflanderization prompting method to suppress excessive role-play and improve task fidelity, and (ii) fine-tuned large models in the GPU track, leveraging Qwen3-14B with supervised fine-tuning (SFT) and Low-Rank Adaptation (LoRA). Our best submissions ranked 2nd on Task 1, 2nd on Task 3 (API track), and 4th on Task 3 (GPU track).
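The gist of a "Deflanderization" prompt, keeping the persona but explicitly capping role-play so task execution stays reliable, could be sketched as follows. The wording and priority ordering are hypothetical; the paper's actual prompt is not reproduced here.

```python
def deflanderized_prompt(persona: str, task: str) -> str:
    """Illustrative sketch: retain the persona but subordinate it to the
    task, suppressing the excessive in-character flourishes that hurt
    task fidelity."""
    return (
        f"You are {persona}. Keep your persona subtle: at most one "
        "in-character flourish per reply.\n"
        "Priority order: (1) complete the task, (2) stay factually "
        "consistent, (3) express persona.\n"
        f"Task: {task}"
    )
```

The prompt trades persona expressiveness for task fidelity, which matches the tension between character authenticity and task execution named in the title.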


Combining Reinforcement Learning and Behavior Trees for NPCs in Video Games with AMD Schola

Liu, Tian, Cann, Alex, Colbert, Ian, Saeedi, Mehdi

arXiv.org Artificial Intelligence

For example, a recent study [1] concludes that NPCs based on behavior trees (BTs) are still more viable than those based on machine learning (ML), calling for new approaches, strategies, and tooling to overcome the barrier to adoption. Additional work has also underscored the need for reusable and adjustable models [2], motivated by game developers' preferences to reuse previously developed assets, provided that reuse does not result in repetitive gameplay. Traditional BT approaches and modern RL techniques each have their respective strengths and limitations in video game development. BTs offer a structured and hierarchical method for managing NPC behaviors, enabling the design of complex systems with predictable outcomes given sufficient development time. However, this complexity can make multi-task BTs less engaging and cumbersome to develop [2]. Conversely, RL provides a dynamic and adaptive approach to decision making [3], allowing developers to guide an agent through trial and error. However, training generally capable RL models remains a challenge, particularly due to reward shaping, negative task transfer [4, 5], and compute resource demands [6].
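The hybrid the title describes, a behavior tree that delegates selected subtasks to a learned policy, can be sketched minimally as below. This is a generic illustration of the pattern, not the AMD Schola API; `policy` stands in for a trained RL model (e.g. one trained with Schola), here reduced to any callable mapping state to action.

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 0
    FAILURE = 1
    RUNNING = 2

class Selector:
    """Classic BT selector: tick children in order until one does not fail."""
    def __init__(self, *children):
        self.children = children

    def tick(self, state: dict) -> Status:
        for child in self.children:
            status = child.tick(state)
            if status != Status.FAILURE:
                return status
        return Status.FAILURE

class RLPolicyNode:
    """Leaf that delegates a subtask (e.g. combat) to a learned policy."""
    def __init__(self, policy, applies):
        self.policy = policy      # state -> action, stand-in for an RL model
        self.applies = applies    # precondition: does this subtask apply?

    def tick(self, state: dict) -> Status:
        if not self.applies(state):
            return Status.FAILURE  # fall through to scripted behavior
        state["action"] = self.policy(state)
        return Status.RUNNING

class ScriptedPatrol:
    """Predictable hand-authored fallback, the BT strength noted above."""
    def tick(self, state: dict) -> Status:
        state["action"] = "patrol"
        return Status.RUNNING
```

A tree such as `Selector(RLPolicyNode(policy, applies=...), ScriptedPatrol())` keeps the BT's predictable structure at the top level while confining the learned, adaptive behavior to a well-scoped leaf.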