

LLMs as Policy-Agnostic Teammates: A Case Study in Human Proxy Design for Heterogeneous Agent Teams

Justus, Aju Ani, Baber, Chris

arXiv.org Artificial Intelligence

A critical challenge in modelling Heterogeneous-Agent Teams is training agents to collaborate with teammates whose policies are inaccessible or non-stationary, such as humans. Traditional approaches rely on expensive human-in-the-loop data, which limits scalability. We propose using Large Language Models (LLMs) as policy-agnostic human proxies to generate synthetic data that mimics human decision-making. To evaluate this, we conduct three experiments in a grid-world capture game inspired by Stag Hunt, a game theory paradigm that balances risk and reward. In Experiment 1, we compare decisions from 30 human participants and 2 expert judges with outputs from LLaMA 3.1 and Mixtral 8x22B models. LLMs, prompted with game-state observations and reward structures, align more closely with experts than participants, demonstrating consistency in applying underlying decision criteria. Experiment 2 modifies prompts to induce risk-sensitive strategies (e.g. "be risk averse"). LLM outputs mirror human participants' variability, shifting between risk-averse and risk-seeking behaviours. Finally, Experiment 3 tests LLMs in a dynamic grid-world where the LLM agents generate movement actions. LLMs produce trajectories resembling human participants' paths. While LLMs cannot yet fully replicate human adaptability, their prompt-guided diversity offers a scalable foundation for simulating policy-agnostic teammates.
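The abstract describes prompting LLMs with game-state observations, reward structures, and risk hints, then reading back an action. A minimal sketch of that protocol is below; the prompt wording, state fields, and the `build_prompt`/`parse_action` helpers are illustrative assumptions, not the authors' exact setup.

```python
# Hypothetical sketch of using an LLM as a policy-agnostic human proxy
# in a Stag Hunt grid world. Prompt wording and field names are assumptions.

def build_prompt(state, risk_hint=None):
    """Render a game-state observation and reward structure as an LLM prompt."""
    lines = [
        "You are a player in a grid-world capture game based on Stag Hunt.",
        f"Your position: {state['agent']}",
        f"Teammate position: {state['teammate']}",
        f"Stag position: {state['stag']} (reward {state['stag_reward']}, needs both players)",
        f"Hare position: {state['hare']} (reward {state['hare_reward']}, capture alone)",
    ]
    if risk_hint:  # Experiment 2: induce risk-sensitive strategies via the prompt
        lines.append(f"Strategy hint: {risk_hint}")
    lines.append("Reply with exactly one move: UP, DOWN, LEFT, or RIGHT.")
    return "\n".join(lines)

def parse_action(reply):
    """Extract the first legal move token from a free-text LLM reply."""
    for token in reply.upper().split():
        word = token.strip(".,!")
        if word in {"UP", "DOWN", "LEFT", "RIGHT"}:
            return word
    return None

state = {"agent": (0, 0), "teammate": (4, 4), "stag": (2, 2),
         "hare": (0, 1), "stag_reward": 10, "hare_reward": 2}
prompt = build_prompt(state, risk_hint="be risk averse")
action = parse_action("I will play it safe and go RIGHT.")
```

In this shape, swapping the `risk_hint` string is all it takes to move the proxy between risk-averse and risk-seeking behaviour, which is the knob Experiment 2 turns.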


Quantizing Text-attributed Graphs for Semantic-Structural Integration

Bo, Jianyuan, Wu, Hao, Fang, Yuan

arXiv.org Artificial Intelligence

Text-attributed graphs (TAGs) have emerged as a powerful representation for modeling complex relationships across diverse domains. With the rise of large language models (LLMs), there is growing interest in leveraging their capabilities for graph learning. However, current approaches face significant challenges in embedding structural information into LLM-compatible formats, requiring either computationally expensive alignment mechanisms or manual graph verbalization techniques that often lose critical structural details. Moreover, these methods typically require labeled data from source domains for effective transfer learning, significantly constraining their adaptability. We propose STAG, a novel self-supervised framework that directly quantizes graph structural information into discrete tokens using a frozen codebook. Unlike traditional quantization approaches, our method employs soft assignment and KL divergence guided quantization to address the unique challenges of graph data, which lacks natural tokenization structures. Our framework enables both LLM-based and traditional learning approaches, supporting true zero-shot transfer learning without requiring labeled data even in the source domain. Extensive experiments demonstrate state-of-the-art performance across multiple node classification benchmarks while maintaining compatibility with different LLM architectures, offering an elegant solution to bridging graph learning with LLMs.
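The core mechanism named here is soft assignment of embeddings to a frozen codebook, with a KL term guiding codebook usage. A rough NumPy sketch of those two pieces follows; the temperature parameter, uniform-prior KL target, and function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def soft_quantize(z, codebook, tau=1.0):
    """Softly assign each embedding to entries of a frozen codebook."""
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K) sq. distances
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                           # soft assignments
    tokens = p @ codebook                                       # quantized embeddings
    return tokens, p

def kl_to_uniform(p):
    """KL(p || uniform) averaged over nodes -- one possible usage regularizer."""
    K = p.shape[1]
    return float((p * np.log(p * K + 1e-12)).sum(axis=1).mean())

rng = np.random.default_rng(0)
z = rng.standard_normal((5, 8))          # 5 node embeddings, dim 8
codebook = rng.standard_normal((16, 8))  # frozen codebook with 16 entries
tokens, p = soft_quantize(z, codebook)
```

Soft (rather than hard, argmax) assignment keeps the operation differentiable, which is what lets gradients flow through the quantization step during self-supervised training.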


Facets in Argumentation: A Formal Approach to Argument Significance

Fichte, Johannes, Fröhlich, Nicolas, Hecher, Markus, Lagerkvist, Victor, Mahmood, Yasir, Meier, Arne, Persson, Jonathan

arXiv.org Artificial Intelligence

Argumentation is a central subarea of Artificial Intelligence (AI) for modeling and reasoning about arguments. The semantics of abstract argumentation frameworks (AFs) is given by sets of arguments (extensions) and conditions on the relationships between them, such as stable or admissible. Today's solvers implement tasks such as finding extensions, deciding credulous or skeptical acceptance, and counting or enumerating extensions. While these tasks are well charted, the area between decision and counting/enumeration, where fine-grained reasoning about individual arguments is possible, has so far required expensive computation. We introduce a novel concept (facets) for reasoning between decision and enumeration. Facets are arguments that belong to some extensions (credulously accepted) but not to all extensions (not skeptically accepted). They are most natural when a user aims to navigate, filter, or comprehend the significance of specific arguments according to their needs. We study the complexity of facet reasoning and show that tasks involving facets are much easier than counting extensions. Finally, we provide an implementation and conduct experiments to demonstrate feasibility.
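Given the extensions of an AF, the facet definition above is a simple set computation: credulously accepted arguments minus skeptically accepted ones. A small sketch (the function name and input format are my own, assuming extensions are already enumerated, which the paper's complexity results are precisely about avoiding):

```python
def facets(extensions):
    """Arguments in some but not all extensions: credulous minus skeptical."""
    extensions = [set(e) for e in extensions]
    if not extensions:
        return set()
    credulous = set().union(*extensions)          # in at least one extension
    skeptical = set.intersection(*extensions)     # in every extension
    return credulous - skeptical

# Two stable extensions: "a" is skeptically accepted, so only "b" and "c" are facets.
exts = [{"a", "b"}, {"a", "c"}]
result = facets(exts)  # -> {"b", "c"}
```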


Seeing World Dynamics in a Nutshell

Shen, Qiuhong, Yi, Xuanyu, Lin, Mingbao, Zhang, Hanwang, Yan, Shuicheng, Wang, Xinchao

arXiv.org Artificial Intelligence

We consider the problem of efficiently representing casually captured monocular videos in a spatially- and temporally-coherent manner. While existing approaches predominantly rely on 2D/2.5D techniques that treat videos as collections of spatiotemporal pixels, they struggle with complex motions, occlusions, and geometric consistency due to the absence of temporal coherence and explicit 3D structure. Drawing inspiration from the fact that a monocular video is a projection of the dynamic 3D world, we explore representing videos in their intrinsic 3D form through continuous flows of Gaussian primitives in space-time. In this paper, we propose NutWorld, a novel framework that efficiently transforms monocular videos into dynamic 3D Gaussian representations in a single forward pass. At its core, NutWorld introduces a structured spatial-temporal aligned Gaussian (STAG) representation, enabling optimization-free scene modeling with effective depth and flow regularization. Through comprehensive experiments, we demonstrate that NutWorld achieves high-fidelity video reconstruction quality while enabling various downstream applications in real time. Demos and code will be available at https://github.com/Nut-World/NutWorld.


Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning

Chatterji, Satchit, Acar, Erman

arXiv.org Artificial Intelligence

An important challenge for enabling the deployment of reinforcement learning (RL) algorithms in the real world is safety. This has resulted in the recent research field of Safe RL, which aims to learn optimal policies that are safe. One successful approach in that direction is probabilistic logic shields (PLS), a model-based Safe RL technique that uses formal specifications based on probabilistic logic programming, constraining an agent's policy to comply with those specifications in a probabilistic sense. However, safety is inherently a multi-agent concept, since real-world environments often involve multiple agents interacting simultaneously, leading to a complex system which is hard to control. Moreover, safe multi-agent RL (Safe MARL) is still underexplored. In order to address this gap, in this paper we ($i$) introduce Shielded MARL (SMARL) by extending PLS to MARL -- in particular, we introduce Probabilistic Logic Temporal Difference Learning (PLTD) to enable shielded independent Q-learning (SIQL), and introduce shielded independent PPO (SIPPO) using probabilistic logic policy gradients; ($ii$) show its positive effect and use as an equilibrium selection mechanism in various game-theoretic environments including two-player simultaneous games, extensive-form games, stochastic games, and some grid-world extensions in terms of safety, cooperation, and alignment with normative behaviors; and ($iii$) look into the asymmetric case where only one agent is shielded, and show that the shielded agent has a significant influence on the unshielded one, providing further evidence of SMARL's ability to enhance safety and cooperation in diverse multi-agent environments.
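A probabilistic logic shield, as described above, constrains a policy to comply with a safety specification in a probabilistic sense. One common way to realize that is to reweight the agent's action distribution by each action's safety probability and renormalize; the sketch below illustrates that idea only, under my assumption of this reweighting form, and is not the paper's PLTD/SIQL/SIPPO machinery (which derives the safety probabilities from probabilistic logic programs).

```python
def shield_policy(policy, p_safe):
    """Reweight an action distribution by per-action safety probabilities.

    policy: list of action probabilities (sums to 1).
    p_safe: list of P(safe | state, action) values in [0, 1].
    """
    weighted = [p * s for p, s in zip(policy, p_safe)]
    z = sum(weighted)
    if z == 0:
        raise ValueError("no action has positive safety probability")
    return [w / z for w in weighted]

# An action judged certainly unsafe gets probability 0 after shielding.
shielded = shield_policy([0.5, 0.5], [1.0, 0.0])  # -> [1.0, 0.0]
```

Because the shield acts multiplicatively on the policy rather than hard-masking Q-values, partially unsafe actions are merely down-weighted, which is what "comply in a probabilistic sense" amounts to here.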


Spectral Toolkit of Algorithms for Graphs: Technical Report (2)

Macgregor, Peter, Sun, He

arXiv.org Artificial Intelligence

Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source C++ and Python library providing several methods for working with graphs and performing graph-based data analysis. In this technical report, we provide an update on the development of the STAG library. The report serves as a user's guide for the newly implemented algorithms, and gives implementation details and engineering choices made in the development of the library. The report is structured as follows: Section 2 describes locality-sensitive hashing and the main components used in its construction. Section 3 describes kernel density estimation and the state-of-the-art algorithm for computing it.
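For reference, kernel density estimation in its naive O(n)-per-query form is a few lines; the sketch below shows the textbook Gaussian-kernel version in plain Python, not the STAG library's API or its state-of-the-art (hashing-based) algorithm.

```python
import math

def gaussian_kde(query, data, bandwidth=1.0):
    """Naive kernel density estimate at one query point with a Gaussian kernel."""
    n = len(data)
    total = sum(
        math.exp(-((query - x) ** 2) / (2 * bandwidth ** 2))
        for x in data
    )
    # Normalize so the estimate integrates to 1 over the real line.
    return total / (n * bandwidth * math.sqrt(2 * math.pi))
```

The point of the algorithms in the report is to avoid exactly this O(n) cost per query on large data sets, e.g. via locality-sensitive hashing.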


Spectral Toolkit of Algorithms for Graphs: Technical Report (1)

Macgregor, Peter, Sun, He

arXiv.org Artificial Intelligence

Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source C++ and Python library of efficient spectral algorithms for graphs. Our objective is to implement advanced graph algorithms developed through algorithmic spectral graph theory, while making them practical for end users. This series of technical reports documents our progress on STAG, including implementation details, engineering considerations, and the data sets against which our implementation is tested. The report is structured as follows: Section 2 describes the local clustering algorithm, which is the main update in this STAG release. The discussion is at a high level such that domain knowledge beyond basic algorithms is not needed. Section 3 provides a user guide to the essential features of STAG which allow a user to apply local clustering. Section 4 includes experiments and demonstrations of the functionality of STAG. Finally, Section 5 discusses several technical details, including our choice of implemented algorithms, the default setup of parameters, and other technical choices. We leave these details to the final section, as the reader does not need to understand them in order to use STAG.


Stochastic Aggregation in Graph Neural Networks

Wang, Yuanqing, Karaletsos, Theofanis

arXiv.org Artificial Intelligence

We herein present a unifying framework for stochastic aggregation (STAG) in GNNs, where noise is (adaptively) injected into the aggregation process from the neighborhood to form node embeddings. Standard aggregation schemes limit GNNs in two ways. Firstly, without proper choices of aggregation functions, GNNs are not always as powerful as the WL test: when pooling from (transformed) neighborhood representations whose underlying multiset (see Definition 1 of Xu et al. (2018)) is countable, different multiset functions learn different attributes of the neighborhood (MAX learns distinct elements and MEAN learns distributions), but only SUM is injective, as studied in detail in Xu et al. (2018). Secondly, repeated aggregation leads to oversmoothing. We provide theoretical arguments that STAG models, with little overhead, remedy both of the aforementioned problems. In addition to fixed-noise models, we also propose probabilistic versions of STAG models and a variational inference framework to learn the noise posterior. We conduct illustrative experiments clearly targeting the oversmoothing and multiset-aggregation limitations.
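The core operation is a standard aggregator with noise injected into the neighborhood messages. A minimal fixed-noise sketch with SUM aggregation is below; the function name, Gaussian noise model, and fixed scale are illustrative assumptions, and the adaptive/variational versions mentioned in the abstract would learn the noise instead.

```python
import numpy as np

def stag_sum_aggregate(h, neighbors, sigma=0.1, rng=None):
    """SUM aggregation with fixed-scale Gaussian noise injected per neighbor message.

    h: (num_nodes, dim) node feature matrix.
    neighbors: index array of the target node's neighbors.
    """
    rng = rng or np.random.default_rng(0)
    msgs = h[neighbors]                                 # (k, dim) neighbor features
    noise = sigma * rng.standard_normal(msgs.shape)     # injected aggregation noise
    return (msgs + noise).sum(axis=0)

h = np.arange(12, dtype=float).reshape(4, 3)  # 4 nodes, 3-dim features
neighbors = np.array([1, 2, 3])
agg = stag_sum_aggregate(h, neighbors, sigma=0.1)
```

With `sigma=0` this reduces to the usual (injective, WL-strength) SUM aggregator; the noise is what perturbs otherwise-identical neighborhoods apart and counteracts oversmoothing.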


Representing Pure Nash Equilibria in Argumentation

Yun, Bruno, Vesic, Srdjan, Oren, Nir

arXiv.org Artificial Intelligence

In this paper we describe an argumentation-based representation of normal form games, and demonstrate how argumentation can be used to compute pure strategy Nash equilibria. Our approach builds on Modgil's Extended Argumentation Frameworks. We demonstrate its correctness, prove several theoretical properties it satisfies, and outline how it can be used to explain why certain strategies are Nash equilibria to a non-expert human user.
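For concreteness, pure-strategy Nash equilibria of a normal-form game can be found by checking, at each strategy profile, that neither player can improve by deviating unilaterally. The enumeration sketch below is that baseline computation, not the paper's argumentation-based representation (the function name and payoff layout are my own).

```python
def pure_nash(payoffs):
    """Enumerate pure-strategy Nash equilibria of a 2-player normal-form game.

    payoffs[i][j] = (row_payoff, col_payoff) for row action i, column action j.
    """
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            r, c = payoffs[i][j]
            # No profitable unilateral deviation for either player.
            row_best = all(payoffs[k][j][0] <= r for k in range(rows))
            col_best = all(payoffs[i][k][1] <= c for k in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Stag Hunt payoffs: action 0 = hunt stag, action 1 = hunt hare.
stag_hunt = [[(4, 4), (0, 3)],
             [(3, 0), (3, 3)]]
eq = pure_nash(stag_hunt)  # -> [(0, 0), (1, 1)]
```

Stag Hunt's two pure equilibria (both-stag, both-hare) are exactly the kind of result the paper's argumentation framework is designed to represent and explain to a non-expert user.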


Tactical Reward Shaping: Bypassing Reinforcement Learning with Strategy-Based Goals

Zhang, Yizheng, Rosendo, Andre

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) has shown promising capabilities to learn optimal policies directly from trial and error. However, learning can be hindered if the goal of the learning, defined by the reward function, is "not optimal". We demonstrate that by setting the goal/target of the competition in a counter-intuitive but intelligent way, instead of heuristically trying solutions over many hours, the DRL simulation can quickly converge to a winning strategy. The ICRA-DJI RoboMaster AI Challenge is a game of cooperation and competition between robots in a partially observable environment, quite similar to the game Counter-Strike. Unlike the traditional approach to games, where the reward is given for winning the match or hitting the enemy, our DRL algorithm rewards our robots when they hold a geometric-strategic advantage, which implicitly increases the winning chances. Furthermore, we use Deep Q-Learning (DQL) to generate multi-agent movement paths, which improves cooperation between the two robots by avoiding collisions. Finally, we implement a variant of the A* algorithm with the same implicit geometric goal as DQL and compare the results. We conclude that a well-set goal can put in question the need for learning algorithms, with geometry-based searches outperforming DQL by many orders of magnitude.
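The idea of rewarding geometric-strategic advantage rather than match outcomes can be sketched as a shaping function that compares distances to a contested objective; the Manhattan metric, function name, and grid abstraction below are my own illustrative assumptions, not the paper's actual reward definition.

```python
def geometric_advantage_reward(ally, enemy, objective):
    """Shaped reward: positive when our robot is closer to the objective than the enemy."""
    def dist(a, b):
        # Manhattan distance on a grid map.
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    return dist(enemy, objective) - dist(ally, objective)

# Ally at (0, 0) is closer to the objective (1, 1) than the enemy at (3, 3).
r = geometric_advantage_reward((0, 0), (3, 3), (1, 1))  # -> 2
```

Because such a reward is dense and computable at every step, it can guide both a learner (DQL) and a pure search (the A* variant), which is what makes the paper's head-to-head comparison possible.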