 spiral


SPIRAL: Self-Play Incremental Racing Algorithm for Learning in Multi-Drone Competitions

arXiv.org Artificial Intelligence

This paper introduces SPIRAL (Self-Play Incremental Racing Algorithm for Learning), a novel approach for training autonomous drones in multi-agent racing competitions. SPIRAL distinctively employs a self-play mechanism to incrementally cultivate complex racing behaviors within a challenging, dynamic environment. Through this self-play core, drones continuously compete against increasingly proficient versions of themselves, naturally escalating the difficulty of competitive interactions. This progressive learning journey guides agents from mastering fundamental flight control to executing sophisticated cooperative multi-drone racing strategies. Our method is designed for versatility, allowing integration with any state-of-the-art Deep Reinforcement Learning (DRL) algorithms within its self-play framework. Simulations demonstrate the significant advantages of SPIRAL and benchmark the performance of various DRL algorithms operating within it. Consequently, we contribute a versatile, scalable, and self-improving learning framework to the field of autonomous drone racing. SPIRAL's capacity to autonomously generate appropriate and escalating challenges through its self-play dynamic offers a promising direction for developing robust and adaptive racing strategies in multi-agent environments. This research opens new avenues for enhancing the performance and reliability of autonomous racing drones in increasingly complex and competitive scenarios.



Synergy Over Spiral: A Logistics 5.0 Game-Theoretic Model for Trust-Fatigue Co-regulation in Human-Cobot Order Picking

arXiv.org Artificial Intelligence

This paper investigates the critical role of trust and fatigue in human-cobot collaborative order picking, framing the challenge within the scope of Logistics 5.0: the implementation of human-robot symbiosis in smart logistics. We propose a dynamic, leader-follower Stackelberg game to model this interaction, where utility functions explicitly account for human fatigue and trust. Through agent-based simulations, we demonstrate that while a naive model leads to a "trust death spiral," a refined trust model creates a "trust synergy cycle," increasing productivity by nearly 100 percent. Finally, we show that a cobot operating in a Trust-Recovery Mode can overcome system brittleness after a disruption, reducing trust recovery time by over 75 percent compared to a non-adaptive model. Our findings provide a framework for designing intelligent cobot behaviors that fulfill the Industry 5.0 pillars of human-centricity, sustainability, and resilience.


Space filling positionality and the Spiroformer

arXiv.org Artificial Intelligence

Transformers excel when dealing with sequential data. Generalizing transformer models to geometric domains, such as manifolds, we encounter the problem of not having a well-defined global order. We propose a solution with attention heads following a space-filling curve. As a first experimental example, we present the Spiroformer, a transformer that follows a polar spiral on the $2$-sphere.
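One way to picture how a spiral imposes a global order on the sphere is the minimal sketch below. It samples points along a polar spiral running from the north pole to the south pole, so the sample index serves as a one-dimensional position. The parametrization, the function name `polar_spiral`, and the `turns` parameter are illustrative assumptions, not the construction used in the paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def polar_spiral(n_points: int, turns: float = 8.0) -> List[Point]:
    """Sample n_points along a pole-to-pole spiral on the unit 2-sphere.

    Assumed parametrization (hypothetical, for illustration only):
    polar angle theta = pi * t, azimuth phi = 2 * pi * turns * t, t in [0, 1].
    The list index gives a sequential order over the sampled points.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    points = []
    for k in range(n_points):
        t = k / (n_points - 1)
        theta = math.pi * t              # 0 at the north pole, pi at the south pole
        phi = 2.0 * math.pi * turns * t  # winds around the polar axis `turns` times
        points.append((
            math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta),
        ))
    return points


spiral_points = polar_spiral(200)
```

Each index in `spiral_points` could then play the role that a token position plays in a sequence model, which is the kind of ordering the abstract says manifolds lack by default.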


SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

arXiv.org Artificial Intelligence

Recent advances in reinforcement learning have shown that language models can develop sophisticated reasoning through training on tasks with verifiable rewards, but these approaches depend on human-curated problem-answer pairs and domain-specific reward engineering. We introduce SPIRAL, a self-play framework where models learn by playing multi-turn, zero-sum games against continuously improving versions of themselves, eliminating the need for human supervision. Through self-play, SPIRAL generates an infinite curriculum of progressively challenging problems as models must constantly adapt to stronger opponents. To enable this self-play training at scale, we implement a fully online, multi-turn, multi-agent reinforcement learning system for LLMs and propose role-conditioned advantage estimation (RAE) to stabilize multi-agent training. Using SPIRAL, self-play on zero-sum games produces reasoning capabilities that transfer broadly. Training Qwen3-4B-Base on Kuhn Poker alone achieves 8.6% improvement on math and 8.4% on general reasoning, outperforming SFT on 25,000 expert game trajectories. Analysis reveals that this transfer occurs through three cognitive patterns: systematic decomposition, expected value calculation, and case-by-case analysis. Multi-game training (TicTacToe, Kuhn Poker, Simple Negotiation) further enhances performance as each game develops distinct reasoning strengths. Applying SPIRAL to a strong reasoning model (DeepSeek-R1-Distill-Qwen-7B) can still lead to 2.0% average improvement. These results demonstrate that zero-sum games naturally develop transferable reasoning capabilities, highlighting a promising direction for autonomous reasoning development.


SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval

arXiv.org Artificial Intelligence

Speech Large Language Models (Speech LLMs) represent a significant advancement in speech language understanding and processing, as they leverage contextual reasoning capabilities of large language models to process audio inputs [1]. Unlike traditional cascaded pipelines, where automatic speech recognition (ASR) and language modeling are handled by separate modules, Speech LLMs unify audio processing, cross-modal fusion, and language modeling in a single architecture [2]. These unified models can perform multiple tasks like speech recognition, speech translation, speaker identification and emotion recognition, [...]

[...] These benchmarks contain only short audio clips and thus do not reflect the complexity of achieving long-context understanding and extracting precise information from lengthy audio sequences. To systematically assess the unique challenges posed by SIR, we present SPIRAL (Speech Informational Retrieval and Lookup), a 1,012-sample benchmark specifically crafted to evaluate Speech LLM performance on long-form audio sequences (around 90 seconds in duration). On a high level, SPIRAL constructs SIR questions by embedding a critical piece of information within lengthy and potentially distracting dialogues, thereby assessing the model ability to pinpoint and retrieve essential content from long-form inputs.


The Sound of Silence in Social Networks

arXiv.org Artificial Intelligence

We generalize the classic multi-agent DeGroot model for opinion dynamics to incorporate the Spiral of Silence theory from political science. This theory states that individuals may withhold their opinions when they perceive them to be in the minority. As in the DeGroot model, a community of agents is represented as a weighted directed graph whose edges indicate how much agents influence one another. However, agents whose current opinions are in the minority become silent (i.e., they do not express their opinion). Two models for opinion update are then introduced. In the memoryless opinion model ($\mbox{SOM}^-$), agents update their opinion by taking the weighted average of their non-silent neighbors' opinions. In the memory-based opinion model ($\mbox{SOM}^+$), agents update their opinions by taking the weighted average of the opinions of all their neighbors, but for silent neighbors, their most recent opinion is considered. We show that for $\mbox{SOM}^-$ convergence to consensus is guaranteed for clique graphs but, unlike for the classic DeGroot model, not guaranteed for strongly-connected aperiodic graphs. In contrast, we show that for $\mbox{SOM}^+$ convergence to consensus is not guaranteed even for clique graphs. We showcase our models through simulations offering experimental insights that align with key aspects of the Spiral of Silence theory. These findings reveal the impact of silence dynamics on opinion formation and highlight the limitations of consensus in more nuanced social models.
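The two update rules described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' formalization. Opinions are taken to lie in [0, 1], `weights[i][j]` is how much agent j influences agent i, and the minority test in `silent_mask` (a global majority vote around a 0.5 threshold) is a hypothetical stand-in, since the abstract does not define how "minority" is decided. The sketch also assumes each agent always counts its own current opinion, and reads "most recent opinion" as the opinion a neighbor last expressed.

```python
from typing import List, Tuple


def silent_mask(opinions: List[float], threshold: float = 0.5) -> List[bool]:
    """Hypothetical minority rule: an agent is silent if its side of the
    threshold is held by strictly fewer agents than the other side."""
    above = sum(1 for x in opinions if x > threshold)
    below = len(opinions) - above
    if above == below:
        return [False] * len(opinions)  # no minority, so nobody is silent
    minority_is_above = above < below
    return [(x > threshold) == minority_is_above for x in opinions]


def som_minus_step(weights: List[List[float]], opinions: List[float]) -> List[float]:
    """Memoryless update: average over non-silent neighbors (plus self),
    renormalizing the remaining weights."""
    silent = silent_mask(opinions)
    n = len(opinions)
    updated = []
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if j == i or not silent[j]:
                num += weights[i][j] * opinions[j]
                den += weights[i][j]
        updated.append(num / den if den > 0 else opinions[i])
    return updated


def som_plus_step(
    weights: List[List[float]],
    opinions: List[float],
    last_expressed: List[float],
) -> Tuple[List[float], List[float]]:
    """Memory-based update: silent neighbors contribute the opinion they
    last expressed; returns the new opinions and the refreshed memory."""
    silent = silent_mask(opinions)
    n = len(opinions)
    expressed = [last_expressed[j] if silent[j] else opinions[j] for j in range(n)]
    updated = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            value = opinions[i] if j == i else expressed[j]
            total += weights[i][j] * value
        updated.append(total)
    return updated, expressed
```

Iterating either step function on a fixed weight matrix gives a simple way to watch whether opinions converge, which is the question the abstract's convergence results address.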


Could AI help cure 'downward spiral' of human loneliness?

The Guardian

Hollywood may have warned about the perils of striking up relationships with artificial intelligence, but one computer scientist says we may be missing a trick if we do not embrace the positives that human-machine relationships have to offer. Despite the travails of Joaquin Phoenix's introverted and soon-to-be-divorced protagonist in the 2013 movie Her, one professor says we should be open to the comforts that chatbots can provide. Tony Prescott, professor of cognitive robotics at the University of Sheffield, argues that AI has an important role to play in preventing human loneliness. Just as we develop meaningful bonds with pets, and have no qualms about children playing with dolls, so should we be open to the value of AI to adults, he says. "In an age when many people describe their lives as lonely, there may be value in having AI companionship as a form of reciprocal social interaction that is stimulating and personalised," Prescott writes in a new book, The Psychology of Artificial Intelligence.


AI has already developed sinister skill that scientists say could cause it to 'spiral'

Daily Mail - Science & tech

Many artificial intelligence (AI) systems are already skilled at deceiving and manipulating humans – and this could 'spiral' in future, experts have warned. In recent years, the use of AI has grown exponentially but some systems have learned how to be deceitful, even if they have been trained to be helpful and honest, scientists have said. In a review article, a team from the Massachusetts Institute of Technology describe the risks of deception by AI systems and call for governments to develop strong regulations to address this issue as soon as possible. The researchers analyzed previous studies that focused on ways in which AI spread false information through learned deception, meaning they systematically learned how to manipulate others. The most striking example of AI deception they uncovered was Meta's CICERO, a system designed to play the world conquest game Diplomacy that involves building alliances.


Beyond Normal: On the Evaluation of Mutual Information Estimators

arXiv.org Machine Learning

Mutual information is a general statistical dependency measure which has found applications in representation learning, causality, domain generalization and computational biology. However, mutual information estimators are typically evaluated on simple families of probability distributions, namely the multivariate normal distribution and selected distributions with one-dimensional random variables. In this paper, we show how to construct a diverse family of distributions with known ground-truth mutual information and propose a language-independent benchmarking platform for mutual information estimators. We discuss the general applicability and limitations of classical and neural estimators in settings involving high dimensions, sparse interactions, long-tailed distributions, and high mutual information. Finally, we provide guidelines for practitioners on how to select an appropriate estimator adapted to the difficulty of the problem considered and issues one needs to consider when applying an estimator to a new data set.
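As a small illustration of what "known ground-truth mutual information" means in the simplest case the abstract mentions, the sketch below evaluates the standard closed form for a bivariate normal pair with correlation coefficient rho, I(X; Y) = -0.5 * ln(1 - rho^2), measured in nats. The function name `bivariate_normal_mi` is an illustrative choice and is not taken from the benchmark described in the paper.

```python
import math


def bivariate_normal_mi(rho: float) -> float:
    """Mutual information (in nats) between the two coordinates of a
    bivariate normal distribution with correlation coefficient rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)


# Ground-truth values an estimator's output could be compared against.
ground_truth = {rho: bivariate_normal_mi(rho) for rho in (0.0, 0.5, 0.9)}
```

An estimator run on samples drawn with a given rho can then be scored by its distance from the corresponding value in `ground_truth`.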