Agents
Optimal Transport-based Domain Alignment as a Preprocessing Step for Federated Learning
Pereira, Luiz Manella, Amini, M. Hadi
It offers a compelling framework for scenarios in which data cannot be centrally aggregated due to privacy constraints, thereby promoting compliance with data protection regulations and enhancing scalability [1]. Beyond its foundational role in privacy-preserving learning, FL also facilitates model personalization--adapting learning outcomes to individual users across the network--an increasingly relevant objective given the heterogeneity of user behavior and datasets. A comprehensive overview of the challenges and practical implementations of personalized federated learning is presented in [2]. Despite its broad applicability, particularly in contexts with stringent data privacy constraints, FL introduces a set of constraints that must be carefully addressed to ensure robust and efficient model training. These constraints include limited communication bandwidth, restricted computation at edge devices, privacy preservation requirements, and data heterogeneity and imbalance. Dataset imbalance in FL emerges when edge devices possess non-uniform class distributions, disparate dataset sizes, or varying data quality [3, 4]. In this work, we propose a preprocessing framework that addresses this imbalance challenge in a model- and algorithm-agnostic manner. Our method aligns and transforms local datasets into a shared representation space that captures statistical information from all participating agents in the network.
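As a rough illustration of what aligning a local dataset to a shared reference can mean in optimal-transport terms, the sketch below handles only the simplest case: one-dimensional samples of equal size, where the optimal map under a convex cost pairs values by rank. It is not the authors' method; the function name `ot_map_1d` and the toy data are hypothetical.

```python
from typing import List


def ot_map_1d(source: List[float], target: List[float]) -> List[float]:
    """Send each source value to its optimal-transport partner in target.

    Assumes equal-size, equally weighted 1D samples. For any convex cost
    (e.g. squared distance) the optimal coupling matches values by rank.
    """
    if len(source) != len(target):
        raise ValueError("this sketch only covers equal-size samples")
    # Indices of the source values, from smallest to largest.
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    mapped = [0.0] * len(source)
    for rank, i in enumerate(order):
        # The rank-th smallest source value goes to the rank-th smallest target.
        mapped[i] = sorted_target[rank]
    return mapped


# Hypothetical local feature values and a shared reference sample.
local = [5.0, 1.0, 3.0]
reference = [10.0, 30.0, 20.0]
aligned = ot_map_1d(local, reference)
```

Each local value keeps its rank but takes on the reference sample's scale, which is the one-dimensional analogue of moving a local distribution onto a shared one.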
Interpretability by Design for Efficient Multi-Objective Reinforcement Learning
Xia, Qiyue, Herrmann, J. Michael
Multi-objective reinforcement learning (MORL) aims at optimising several, often conflicting goals in order to improve flexibility and reliability of RL in practical tasks. This can be achieved by finding diverse policies that are optimal for some objective preferences and non-dominated by optimal policies for other preferences so that they form a Pareto front in the multi-objective performance space. The relation between the multi-objective performance space and the parameter space that represents the policies is generally non-unique. Using a training scheme that is based on a locally linear map between the parameter space and the performance space, we show that an approximate Pareto front can provide an interpretation of the current parameter vectors in terms of the objectives which enables an effective search within contiguous solution domains. Experiments are conducted with and without retraining across different domains, and the comparison with previous methods demonstrates the efficiency of our approach.
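The notion of a Pareto front used above can be made concrete with a small non-dominance filter. This is a minimal sketch assuming that every objective is to be maximised and that each policy is summarised by a tuple of returns; the names `dominates` and `pareto_front` and the sample values are illustrative, not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective returns of five candidate policies.
returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = pareto_front(returns)
```

Here `(1.5, 3.0)` and `(0.5, 0.5)` are dropped because `(2.0, 4.0)` is better on both objectives; the remaining three points trade one objective against the other.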
Learning Equilibria in Matching Games with Bandit Feedback
Athanasopoulos, Andreas, Dimitrakakis, Christos
We investigate the problem of learning an equilibrium in a generalized two-sided matching market, where agents can adaptively choose their actions based on their assigned matches. Specifically, we consider a setting in which matched agents engage in a zero-sum game with initially unknown payoff matrices, and we explore whether a centralized procedure can learn an equilibrium from bandit feedback. We adopt the solution concept of matching equilibrium, where a pair consisting of a matching $\mathfrak{m}$ and a set of agent strategies $X$ forms an equilibrium if no agent has the incentive to deviate from $(\mathfrak{m}, X)$. To measure the deviation of a given pair $(\mathfrak{m}, X)$ from the equilibrium pair $(\mathfrak{m}^\star, X^\star)$, we introduce matching instability that can serve as a regret measure for the corresponding learning problem. We then propose a UCB algorithm in which agents form preferences and select actions based on optimistic estimates of the game payoffs, and prove that it achieves sublinear, instance-independent regret over a time horizon $T$.
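To illustrate what optimistic estimates of the game payoffs can look like, the sketch below keeps a UCB-style upper bound for every entry of one unknown payoff matrix. It is a generic construction, not the algorithm of the paper: the class name, the assumption that payoffs lie in [0, 1], and the bonus term sqrt(2 log t / n) are all choices made for this example.

```python
import math
from typing import List


class OptimisticPayoffs:
    """Entry-wise upper confidence bounds for an unknown payoff matrix."""

    def __init__(self, n_rows: int, n_cols: int, payoff_bound: float = 1.0):
        self.counts = [[0] * n_cols for _ in range(n_rows)]
        self.sums = [[0.0] * n_cols for _ in range(n_rows)]
        self.payoff_bound = payoff_bound  # assumed largest possible payoff

    def update(self, row: int, col: int, payoff: float) -> None:
        """Record one noisy payoff observed for the action pair (row, col)."""
        self.counts[row][col] += 1
        self.sums[row][col] += payoff

    def ucb(self, t: int) -> List[List[float]]:
        """Return the optimistic matrix at round t, clipped to the payoff bound."""
        log_t = math.log(max(t, 2))
        result = []
        for count_row, sum_row in zip(self.counts, self.sums):
            out_row = []
            for n, s in zip(count_row, sum_row):
                if n == 0:
                    # Unplayed pairs get the most optimistic value.
                    out_row.append(self.payoff_bound)
                else:
                    bonus = math.sqrt(2.0 * log_t / n)
                    out_row.append(min(self.payoff_bound, s / n + bonus))
            result.append(out_row)
        return result


# Hypothetical feedback: the pair (0, 0) is played 100 times with payoff 0.5.
estimates = OptimisticPayoffs(n_rows=2, n_cols=2)
for _ in range(100):
    estimates.update(0, 0, 0.5)
optimistic = estimates.ucb(t=100)
```

Entries that have never been played stay at the upper bound, which is what drives exploration; the well-sampled entry shrinks toward its empirical mean as its count grows.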
Misalignment or misuse? The AGI alignment tradeoff
Hellrigel-Holderbaum, Max, Dung, Leonard
Creating systems that are aligned with our goals is seen as a leading approach to creating safe and beneficial AI in both leading AI companies and the academic field of AI safety. We defend the view that misaligned AGI - future, generally intelligent (robotic) AI agents - poses catastrophic risks. At the same time, we support the view that aligned AGI creates a substantial risk of catastrophic misuse by humans. While both risks are severe and stand in tension with one another, we show that - in principle - there is room for alignment approaches which do not increase misuse risk. We then investigate how the tradeoff between misalignment and misuse looks empirically for different technical approaches to AI alignment. Here, we argue that many current alignment techniques and foreseeable improvements thereof plausibly increase risks of catastrophic misuse. Since the impacts of AI depend on the social context, we close by discussing important social factors and suggest that to reduce the risk of a misuse catastrophe due to aligned AGI, techniques such as robustness, AI control methods and especially good governance seem essential.
Computational Architects of Society: Quantum Machine Learning for Social Rule Genesis
The quantification of social science remains a longstanding challenge, largely due to the philosophical nature of its foundational theories. Although quantum computing has advanced rapidly in recent years, its relevance to social theory remains underexplored. Most existing research focuses on micro-cognitive models or philosophical analogies, leaving a gap in system-level applications of quantum principles to the analysis of social systems. This study addresses that gap by proposing a theoretical and computational framework that combines quantum mechanics with Generative AI to simulate the emergence and evolution of social norms. Drawing on core quantum concepts--such as superposition, entanglement, and probabilistic measurement--this research models society as a dynamic, uncertain system and sets up five ideal-type experiments. These scenarios are simulated using 25 generative agents, each assigned evolving roles as compliers, resistors, or enforcers. Within a simulated environment monitored by a central observer (the Watcher), agents interact, respond to surveillance, and adapt to periodic normative disruptions. These interactions allow the system to self-organize under external stress and reveal emergent patterns. Key findings show that quantum principles, when integrated with generative AI, enable the modeling of uncertainty, emergence, and interdependence in complex social systems. Simulations reveal patterns including convergence toward normative order, the spread of resistance, and the spontaneous emergence of new equilibria in social rules. In conclusion, this study introduces a novel computational lens that lays the groundwork for a quantum-informed social theory. It offers interdisciplinary insights into how society can be understood not just as a structure to observe but as a dynamic system to simulate and redesign through quantum technologies.
The Future of Continual Learning in the Era of Foundation Models: Three Key Directions
Bell, Jack, Quarantiello, Luigi, Coleman, Eric Nuertey, Li, Lanpei, Li, Malio, Madeddu, Mauro, Piccoli, Elia, Lomonaco, Vincenzo
Continual learning--the ability to acquire, retain, and refine knowledge over time--has always been fundamental to intelligence, both human and artificial. Historically, different AI paradigms have acknowledged this need, albeit with varying priorities: early expert and production systems focused on incremental knowledge consolidation, while reinforcement learning emphasised dynamic adaptation. With the rise of deep learning, deep continual learning has primarily focused on learning robust and reusable representations over time to solve sequences of increasingly complex tasks. However, the emergence of Large Language Models (LLMs) and foundation models has raised the question: Do we still need continual learning when centralised, monolithic models can tackle diverse tasks with access to internet-scale knowledge? We argue that continual learning remains essential for three key reasons: (i) continual pre-training is still necessary to ensure foundation models remain up to date, mitigating knowledge staleness and distribution shifts while integrating new information; (ii) continual fine-tuning enables models to specialise and personalise, adapting to domain-specific tasks, user preferences, and real-world constraints without full retraining, avoiding the need for computationally expensive long context-windows; (iii) continual compositionality offers a scalable and modular approach to intelligence, enabling the orchestration of foundation models and agents to be dynamically composed, recombined, and adapted. While continual pre-training and fine-tuning are explored as niche research directions, we argue it is continual compositionality that will mark the rebirth of continual learning. The future of AI will not be defined by a single static model but by an ecosystem of continually evolving and interacting models, making continual learning more relevant than ever.
Perplexity's CEO Sees AI Agents as the Next Web Battleground
Perplexity has tapped into the power of generative artificial intelligence--with all its problematic tendencies--in an effort to challenge Google as the dominant way people find information online. The AI search tool rose in prominence in 2024 and was lauded as a promising alternative to Googling. It has been accused by Forbes of plagiarizing its news articles, closely paraphrasing other websites, and hallucinating incorrect information. Despite the furor, Perplexity today says that its service gets 650 million queries per month and is said to be chasing investment that would value the company at $18 billion. The company is pushing AI assistants for mobile devices and working on its own web browser.
Olfactory Inertial Odometry: Methodology for Effective Robot Navigation by Scent
France, Kordel K., Daescu, Ovidiu
Olfactory navigation is one of the most primitive mechanisms of exploration used by organisms. Navigation by machine olfaction (artificial smell) is a very difficult task to both simulate and solve. With this work, we define olfactory inertial odometry (OIO), a framework for using inertial kinematics and fast-sampling olfaction sensors to enable navigation by scent, analogous to visual inertial odometry (VIO). We establish how principles from SLAM and VIO can be extrapolated to olfaction to enable real-world robotic tasks. We demonstrate OIO with three different odour localization algorithms on a real 5-DoF robot arm over an odour-tracking scenario that resembles real applications in agriculture and food quality control. Our results indicate success in establishing a baseline framework for OIO from which other research in olfactory navigation can build, and we note performance enhancements that can be made to address more complex tasks in the future. From the first life forms to complex mammals, the ability to navigate using scent has been a cornerstone of survival. Animals like ants, hounds, and rodents demonstrate remarkable proficiency in following odour plumes and pheromone trails to locate food, mates, or shelter. These feats are achieved through a sophisticated interplay between acute scent receptors and motion. However, the physical behavior of odour plumes--constantly shifting with wind, influenced by temperature and humidity, and weakening over time--presents a formidable challenge. When the odour source is out of sight, organisms rely entirely on olfactory cues, transforming the task into a complex control problem that demands robust uncertainty management.
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Wu, Qianhui, Cheng, Kanzhi, Yang, Rui, Zhang, Chaoyun, Yang, Jianwei, Jiang, Huiqiang, Mu, Jian, Peng, Baolin, Qiao, Bo, Tan, Reuben, Qin, Si, Liden, Lars, Lin, Qingwei, Zhang, Huan, Zhang, Tong, Zhang, Jianbing, Zhang, Dongmei, Gao, Jianfeng
One of the principal challenges in building VLM-powered GUI agents is visual grounding, i.e., localizing the appropriate screen region for action execution based on both the visual content and the textual plans. Most existing work formulates this as a text-based coordinate generation task. However, these approaches suffer from several limitations: weak spatial-semantic alignment, inability to handle ambiguous supervision targets, and a mismatch between the dense nature of screen coordinates and the coarse, patch-level granularity of visual features extracted by models like Vision Transformers. In this paper, we propose GUI-Actor, a VLM-based method for coordinate-free GUI grounding. At its core, GUI-Actor introduces an attention-based action head that learns to align a dedicated
FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems
Metelo, Frederico, Oliveira, Alexandre, Racković, Stevo, Costa, Pedro Ákos, Soares, Cláudia
Edge computing addresses the growing data demands of connected-device networks by placing computational resources closer to end users through decentralized infrastructures. This decentralization challenges traditional, fully centralized orchestration, which suffers from latency and resource bottlenecks. We present FAuNO (Federated Asynchronous Network Orchestrator), a buffered, asynchronous federated reinforcement-learning (FRL) framework for decentralized task offloading in edge systems. FAuNO adopts an actor-critic architecture in which local actors learn node-specific dynamics and peer interactions, while a federated critic aggregates experience across agents to encourage efficient cooperation and improve overall system performance. Experiments in the PeersimGym environment show that FAuNO consistently matches or exceeds heuristic and federated multi-agent RL baselines in reducing task loss and latency, underscoring its adaptability to dynamic edge-computing scenarios.
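One generic way to realise a buffered, asynchronous aggregation step is sketched below: the server collects parameter vectors as agents report them and refreshes the global model only once a fixed number have arrived. This is an assumption-laden illustration, not FAuNO's actual update rule; the class `BufferedAggregator`, the flat-list parameter format, the plain unweighted mean, and the buffer size are all hypothetical.

```python
from typing import List


class BufferedAggregator:
    """Server-side buffer that averages updates once enough have arrived."""

    def __init__(self, global_params: List[float], buffer_size: int):
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self.global_params = list(global_params)
        self.buffer_size = buffer_size
        self._buffer: List[List[float]] = []

    def submit(self, local_params: List[float]) -> bool:
        """Store one agent's parameters; return True if the global model changed."""
        if len(local_params) != len(self.global_params):
            raise ValueError("parameter vectors must have the same length")
        self._buffer.append(list(local_params))
        if len(self._buffer) < self.buffer_size:
            # Not enough updates yet: agents keep training without waiting.
            return False
        n = len(self._buffer)
        # Replace the global model with the coordinate-wise mean of the buffer.
        self.global_params = [sum(column) / n for column in zip(*self._buffer)]
        self._buffer.clear()
        return True


# Hypothetical run: two agents report at different times, buffer size 2.
server = BufferedAggregator(global_params=[0.0, 0.0], buffer_size=2)
first_changed = server.submit([1.0, 2.0])
second_changed = server.submit([3.0, 4.0])
```

Because the server never blocks on slow agents, stragglers simply land in a later buffer. A practical system would probably also weight or discount stale updates, which this sketch omits.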