Detecting Perspective Shifts in Multi-agent Systems
Bridgeford, Eric, Helm, Hayden
Generative models augmented with external tools and update mechanisms (or agents) have demonstrated capabilities beyond intelligent prompting of base models. As agent use proliferates, dynamic multi-agent systems have naturally emerged. Recent work has investigated the theoretical and empirical properties of low-dimensional representations of agents based on query responses at a single time point. This paper introduces the Temporal Data Kernel Perspective Space (TDKPS), which jointly embeds agents across time, and proposes several novel hypothesis tests for detecting behavioral change at the agent- and group-level in black-box multi-agent systems. We characterize the empirical properties of our proposed tests, including their sensitivity to key hyperparameters, in simulations motivated by a multi-agent system of evolving digital personas. Finally, we demonstrate via natural experiment that our proposed tests detect changes that correlate sensitively, specifically, and significantly with a real exogenous event. As far as we are aware, TDKPS is the first principled framework for monitoring behavioral dynamics in black-box multi-agent systems, a critical capability as generative agent deployment continues to scale.
Decomposing Theory of Mind: How Emotional Processing Mediates ToM Abilities in LLMs
Recent work shows that activation steering substantially improves language models' Theory of Mind (ToM) (Bortoletto et al. 2024), yet the internal changes that lead to different outputs remain unclear. We propose decomposing ToM in LLMs by comparing the activations of steered and baseline LLMs using linear probes trained on 45 cognitive actions. We applied Contrastive Activation Addition (CAA) steering to Gemma-3-4B and evaluated it on 1,000 BigToM forward belief scenarios (Gandhi et al. 2023). We find that improved performance on belief attribution tasks (32.5% to 46.7% accuracy) is mediated by activations processing emotional content, namely emotion perception (+2.23) and emotion valuing (+2.20), while analytical processes are suppressed: questioning (-0.78) and convergent thinking (-1.59). This suggests that successful ToM abilities in LLMs are mediated by emotional understanding, not analytical reasoning.
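For readers unfamiliar with Contrastive Activation Addition, the core arithmetic can be sketched as follows. This is a minimal illustration of the commonly described form of CAA (a steering vector taken as the mean difference between activations on contrastive prompt pairs, then added to a layer's hidden states), not the authors' implementation. The function names, the `alpha` scale, and the toy arrays are hypothetical; a real setup would read and write activations through hooks on a specific model layer.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Mean activation difference between contrastive prompt sets.

    pos_acts, neg_acts: arrays of shape (n_prompts, hidden_dim), assumed to be
    activations collected at one layer for the two sides of each contrast.
    """
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def apply_steering(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to every token position.

    hidden: array of shape (n_tokens, hidden_dim); alpha is a hypothetical
    scale factor controlling steering strength.
    """
    return hidden + alpha * vector

# Toy activations standing in for real model states (illustrative values only).
pos = np.array([[1.0, 2.0], [3.0, 4.0]])
neg = np.array([[0.0, 0.0], [2.0, 2.0]])
vec = steering_vector(pos, neg)
steered = apply_steering(np.zeros((3, 2)), vec, alpha=2.0)
```

Comparing probe outputs on `steered` versus unsteered hidden states is the kind of contrast the abstract describes, though the probe training itself is not shown here.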
Energy Guided Geometric Flow Matching
Zweig, Aaron, Zhang, Mingxuan, Azizi, Elham, Knowles, David
A useful inductive bias for temporal data is that trajectories should stay close to the data manifold. Traditional flow matching relies on straight conditional paths, and flow matching methods which learn geodesics rely on RBF kernels or nearest neighbor graphs that suffer from the curse of dimensionality. We propose to use score matching and annealed energy distillation to learn a metric tensor that faithfully captures the underlying data geometry and informs more accurate flows. We demonstrate the efficacy of this strategy on synthetic manifolds with analytic geodesics, and interpolation of cell
Appendix A Data
Hz to remove contributions from electrical line noise and other very high frequency noise. We further refer to this as the 2v2 accuracy. Under this metric, chance performance is 50%. The next step is to train and evaluate the proposed models. The distance used in our experiments is cosine distance. This model has 12 layers, 12 attention heads, and 768 hidden units.
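The excerpt names the 2v2 accuracy and cosine distance but does not define the metric. The sketch below assumes the definition commonly used in brain-decoding work: for each pair of held-out items, the pair counts as correct when the summed distance of the matched prediction-target assignment is smaller than that of the swapped assignment, which gives 50% at chance. The function names are hypothetical, and the source's exact procedure may differ.

```python
import numpy as np

def cosine_distance(a, b):
    """One minus the cosine similarity of two non-zero vectors."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def two_vs_two_accuracy(pred, true):
    """Fraction of item pairs where matched beats swapped assignment.

    pred, true: arrays of shape (n_items, dim); row i of pred is the
    prediction for row i of true.
    """
    n = len(pred)
    correct = 0
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            matched = (cosine_distance(pred[i], true[i])
                       + cosine_distance(pred[j], true[j]))
            swapped = (cosine_distance(pred[i], true[j])
                       + cosine_distance(pred[j], true[i]))
            correct += int(matched < swapped)
            total += 1
    return correct / total
```

Under this definition, perfect predictions score 1.0, and predictions with every pair swapped score 0.0.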