Goto

Collaborating Authors



Appendix

Neural Information Processing Systems

A.4 Estimating parameters when Y(t) is unavailable. When Y(t) is unavailable, new parameter estimators that leverage only the available data need to be derived. The derivation goes as follows: first, we eliminate Y(t) from the model equations. The squared errors of the estimated parameters are shown in Figure 1. First, we estimated the parameters separately for each individual. Second, we performed statistical analysis to find associations between the estimated parameters and the demographic variables.


A Quantifiable Information-Processing Hierarchy Provides a Necessary Condition for Detecting Agency

Kagan, Brett J., Baccetti, Valentina, Earp, Brian D., Boyd, J. Lomax, Savulescu, Julian, Razi, Adeel

arXiv.org Machine Learning

As intelligent systems are developed across diverse substrates - from machine learning models and neuromorphic hardware to in vitro neural cultures - understanding what gives a system agency has become increasingly important. Existing definitions, however, tend to rely on top-down descriptions that are difficult to quantify. We propose a bottom-up framework grounded in a system's information-processing order: the extent to which its transformation of input evolves over time. We identify three orders of information processing. Class I systems are reactive and memoryless, mapping inputs directly to outputs. Class II systems incorporate internal states that provide memory but follow fixed transformation rules. Class III systems are adaptive; their transformation rules themselves change as a function of prior activity. While not sufficient on their own, these dynamics represent necessary informational conditions for genuine agency. This hierarchy offers a measurable, substrate-independent way to identify the informational precursors of agency. We illustrate the framework with neurophysiological and computational examples, including thermostats and receptor-like memristors, and discuss its implications for the ethical and functional evaluation of systems that may exhibit agency.
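The three information-processing orders described above can be made concrete with a minimal sketch. This is illustrative only: the class names follow the abstract, but the particular functions, the running-sum memory, and the adaptive gain update are assumptions chosen for brevity, not the authors' formalism.

```python
# Minimal illustration of the three information-processing orders.

def class_one(x):
    """Class I: reactive and memoryless -- output depends only on the current input."""
    return 2 * x

class ClassTwo:
    """Class II: an internal state provides memory, but the transformation rule is fixed."""
    def __init__(self):
        self.state = 0
    def step(self, x):
        self.state += x          # memory of past inputs
        return self.state        # fixed rule: running sum of inputs

class ClassThree:
    """Class III: the transformation rule itself changes as a function of prior activity."""
    def __init__(self):
        self.gain = 1.0
    def step(self, x):
        y = self.gain * x
        self.gain += 0.1 * x     # prior activity rewrites the rule (adaptation)
        return y
```

The distinction is observable from outside the system: feeding the same input twice yields identical outputs for Class I, history-dependent outputs for Class II, and outputs whose input-output mapping itself has drifted for Class III.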


The brain-AI convergence: Predictive and generative world models for general-purpose computation

Ohmae, Shogo, Ohmae, Keiko

arXiv.org Artificial Intelligence

Recent advances in general-purpose AI systems with attention-based transformers offer a potential window into how the neocortex and cerebellum, despite their relatively uniform circuit architectures, give rise to diverse functions and, ultimately, to human intelligence. This Perspective provides a cross-domain comparison between the brain and AI that goes beyond the traditional focus on visual processing, adopting the emerging perspective of world-model-based computation. Here, we identify shared computational mechanisms in the attention-based neocortex and the non-attentional cerebellum: both predict future world events from past inputs and construct internal world models through prediction-error learning. These predictive world models are repurposed for seemingly distinct functions -- understanding in sensory processing and generation in motor processing -- enabling the brain to achieve multi-domain capabilities and human-like adaptive intelligence. Notably, attention-based AI has independently converged on a similar learning paradigm and world-model-based computation. We conclude that these shared mechanisms in both biological and artificial systems constitute a core computational foundation for realizing diverse functions including high-level intelligence, despite their relatively uniform circuit structures. Our theoretical insights bridge neuroscience and AI, advancing our understanding of the computational essence of intelligence.
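The core mechanism named above, building an internal model through prediction-error learning, can be sketched in a few lines. This is a deliberately minimal delta-rule example: the scalar linear predictor and learning rate are assumptions for illustration, not the architecture discussed in the paper.

```python
# Sketch of prediction-error learning: a scalar "world model" predicts the
# next observation from the current one and is updated in proportion to its
# prediction error (delta rule).

def learn_world_model(pairs, lr=0.1):
    """Fit w in the one-step predictor x_next ~ w * x via error-driven updates."""
    w = 0.0
    for x, x_next in pairs:
        prediction = w * x
        error = x_next - prediction   # prediction error drives the update
        w += lr * error * x
    return w

# If observations actually follow x_next = 0.5 * x, repeated exposure
# drives w toward 0.5: the model has internalized the world's dynamics.
training = [(x, 0.5 * x) for x in (1.0, -2.0, 0.5, 3.0)] * 50
```

Once learned, the same predictor can be run in two modes, matching the repurposing the abstract describes: compare predictions against incoming data (understanding) or roll the model forward to produce a sequence (generation).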


BrainHGT: A Hierarchical Graph Transformer for Interpretable Brain Network Analysis

Ma, Jiajun, Zhang, Yongchao, Zhang, Chao, Lv, Zhao, Pei, Shengbing

arXiv.org Artificial Intelligence

Graph Transformer shows remarkable potential in brain network analysis due to its ability to model graph structures and complex node relationships. Most existing methods typically model the brain as a flat network, ignoring its modular structure, and their attention mechanisms treat all brain region connections equally, ignoring distance-related node connection patterns. However, brain information processing is a hierarchical process that involves local and long-range interactions between brain regions, interactions between regions and sub-functional modules, and interactions among functional modules themselves. This hierarchical interaction mechanism enables the brain to efficiently integrate local computations and global information flow, supporting the execution of complex cognitive functions. To address this issue, we propose BrainHGT, a hierarchical Graph Transformer that simulates the brain's natural information processing from local regions to global communities. Specifically, we design a novel long-short range attention encoder that utilizes parallel pathways to handle dense local interactions and sparse long-range connections, thereby effectively alleviating the over-globalizing issue. To further capture the brain's modular architecture, we design a prior-guided clustering module that utilizes a cross-attention mechanism to group brain regions into functional communities and leverages neuroanatomical priors to guide the clustering process, thereby improving biological plausibility and interpretability. Experimental results indicate that our proposed method significantly improves disease identification performance and can reliably capture the sub-functional modules of the brain, demonstrating its interpretability.



Decision-Making Amid Information-Based Threats in Sociotechnical Systems: A Review

Allred, Aaron R., Richardson, Erin E., Bostrom, Sarah R., Crum, James, Spencer, Cara, Tossell, Chad, Niemeyer, Richard E., Hirshfield, Leanne, Hayman, Allison P. A.

arXiv.org Artificial Intelligence

Technological systems increasingly mediate human information exchange, spanning interactions among humans as well as between humans and artificial agents. The unprecedented scale and reliance on information disseminated through these systems substantially expand the scope of information-based influence that can both enable and undermine sound decision-making. Consequently, understanding and protecting decision-making today faces growing challenges, as individuals and organizations must navigate evolving opportunities and information-based threats across varied domains and information environments. While these risks are widely recognized, research remains fragmented: work evaluating information-based threat phenomena has progressed largely in isolation from foundational studies of human information processing. In this review, we synthesize insights from both domains to identify shared cognitive mechanisms that mediate vulnerability to information-based threats and shape behavioral outcomes. Finally, we outline directions for future research aimed at integrating these perspectives, emphasizing the importance of such integration for mitigating human vulnerabilities and aligning human-machine representations.


Mirror Eyes: Explainable Human-Robot Interaction at a Glance

Krüger, Matti, Tanneberg, Daniel, Wang, Chao, Hasler, Stephan, Gienger, Michael

arXiv.org Artificial Intelligence

The gaze of a person tends to reflect their interest. This work explores what happens when this statement is taken literally and applied to robots. Here we present a robot system that employs a moving robot head with a screen-based eye model that can direct the robot's gaze to points in physical space and present a reflection-like mirror image of the attended region on top of each eye. We conducted a user study with 33 participants, who were asked to instruct the robot to perform pick-and-place tasks, monitor the robot's task execution, and interrupt it in case of erroneous actions. Despite a deliberate lack of instructions about the role of the eyes and a very brief system exposure, participants felt more aware about the robot's information processing, detected erroneous actions earlier, and rated the user experience higher when eye-based mirroring was enabled compared to non-reflective eyes. These results suggest a beneficial and intuitive utilization of the introduced method in cooperative human-robot interaction.


Development of Mental Models in Human-AI Collaboration: A Conceptual Framework

Holstein, Joshua, Satzger, Gerhard

arXiv.org Artificial Intelligence

Artificial intelligence has become integral to organizational decision-making, and while research has explored many facets of this human-AI collaboration, the focus has mainly been on designing the AI agent(s) and the way the collaboration is set up -- generally assuming the human decision-maker to be "fixed". However, it has largely been neglected that decision-makers' mental models evolve through their continuous interaction with AI systems. This paper addresses this gap by conceptualizing how the design of human-AI collaboration influences the development of three complementary and interdependent mental models necessary for this collaboration. We develop an integrated socio-technical framework that identifies the mechanisms driving mental model evolution: data contextualization, reasoning transparency, and performance feedback.


MARS: Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning

Chen, Guoxin, Qiao, Zile, Wang, Wenqing, Yu, Donglei, Chen, Xuanzhong, Sun, Hao, Liao, Minpeng, Fan, Kai, Jiang, Yong, Xie, Penguin, Zhao, Wayne Xin, Song, Ruihua, Huang, Fei

arXiv.org Artificial Intelligence

Large Reasoning Models (LRMs) often exhibit a tendency for overanalysis in simple tasks, where the models excessively utilize System 2-type, deliberate reasoning, leading to inefficient token generation. Furthermore, these models face challenges in adapting their reasoning capabilities to rapidly changing environments due to the static nature of their pretraining data. To address these issues, advancing Large Language Models (LLMs) for complex reasoning tasks requires innovative approaches that bridge intuitive and deliberate cognitive processes, akin to human cognition's dual-system dynamic. This paper introduces a Multi-Agent System for Deep ReSearch (MARS), enabling seamless integration of System 1's fast, intuitive thinking with System 2's deliberate reasoning within LLMs. MARS strategically integrates multiple external tools, such as Google Search, Google Scholar, and Python Interpreter, to access up-to-date information and execute complex computations, while creating a specialized division of labor where System 1 efficiently processes and summarizes high-volume external information, providing distilled insights that expand System 2's reasoning context without overwhelming its capacity. Furthermore, we propose a multi-agent reinforcement learning framework extending Group Relative Policy Optimization to simultaneously optimize both systems with multi-turn tool interactions, bin-packing optimization, and sample balancing strategies that enhance collaborative efficiency. Extensive experiments demonstrate that MARS achieves substantial improvements of 3.86% on the challenging Humanity's Last Exam (HLE) benchmark and an average gain of 8.9% across 7 knowledge-intensive tasks, validating the effectiveness of our dual-system paradigm for complex reasoning in dynamic information environments.


Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models

Guan, JingChuan, Kubota, Tomoyuki, Kuniyoshi, Yasuo, Nakajima, Kohei

arXiv.org Artificial Intelligence

State space models (SSMs) have gained attention by showing potential to outperform Transformers. However, previous studies have not sufficiently addressed the mechanisms underlying their high performance owing to a lack of theoretical explanation of SSMs' learning dynamics. In this study, we provide such an explanation and propose an improved training strategy. The memory capacity of SSMs can be evaluated by examining how input time series are stored in their current state. Such an examination reveals a tradeoff between memory accuracy and length, as well as the theoretical equivalence between the structured state space sequence model (S4) and a simplified S4 with diagonal recurrent weights. This theoretical foundation allows us to elucidate the learning dynamics, proving the importance of initial parameters. Our analytical results suggest that successful learning requires the initial memory structure to be the longest possible, even if memory accuracy deteriorates or the gradient loses the teacher information. Experiments on tasks requiring long memory confirmed that extending memory is difficult, emphasizing the importance of initialization. Furthermore, we found that fixing recurrent weights can be more advantageous than adapting them because it achieves comparable or even higher performance with faster convergence. Our results provide a new theoretical foundation for SSMs and potentially offer a novel optimization strategy.
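The idea that an SSM's state stores its input history, and that recurrent weights set the memory length, can be seen in a one-dimensional diagonal recurrence. This is a minimal sketch; the values of a and b are assumptions for illustration, not parameters from the paper.

```python
# Minimal diagonal SSM recurrence: the current state is an exponentially
# weighted sum of past inputs, so an input k steps in the past survives
# with weight a**k * b.

def ssm_state(inputs, a=0.9, b=1.0):
    """Run x <- a * x + b * u over the input sequence and return the final state."""
    x = 0.0
    for u in inputs:
        x = a * x + b * u
    return x
```

The memory tradeoff the abstract describes is visible in the weights: increasing |a| toward 1 lets older inputs persist longer in the state, but also blurs the distinction between recent and old inputs, trading memory accuracy for memory length.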