 neural state


LENS: Learning Ensemble Confidence from Neural States for Multi-LLM Answer Integration

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated impressive performance across various tasks, with different models excelling in distinct domains and specific abilities. Effectively combining the predictions of multiple LLMs is crucial for enhancing system robustness and performance. However, existing ensemble methods often rely on simple techniques like voting or logits ensembling, which overlook the varying confidence and reliability of models in different contexts. In this work, we propose LENS (Learning ENsemble confidence from Neural States), a novel approach that learns to estimate model confidence by analyzing internal representations. For each LLM, we train a lightweight linear confidence predictor that leverages layer-wise hidden states and normalized probabilities as inputs. This allows for more nuanced weighting of model predictions based on their context-dependent reliability. Our method does not require modifying the model parameters and requires negligible additional computation. Experimental results on multiple-choice and boolean question-answering tasks demonstrate that LENS outperforms traditional ensemble methods by a substantial margin. Our findings suggest that internal representations provide valuable signals for determining model confidence and can be effectively leveraged for ensemble learning.
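The confidence-weighted integration described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the logistic squashing of the linear output, the concatenated feature layout, the weighted-average combination rule, and the names `predict_confidence` and `weighted_vote` are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_confidence(hidden, probs, w, b):
    """Hypothetical linear confidence predictor for one model.

    hidden: 1-D array of hidden-state features (assumed already pooled).
    probs:  1-D array of normalized answer-option probabilities.
    w, b:   learned weights and bias (assumed trained elsewhere).
    """
    features = np.concatenate([hidden, probs])
    return float(sigmoid(features @ w + b))


def weighted_vote(option_probs, confidences):
    """Combine per-model option distributions, weighted by confidence.

    option_probs: array of shape (n_models, n_options).
    confidences:  array of shape (n_models,), positive values.
    Returns the index of the winning option and the combined distribution.
    """
    option_probs = np.asarray(option_probs, dtype=float)
    weights = np.asarray(confidences, dtype=float)
    weights = weights / weights.sum()
    combined = weights @ option_probs
    return int(np.argmax(combined)), combined
```

With equal confidences this reduces to plain probability averaging; the learned predictor only matters when the models disagree and one of them is judged more reliable in the given context.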


Conceptual and Design Principles for a Self-Referential Algorithm Mimicking Neuronal Assembly Functions

arXiv.org Artificial Intelligence

However, the epistemological approach differs from that of so-called "grounded cognition". We can summarise this difference as follows: while grounded cognition analyses the experience of a living system from the point of view of an observer, we adopt the point of view of the system itself, defined by the need to preserve the biological properties essential for its survival. Our proposal therefore implies that the system is self-referential, since it operates with the aim of being able to continue operating. The method is based on an algorithmic schema that we call Environment Generative Operator (EGO) and uses an object language developed for this purpose, which we call E-language. EGO simulates cognitive processes by manipulating E-language strings. Among all the feasible EGO models, one called "EGO-P" (Supplementary Material 2) was implemented and tested, achieving the expected objectives. Repositories 2 and 3, like all the others mentioned in the article, can be accessed via the corresponding link in the bibliography. E-language has various mathematical properties. Those useful for this work have been demonstrated and are available in Supplementary Material 1.


Embodied World Models Emerge from Navigational Task in Open-Ended Environments

arXiv.org Artificial Intelligence

Spatial reasoning in partially observable environments has often been approached through passive predictive models, yet theories of embodied cognition suggest that genuinely useful representations arise only when perception is tightly coupled to action. Here we ask whether a recurrent agent, trained solely by sparse rewards to solve procedurally generated planar mazes, can autonomously internalize metric concepts such as direction, distance and obstacle layout. After training, the agent consistently produces near-optimal paths in unseen mazes, behavior that hints at an underlying spatial model. To probe this possibility, we cast the closed agent-environment loop as a hybrid dynamical system, identify stable limit cycles in its state space, and characterize behavior with a Ridge Representation that embeds whole trajectories into a common metric space. Canonical correlation analysis exposes a robust linear alignment between neural and behavioral manifolds, while targeted perturbations of the most informative neural dimensions sharply degrade navigation performance. Taken together, these dynamical, representational, and causal signatures show that sustained sensorimotor interaction is sufficient for the spontaneous emergence of compact, embodied world models, providing a principled path toward interpretable and transferable navigation policies.
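To make the canonical correlation step concrete, here is a small sketch of how canonical correlations between two sets of variables (for example, neural activations and behavioral embeddings) can be computed. It uses the standard QR-plus-SVD formulation and assumes both matrices have full column rank; the function name and the data layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y.

    X: array of shape (n_samples, p), e.g. neural features.
    Y: array of shape (n_samples, q), e.g. behavioral features.
    Assumes n_samples > max(p, q) and full column rank in both.
    Returns min(p, q) correlations in descending order.
    """
    # Center each variable so correlations are not driven by means.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases for the two column spaces.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the cosines of the principal
    # angles between the subspaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)
```

Correlations close to 1 indicate that one set of variables is nearly a linear function of the other, which is the kind of linear alignment the abstract reports.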


Automated Discovery of Continuous Dynamics from Videos

arXiv.org Artificial Intelligence

Dynamical systems are predominantly described by physical laws, which involve a set of physical variables to represent the system's states and a set of equations to connect these variables to model the system's evolution over time. To uncover these physical laws from natural phenomena, scientists begin by identifying the appropriate physical variables, such as position and velocity in classical mechanics, electric current and magnetic induction in electromagnetism, and pressure and velocity fields in fluid mechanics, and measure them from raw observations of the system. Subsequently, they derive equations from these variables to articulate the underlying physical principles, such as Newton's laws in classical mechanics, Maxwell's equations in electromagnetism, and the Navier-Stokes equations in fluid mechanics. By applying various mathematical tools to analyze these equations, scientists gain a profound understanding of natural phenomena and can predict the system's future behaviors. This paradigm of scientific discovery, best known since the work of Tycho Brahe and Johannes Kepler more than 400 years ago, has been remarkably successful across almost all areas of modern science. Despite centuries of effort, using a similar paradigm to discover physical variables and equations for new systems remains challenging, as seen in early attempts to automate the process of discovering equations from given physical variables [1-3].


A theory of neural emulators

arXiv.org Artificial Intelligence

A central goal in neuroscience is to provide explanations for how animal nervous systems can generate actions and cognitive states such as consciousness, while artificial intelligence (AI) and machine learning (ML) seek to provide models that are increasingly better at prediction. Despite many decades of research, we have made limited progress on providing neuroscience explanations, yet there is an increased use of AI and ML methods in neuroscience for prediction of behavior and even cognitive states. Here we propose neural emulators as circuit- and scale-independent predictive models of biological brain activity, and emulator theory (ET) as an alternative research paradigm in neuroscience. ET proposes that predictive models trained solely on neural dynamics and behaviors can generate systems that are functionally indistinguishable from their sources. That is, compared to the biological organisms which they model, emulators may achieve indistinguishable behavior and cognitive states - including consciousness - without any mechanistic explanations. We posit ET via several conjectures, discuss the nature of endogenous and exogenous activation of neural circuits, and discuss neural causality of phenomenal states. ET provides the conceptual and empirical framework for prediction-based models of neural dynamics and behavior without explicit representations of idiosyncratically evolved nervous systems.



Learning to Act through Evolution of Neural Diversity in Random Neural Networks

arXiv.org Artificial Intelligence

Biological nervous systems consist of networks of diverse, sophisticated information processors in the form of neurons of different classes. In most artificial neural networks (ANNs), neural computation is abstracted to an activation function that is usually shared between all neurons within a layer or even the whole network; training of ANNs focuses on synaptic optimization. In this paper, we propose the optimization of neuro-centric parameters to attain a set of diverse neurons that can perform complex computations. Demonstrating the promise of the approach, we show that evolving neural parameters alone allows agents to solve various reinforcement learning tasks without optimizing any synaptic weights. While not aiming to be an accurate biological model, parameterizing neurons to a larger degree than is current common practice allows us to ask questions about the computational abilities afforded by neural diversity in random neural networks. The presented results open up interesting future research directions, such as combining evolved neural diversity with activity-dependent plasticity.
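The idea of optimizing per-neuron parameters while leaving random synaptic weights untouched can be illustrated with a toy sketch. Everything specific here is an assumption made for the example: the neuron parameterization (one gain and one bias per hidden unit), the simple elitist evolution strategy, and the curve-fitting task, which stands in for the reinforcement learning tasks used in the paper.

```python
import numpy as np


def forward(x, W1, W2, gains, biases):
    """Network with fixed random weights W1, W2 and per-neuron parameters."""
    hidden = np.tanh(gains * (x @ W1) + biases)
    return hidden @ W2


def fitness(params, W1, W2, x, target):
    """Negative mean squared error; higher is better."""
    n_hidden = W1.shape[1]
    gains, biases = params[:n_hidden], params[n_hidden:]
    error = forward(x, W1, W2, gains, biases) - target
    return -float(np.mean(error ** 2))


def evolve(generations=50, pop_size=20, sigma=0.1, n_hidden=8, seed=0):
    """Elitist evolution of neuron parameters; synaptic weights never change."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(1, n_hidden))          # fixed random input weights
    W2 = rng.normal(size=(n_hidden, 1))          # fixed random output weights
    x = np.linspace(-1.0, 1.0, 32)[:, None]
    target = np.sin(3.0 * x)                     # toy stand-in task
    best = np.concatenate([np.ones(n_hidden), np.zeros(n_hidden)])
    best_fit = fitness(best, W1, W2, x, target)
    history = [best_fit]
    for _ in range(generations):
        # Mutate the current best neuron parameters.
        candidates = best + sigma * rng.normal(size=(pop_size, best.size))
        fits = [fitness(c, W1, W2, x, target) for c in candidates]
        i = int(np.argmax(fits))
        if fits[i] > best_fit:                   # keep the parent unless beaten
            best, best_fit = candidates[i], fits[i]
        history.append(best_fit)
    return best, history
```

Because the parent is only replaced by a strictly better child, the recorded fitness never decreases; how far it improves depends on the random weights and the task.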


Switching state space model for simultaneously estimating state transitions and nonstationary firing rates

Neural Information Processing Systems

We propose an algorithm for simultaneously estimating state transitions among neural states, the number of neural states, and nonstationary firing rates using a switching state space model (SSSM). This model enables us to detect state transitions based not only on discontinuous changes in mean firing rates but also on discontinuous changes in the temporal profiles of firing rates, e.g., temporal correlation. We derive a variational Bayes algorithm for a non-Gaussian SSSM whose non-Gaussian property is caused by binary spike events. Synthetic data analysis reveals the high performance of our algorithm in estimating state transitions, the number of neural states, and nonstationary firing rates compared to previous methods. We also analyze neural data recorded from the medial temporal area. The statistically detected neural states probably coincide with transient and sustained states, which have been detected heuristically. Estimated parameters suggest that our algorithm detects state transitions based on discontinuous changes in the temporal correlation of firing rates, a kind of transition that previous methods cannot detect. This result suggests the advantage of our algorithm in real-data analysis.
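To illustrate the kind of data such a model targets, the sketch below simulates binary spikes from a two-state switching process in which the states differ in both mean firing rate and temporal correlation. It covers only a generative toy model, not the variational Bayes estimation described in the abstract; the two-state setup, the autoregressive latent, the logistic link, and all parameter values are assumptions chosen for the example.

```python
import numpy as np


def simulate_switching_spikes(n_steps=500, stay_prob=0.98, seed=0):
    """Simulate hidden states, firing probabilities, and binary spikes.

    Each state has its own mean log-odds of firing and its own
    autoregressive coefficient, so states differ in mean rate and in
    how strongly the rate is correlated over time.
    """
    rng = np.random.default_rng(seed)
    means = np.array([-2.0, 0.0])       # per-state mean log-odds (assumed)
    ar_coefs = np.array([0.5, 0.95])    # per-state temporal correlation (assumed)
    noise_sd = 0.2

    states = np.zeros(n_steps, dtype=int)
    latent = np.zeros(n_steps)
    latent[0] = means[0]
    for t in range(1, n_steps):
        # Markov switching: stay in the current state with high probability.
        stay = rng.random() < stay_prob
        states[t] = states[t - 1] if stay else 1 - states[t - 1]
        k = states[t]
        # State-dependent AR(1) dynamics for the latent log-odds.
        latent[t] = (means[k]
                     + ar_coefs[k] * (latent[t - 1] - means[k])
                     + noise_sd * rng.normal())
    rate = 1.0 / (1.0 + np.exp(-latent))             # firing probability per bin
    spikes = (rng.random(n_steps) < rate).astype(int)  # binary spike events
    return states, rate, spikes
```

An estimation procedure would see only `spikes` and would have to recover `states` and `rate`; the binary observations are what make the model non-Gaussian.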