causal state
From monoliths to modules: Decomposing transducers for efficient world modelling
Boyd, Alexander, Nowak, Franz, Hyland, David, Baltieri, Manuel, Rosas, Fernando E.
World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although realistic world models often have high computational demands, efficient modelling is usually possible by exploiting the fact that real-world scenarios tend to involve subcomponents that interact in a modular manner. In this paper, we explore this idea by developing a framework for decomposing complex world models represented by transducers, a class of models gen-eralising POMDPs. Whereas the composition of transducers is well understood, our results clarify how to invert this process deriving sub-transducers operating on distinct input-output subspaces, enabling parallelizable and interpretable alternatives to monolithic world modelling that can support distributed inference. Overall, these results lay a groundwork for bridging the structural transparency demanded by AI safety and the computational efficiency required for real-world inference.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (6 more...)
Learning Causal States Under Partial Observability and Perturbation
Li, Na, Shan, Hangguan, Ni, Wei, Zhang, Wenjie, Li, Xinyu, Wang, Yamin
A critical challenge for reinforcement learning (RL) is making decisions based on incomplete and noisy observations, especially in perturbed and partially observable Markov decision processes (P$^2$OMDPs). Existing methods fail to mitigate perturbations while addressing partial observability. We propose \textit{Causal State Representation under Asynchronous Diffusion Model (CaDiff)}, a framework that enhances any RL algorithm by uncovering the underlying causal structure of P$^2$OMDPs. This is achieved by incorporating a novel asynchronous diffusion model (ADM) and a new bisimulation metric. ADM enables forward and reverse processes with different numbers of steps, thus interpreting the perturbation of P$^2$OMDP as part of the noise suppressed through diffusion. The bisimulation metric quantifies the similarity between partially observable environments and their causal counterparts. Moreover, we establish the theoretical guarantee of CaDiff by deriving an upper bound for the value function approximation errors between perturbed observations and denoised causal states, reflecting a principled trade-off between approximation errors of reward and transition-model. Experiments on Roboschool tasks show that CaDiff enhances returns by at least 14.18\% compared to baselines. CaDiff is the first framework that approximates causal states using diffusion models with both theoretical rigor and practicality.
Agentic Information Theory: Ergodicity and Intrinsic Semantics of Information Processes
Crutchfield, James P., Jurgens, Alexandra
We develop information theory for the temporal behavior of memoryful agents moving through complex -- structured, stochastic -- environments. We introduce and explore information processes -- stochastic processes produced by cognitive agents in real-time as they interact with and interpret incoming stimuli. We provide basic results on the ergodicity and semantics of the resulting time series of Shannon information measures that monitor an agent's adapting view of uncertainty and structural correlation in its environment.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (8 more...)
Way More Than the Sum of Their Parts: From Statistical to Structural Mixtures
We show that mixtures comprised of multicomponent systems typically are much more structurally complex than the sum of their parts; sometimes, infinitely more complex. We contrast this with the more familiar notion of statistical mixtures, demonstrating how statistical mixtures miss key aspects of emergent hierarchical organization. This leads us to identify a new kind of structural complexity inherent in multicomponent systems and to draw out broad consequences for system ergodicity.
- North America > United States > New York (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Connecticut (0.04)
- (5 more...)
Inferring Kernel $\epsilon$-Machines: Discovering Structure in Complex Systems
Jurgens, Alexandra M., Brodu, Nicolas
Previously, we showed that computational mechanic's causal states -- predictively-equivalent trajectory classes for a stochastic dynamical system -- can be cast into a reproducing kernel Hilbert space. The result is a widely-applicable method that infers causal structure directly from very different kinds of observations and systems. Here, we expand this method to explicitly introduce the causal diffusion components it produces. These encode the kernel causal-state estimates as a set of coordinates in a reduced dimension space. We show how each component extracts predictive features from data and demonstrate their application on four examples: first, a simple pendulum -- an exactly solvable system; second, a molecular-dynamic trajectory of $n$-butane -- a high-dimensional system with a well-studied energy landscape; third, the monthly sunspot sequence -- the longest-running available time series of direct observations; and fourth, multi-year observations of an active crop field -- a set of heterogeneous observations of the same ecosystem taken for over a decade. In this way, we demonstrate that the empirical kernel causal-states algorithm robustly discovers predictive structures for systems with widely varying dimensionality and stochasticity.
- North America > United States > Michigan (0.14)
- North America > United States > California (0.14)
- Europe > United Kingdom > England (0.14)
Reasoning with the Theory of Mind for Pragmatic Semantic Communication
Thomas, Christo Kurisummoottil, Strinati, Emilio Calvanese, Saad, Walid
In this paper, a pragmatic semantic communication framework that enables effective goal-oriented information sharing between two-intelligent agents is proposed. In particular, semantics is defined as the causal state that encapsulates the fundamental causal relationships and dependencies among different features extracted from data. The proposed framework leverages the emerging concept in machine learning (ML) called theory of mind (ToM). It employs a dynamic two-level (wireless and semantic) feedback mechanism to continuously fine-tune neural network components at the transmitter. Thanks to the ToM, the transmitter mimics the actual mental state of the receiver's reasoning neural network operating semantic interpretation. Then, the estimated mental state at the receiver is dynamically updated thanks to the proposed dynamic two-level feedback mechanism. At the lower level, conventional channel quality metrics are used to optimize the channel encoding process based on the wireless communication channel's quality, ensuring an efficient mapping of semantic representations to a finite constellation. Additionally, a semantic feedback level is introduced, providing information on the receiver's perceived semantic effectiveness with minimal overhead. Numerical evaluations demonstrate the framework's ability to achieve efficient communication with a reduced amount of bits while maintaining the same semantics, outperforming conventional systems that do not exploit the ToM-based reasoning.
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
On Principles of Emergent Organization
Rupe, Adam T., Crutchfield, James P.
After more than a century of concerted effort, physics still lacks basic principles of spontaneous self-organization. To appreciate why, we first state the problem, outline historical approaches, and survey the present state of the physics of self-organization. This frames the particular challenges arising from mathematical intractability and the resulting need for computational approaches, as well as those arising from a chronic failure to define structure. Then, an overview of two modern mathematical formulations of organization -- intrinsic computation and evolution operators -- lays out a way to overcome these challenges. Together, the vantage point they afford shows how to account for the emergence of structured states via a statistical mechanics of systems arbitrarily far from equilibrium. The result is a constructive path forward to principles of organization that builds on mathematical identification of structure.
- North America > United States > New York (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Texas (0.14)
- (4 more...)
- Energy > Oil & Gas > Upstream (1.00)
- Government (0.67)
Unsupervised Discovery of Extreme Weather Events Using Universal Representations of Emergent Organization
Rupe, Adam, Kashinath, Karthik, Kumar, Nalini, Crutchfield, James P.
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice. Its building blocks are spacetime lightcones that embody how information propagates across a system through local interactions. We show that predictive equivalence classes of lightcones -- local causal states -- capture organized behaviors and coherent structures in complex spatiotemporal systems. Employing an unsupervised physics-informed machine learning algorithm and a high-performance computing implementation, we demonstrate automatically discovering coherent structures in two real world domain science problems. We show that local causal states identify vortices and track their power-law decay behavior in two-dimensional fluid turbulence. We then show how to detect and track familiar extreme weather events -- hurricanes and atmospheric rivers -- and discover other novel coherent structures associated with precipitation extremes in high-resolution climate data at the grid-cell level.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Kansas > Cowley County (0.05)
- (7 more...)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.93)
Measuring Shared Information and Coordinated Activity in Neuronal Networks
This activity often manifests itself as dynamically coordinated sequences of action potentials. Since multiple electrode recordings are now a standard tool in neuroscience research, it is important to have a measure of such network-wide behavioral coordination and information sharing, applicable to multiple neural spike train data. We propose a new statistic, informational coherence, which measures how much better one unit can be predicted by knowing the dynamical state of another. We argue informational coherence is a measure of association and shared information which is superior to traditional pairwise measures of synchronization and correlation. To find the dynamical states, we use a recently-introduced algorithm which reconstructs effective state spaces from stochastic time series.
Complexity-calibrated Benchmarks for Machine Learning Reveal When Next-Generation Reservoir Computer Predictions Succeed and Mislead
Marzen, Sarah E., Riechers, Paul M., Crutchfield, James P.
Recurrent neural networks are used to forecast time series in finance, climate, language, and from many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability that is at least ~ 60% higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly-generated but complex processes. One conclusion is that large probabilistic state machines -- specifically, large $\epsilon$-machines -- are key to generating challenging and structurally-unbiased stimuli for ground-truthing recurrent neural network architectures.
- Europe > Czechia > Prague (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- North America > United States > California > Los Angeles County > Claremont (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)