From monoliths to modules: Decomposing transducers for efficient world modelling
Boyd, Alexander, Nowak, Franz, Hyland, David, Baltieri, Manuel, Rosas, Fernando E.
World models have recently been proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although realistic world models often have high computational demands, efficient modelling is usually possible by exploiting the fact that real-world scenarios tend to involve subcomponents that interact in a modular manner. In this paper, we explore this idea by developing a framework for decomposing complex world models represented by transducers, a class of models generalising POMDPs. Whereas the composition of transducers is well understood, our results clarify how to invert this process by deriving sub-transducers operating on distinct input-output subspaces, enabling parallelizable and interpretable alternatives to monolithic world modelling that can support distributed inference. Overall, these results lay the groundwork for bridging the structural transparency demanded by AI safety and the computational efficiency required for real-world inference.
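The inverse of transducer composition described in the abstract can be illustrated with a toy example. The sketch below is not the paper's framework; it only shows the basic fact the decomposition relies on: a monolithic transducer built as the parallel product of two modules acting on separate input-output coordinates factors exactly back into those modules, which can then be run independently (and in parallel).

```python
from itertools import product

def run(transducer, state, inputs):
    """Drive a Mealy-style transducer: transducer[(state, symbol)] = (next_state, output)."""
    outputs = []
    for symbol in inputs:
        state, out = transducer[(state, symbol)]
        outputs.append(out)
    return outputs

# Two independent sub-transducers on separate input coordinates:
# a parity tracker and a one-step delay machine.
parity = {(s, a): ((s + a) % 2, (s + a) % 2) for s in (0, 1) for a in (0, 1)}
delay  = {(s, a): (a, s) for s in (0, 1) for a in (0, 1)}

# Their parallel (product) composition: a monolithic transducer on paired symbols.
monolith = {
    ((s1, s2), (a1, a2)): (
        (parity[(s1, a1)][0], delay[(s2, a2)][0]),
        (parity[(s1, a1)][1], delay[(s2, a2)][1]),
    )
    for s1, s2 in product((0, 1), repeat=2)
    for a1, a2 in product((0, 1), repeat=2)
}

xs = [(1, 0), (1, 1), (0, 1), (1, 0)]
joint = run(monolith, (0, 0), xs)
split = list(zip(run(parity, 0, [a for a, _ in xs]),
                 run(delay, 0, [b for _, b in xs])))
print(joint == split)  # the monolith factors exactly into its two modules
```

The interesting (and harder) direction addressed by the paper is recovering such a factorisation when only the monolith is given.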
Optimal Computation from Fluctuation Responses
Lyu, Jinghao, Ray, Kyle J., Crutchfield, James P.
The energy cost of computation has emerged as a central challenge at the intersection of physics and computer science. Recent advances in statistical physics -- particularly in stochastic thermodynamics -- enable precise characterizations of work, heat, and entropy production in information-processing systems driven far from equilibrium by time-dependent control protocols. A key open question is then how to design protocols that minimize thermodynamic cost while ensuring correct outcomes. To this end, we develop a unified framework to identify optimal protocols using fluctuation response relations (FRR) and machine learning. Unlike previous approaches that optimize either distributions or protocols separately, our method unifies both using FRR-derived gradients. Moreover, our method is based primarily on iteratively learning from sampled noisy trajectories, which is generally much easier than solving for the optimal protocol directly from a set of governing equations. We apply the framework to canonical examples -- bit erasure in a double-well potential and translating harmonic traps -- demonstrating how to construct loss functions that trade off energy cost against task error. The framework extends trivially to underdamped systems, and we show this by optimizing a bit-flip in an underdamped system. In all computations we test, the framework achieves the theoretically optimal protocol or achieves work costs comparable to relevant finite-time bounds. In short, the results provide principled strategies for designing thermodynamically efficient protocols in physical information-processing systems. Applications range from quantum gates robust under noise to energy-efficient control of chemical and synthetic biological networks.
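The "sampled noisy trajectories" idea can be made concrete with the simplest of the canonical examples above: a translating harmonic trap. The sketch below is not the paper's FRR/machine-learning method; it is only a plain Euler-Maruyama estimate of the mean work done when dragging an overdamped particle at constant speed, the quantity such protocol optimization would seek to reduce. All parameter values are illustrative.

```python
import random, math

random.seed(0)

# Overdamped particle in a harmonic trap dragged at constant speed v.
# Units: friction gamma = 1, stiffness k = 1, temperature kT = 1.
k, kT, v, tau, dt = 1.0, 1.0, 1.0, 5.0, 0.01
steps = int(tau / dt)

def trajectory_work():
    x = random.gauss(0.0, math.sqrt(kT / k))  # start equilibrated at trap center 0
    w, lam = 0.0, 0.0
    for _ in range(steps):
        # work increment: dW = (dH/dlam) dlam = -k (x - lam) * v dt
        w += -k * (x - lam) * v * dt
        x += -k * (x - lam) * dt + math.sqrt(2 * kT * dt) * random.gauss(0, 1)
        lam += v * dt
    return w

mean_w = sum(trajectory_work() for _ in range(2000)) / 2000
# near the constant-velocity result gamma*v^2*(tau - 1 + e^{-tau}) in these units
print(round(mean_w, 2))
```

An optimized protocol (e.g. with jumps at the endpoints) would achieve a lower mean work for the same displacement and duration, which is exactly the trade-off the paper's loss functions encode.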
Agentic Information Theory: Ergodicity and Intrinsic Semantics of Information Processes
Crutchfield, James P., Jurgens, Alexandra
We develop information theory for the temporal behavior of memoryful agents moving through complex -- structured, stochastic -- environments. We introduce and explore information processes -- stochastic processes produced by cognitive agents in real-time as they interact with and interpret incoming stimuli. We provide basic results on the ergodicity and semantics of the resulting time series of Shannon information measures that monitor an agent's adapting view of uncertainty and structural correlation in its environment.
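The "time series of Shannon information measures" idea can be shown in miniature. The sketch below is an illustrative assumption, not the paper's construction: an agent observing an i.i.d. biased coin logs its per-symbol surprisal, and ergodicity shows up as the time average of that information process converging to the source's entropy rate.

```python
import random, math

random.seed(1)

p = 0.8  # source: i.i.d. biased coin, an ergodic information process in miniature
entropy_rate = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The agent's surprisal time series: -log2 P(observed symbol).
surprisals = []
for _ in range(50_000):
    x = 1 if random.random() < p else 0
    surprisals.append(-math.log2(p if x == 1 else 1 - p))

time_average = sum(surprisals) / len(surprisals)
print(abs(time_average - entropy_rate) < 0.02)  # time average ~ entropy rate
```

For memoryful agents in structured environments, the interest lies in how this series fluctuates and correlates over time, not just in its mean.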
Way More Than the Sum of Their Parts: From Statistical to Structural Mixtures
We show that mixtures composed of multicomponent systems typically are much more structurally complex than the sum of their parts; sometimes, infinitely more complex. We contrast this with the more familiar notion of statistical mixtures, demonstrating how statistical mixtures miss key aspects of emergent hierarchical organization. This leads us to identify a new kind of structural complexity inherent in multicomponent systems and to draw out broad consequences for system ergodicity.
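The statistical-versus-structural distinction can be computed exactly in a two-coin example. This sketch is not from the paper: pooling the single-symbol statistics of two biased coins looks like a fair coin, while the structural mixture (pick one coin once, then use it forever) has a lower entropy rate plus one persistent extra bit of block entropy, the component identity, which never washes out and signals non-ergodicity.

```python
import math

def h2(p):  # binary Shannon entropy in bits
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

p, q = 0.9, 0.1

# Statistical mixture: pool the single-symbol marginals -> looks like a fair coin.
marginal_entropy = h2(0.5 * p + 0.5 * q)  # = 1 bit

# Structural mixture: flip a fair coin once, then use that component forever.
# Its entropy rate is the average of the component rates...
entropy_rate = 0.5 * h2(p) + 0.5 * h2(q)

# ...while block entropy carries a persistent extra bit: the component identity.
def block_entropy(L):
    H = 0.0
    for n in range(2 ** L):
        ones = bin(n).count("1")
        prob = 0.5 * p**ones * (1 - p)**(L - ones) + 0.5 * q**ones * (1 - q)**(L - ones)
        H -= prob * math.log2(prob)
    return H

excess = block_entropy(10) - 10 * entropy_rate
print(round(marginal_entropy, 3), round(entropy_rate, 3), round(excess, 3))
```

The excess approaches one full bit as the block length grows: structure the statistical mixture is blind to.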
Export Reviews, Discussions, Author Feedback and Meta-Reviews
A nice advantage of predictive representations of stochastic processes is that they can be expressed in terms of families of linear operators --- the "observable operators" of Jaeger (oddly, not cited in this paper; also, see Upper, and the appendix to Shalizi and Crutchfield). This paper proposes (following some earlier work) to exploit this fact, by using the instrumental variables technique from econometrics to simplify the estimation of such models. Doing so results in an estimation procedure very similar to that of Langford et al. from 2009 (reference [16] in the paper), but with some advantages in terms of avoiding iterative re-estimation. However, there seems to be an important issue which isn't (that I saw) addressed here. The instrumental variable needs to be correlated with the input variable to the regression, but independent of the noise in the regression.
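The "observable operators" the review mentions are easy to state concretely. In Jaeger's formulation, each symbol a gets a matrix T_a with T_a[i][j] = P(next state j, emit a | state i), and a word's probability is obtained by applying the operators to the initial state distribution and summing. The sketch below uses a made-up 2-state machine to check that these operator products define a consistent distribution.

```python
from itertools import product

# A 2-state machine: T[a][i][j] = P(next state j, emit symbol a | state i).
T = {
    '0': [[0.4, 0.1], [0.2, 0.0]],
    '1': [[0.3, 0.2], [0.3, 0.5]],
}
pi = [0.5, 0.5]  # initial state distribution

def word_prob(word):
    # P(a1...an) = 1^T . T_{an} ... T_{a1} . pi
    vec = pi[:]
    for a in word:
        vec = [sum(T[a][i][j] * vec[i] for i in range(2)) for j in range(2)]
    return sum(vec)

total = sum(word_prob(''.join(w)) for w in product('01', repeat=3))
print(round(total, 10))  # probabilities of all length-3 words sum to 1
```

The estimation problem the paper tackles is recovering such operators from data, which is where the instrumental-variable correlation/independence requirement the reviewer flags becomes critical.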
Inferring Kernel $\epsilon$-Machines: Discovering Structure in Complex Systems
Jurgens, Alexandra M., Brodu, Nicolas
Previously, we showed that computational mechanics' causal states -- predictively-equivalent trajectory classes for a stochastic dynamical system -- can be cast into a reproducing kernel Hilbert space. The result is a widely-applicable method that infers causal structure directly from very different kinds of observations and systems. Here, we expand this method to explicitly introduce the causal diffusion components it produces. These encode the kernel causal-state estimates as a set of coordinates in a reduced dimension space. We show how each component extracts predictive features from data and demonstrate their application on four examples: first, a simple pendulum -- an exactly solvable system; second, a molecular-dynamic trajectory of $n$-butane -- a high-dimensional system with a well-studied energy landscape; third, the monthly sunspot sequence -- the longest-running available time series of direct observations; and fourth, multi-year observations of an active crop field -- a set of heterogeneous observations of the same ecosystem taken for over a decade. In this way, we demonstrate that the empirical kernel causal-states algorithm robustly discovers predictive structures for systems with widely varying dimensionality and stochasticity.
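The core move, comparing histories through a kernel rather than through exact symbolic matching, can be sketched on a trivial case. This is not the paper's algorithm, only an illustration: for a period-2 signal, Gaussian-kernel similarity between past windows groups the histories into exactly two classes, one per phase, which are the causal states here.

```python
import math

# Period-2 observations; causal states should reduce to the two phases.
seq = [i % 2 for i in range(40)]
L = 3  # past-window length

windows = [tuple(seq[i - L:i]) for i in range(L, len(seq))]

def k(u, v):  # Gaussian (RBF) kernel between past windows
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)))

# Greedy grouping: a window joins the first cluster whose exemplar it matches closely.
exemplars = []
for w in windows:
    if not any(k(w, e) > 0.5 for e in exemplars):
        exemplars.append(w)

print(len(exemplars))  # 2 clusters, one per phase of the period-2 process
```

The kernel view becomes essential once observations are continuous or high-dimensional, where exact history matching is impossible.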
Complexity-calibrated Benchmarks for Machine Learning Reveal When Next-Generation Reservoir Computer Predictions Succeed and Mislead
Marzen, Sarah E., Riechers, Paul M., Crutchfield, James P.
Recurrent neural networks are used to forecast time series in finance, climate, language, and many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability that is at least ~ 60% higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly-generated but complex processes. One conclusion is that large probabilistic state machines -- specifically, large $\epsilon$-machines -- are key to generating challenging and structurally-unbiased stimuli for ground-truthing recurrent neural network architectures.
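The Fano-style lower bound invoked above is a standard information-theoretic tool and is easy to compute. The sketch below is generic, not the paper's specific bound: given the residual uncertainty H(X|Y) about the next symbol after conditioning on a finite memory trace (the 0.9-bit value is purely illustrative), Fano's inequality gives a floor on the error probability of any predictor.

```python
import math

# Fano's inequality: for predicting X (M outcomes) from side information Y,
#   H(X|Y) <= h2(Pe) + Pe * log2(M - 1),
# which lower-bounds the error probability Pe of ANY predictor.
def h2(p):
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def fano_lower_bound(cond_entropy, M):
    # smallest Pe in [0, 1 - 1/M] satisfying the inequality (bisection)
    lo, hi = 0.0, 1.0 - 1.0 / M
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h2(mid) + mid * math.log2(M - 1) >= cond_entropy:
            hi = mid
        else:
            lo = mid
    return hi

# E.g. a binary next symbol with 0.9 bits of residual uncertainty given the
# finite memory trace: error probability cannot fall below roughly 0.32.
pe = fano_lower_bound(0.9, 2)
print(round(pe, 3))
```

Comparing such a floor against the minimal attainable error for the true process is what quantifies the "at least ~ 60% higher" gap the abstract reports.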
Quantum adaptive agents with efficient long-term memories
Elliott, Thomas J., Gu, Mile, Garner, Andrew J. P., Thompson, Jayne
Central to the success of adaptive systems is their ability to interpret signals from their environment and respond accordingly -- they act as agents interacting with their surroundings. Such agents typically perform better when able to execute increasingly complex strategies. This comes with a cost: the more information the agent must recall from its past experiences, the more memory it will need. Here we investigate the power of agents capable of quantum information processing. We uncover the most general form a quantum agent need adopt to maximise memory compression advantages, and provide a systematic means of encoding their memory states. We show these encodings can exhibit extremely favourable scaling advantages relative to memory-minimal classical agents when information must be retained about events increasingly far into the past.
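The memory-compression advantage of quantum agents rests on a standard fact the sketch below illustrates (this is a textbook example, not the paper's general construction): encoding two equiprobable memory states into non-orthogonal qubit states with overlap f yields a density matrix with eigenvalues (1 ± f)/2, so its von Neumann entropy falls below the one classical bit the same memory would require.

```python
import math

def h2(p):  # binary Shannon entropy in bits
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Two memory states mapped to |0> and cos(t)|0> + sin(t)|1>, equiprobable.
# The mixture's eigenvalues are (1 +/- f)/2 with f = |<psi0|psi1>| = cos(t).
classical_memory = 1.0             # Shannon entropy of two equiprobable states (bits)
f = math.cos(math.pi / 6)          # overlap ~0.866 (illustrative choice)
quantum_memory = h2((1 + f) / 2)   # von Neumann entropy of the qubit encoding

print(quantum_memory < classical_memory)  # overlap buys memory compression
```

The catch, and the subject of the paper, is choosing such encodings systematically so the overlapping states still support the agent's future input-output behavior.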
Visualizing computation in large-scale cellular automata
Cisneros, Hugo, Sivic, Josef, Mikolov, Tomas
Emergent processes in complex systems such as cellular automata can perform computations of increasing complexity, and could possibly lead to artificial evolution. Such a feat would require scaling up current simulation sizes to allow for enough computational capacity. Understanding complex computations happening in cellular automata and other systems capable of emergence poses many challenges, especially in large-scale systems. We propose methods for coarse-graining cellular automata based on frequency analysis of cell states, clustering and autoencoders. These innovative techniques facilitate the discovery of large-scale structure formation and complexity analysis in those systems. They emphasize interesting behaviors in elementary cellular automata while filtering out background patterns. Moreover, our methods reduce large 2D automata to smaller sizes and enable identifying systems that behave interestingly at multiple scales.
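A minimal version of the coarse-graining step can be sketched directly. This is a crude stand-in for the frequency-analysis/clustering/autoencoder methods in the paper: simulate an elementary cellular automaton, then replace each block of the space-time diagram with its most frequent cell state, shrinking the pattern while keeping large-scale structure visible.

```python
# Simulate an elementary CA, then coarse-grain space-time blocks by their
# most frequent cell state.
def step(row, rule=110):
    n = len(row)
    return [(rule >> (row[(i - 1) % n] << 2 | row[i] << 1 | row[(i + 1) % n])) & 1
            for i in range(n)]

width, steps, block = 64, 64, 4
row = [0] * width
row[width // 2] = 1
grid = [row]
for _ in range(steps - 1):
    row = step(row)
    grid.append(row)

# Most frequent state in each block x block tile (majority vote for binary states).
coarse = [
    [int(sum(grid[y + dy][x + dx] for dy in range(block) for dx in range(block))
         > block * block // 2)
     for x in range(0, width, block)]
    for y in range(0, steps, block)
]
print(len(coarse), len(coarse[0]))  # 16 x 16 summary of a 64 x 64 space-time diagram
```

For multi-state automata with busy backgrounds, majority vote is replaced by learned or frequency-based mappings, which is where the paper's contribution lies.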
Discovering Causal Structure with Reproducing-Kernel Hilbert Space $\epsilon$-Machines
Brodu, Nicolas, Crutchfield, James P.
We merge computational mechanics' definition of causal states (predictively-equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely-applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation -- a finite- or infinite-state kernel $\epsilon$-machine -- is extracted by a reduced-dimension transform that gives an efficient representation of causal states and their topology. In this way, the system dynamics are represented by a stochastic (ordinary or partial) differential equation that acts on causal states. We introduce an algorithm to estimate the associated evolution operator. Paralleling the Fokker-Planck equation, it efficiently evolves causal-state distributions and makes predictions in the original data space via an RKHS functional mapping. We demonstrate these techniques, together with their predictive abilities, on discrete-time, discrete-value infinite Markov-order processes generated by finite-state hidden Markov models with (i) finite or (ii) uncountably-infinite causal states and (iii) a continuous-time, continuous-value process generated by a thermally-driven chaotic flow. The method robustly estimates causal structure in the presence of varying external and measurement noise levels.
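The discrete idea underneath the RKHS machinery, grouping histories by their predictive (next-symbol) distributions, can be demonstrated on the golden-mean process, a standard computational-mechanics example. This sketch is not the paper's kernel algorithm; it only recovers the two causal states from empirical statistics of a simulated sequence.

```python
import random
from collections import defaultdict

random.seed(2)

# Golden-mean process: after a 1 the next symbol is always 0; after a 0,
# emit 1 with probability 0.5. Histories ending in 1 vs. ending in 0 are
# the two causal states, recoverable from next-symbol distributions.
seq = [0]
for _ in range(20_000):
    seq.append(0 if seq[-1] == 1 else (1 if random.random() < 0.5 else 0))

nexts = defaultdict(list)
for prev, cur in zip(seq, seq[1:]):
    nexts[prev].append(cur)

p_one = {h: sum(v) / len(v) for h, v in nexts.items()}

# Merge histories whose predictive distributions agree within tolerance.
states = []
for h, p in sorted(p_one.items()):
    if not any(abs(p - q) < 0.05 for _, q in states):
        states.append((h, p))
print(len(states))  # two causal states: "just saw 1" and "just saw 0"
```

The RKHS version replaces the exact next-symbol histograms with kernel-embedded predictive distributions, which is what lets the same grouping work for continuous observations.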