 ecology


Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models

Riva, Giulio Valentino Dalla

arXiv.org Machine Learning

We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of latent world states, the Bayes-optimal next-token cross-entropy decomposes into the irreducible conditional entropy plus a Jensen--Shannon excess term. That excess vanishes if and only if the encoding preserves the training ecology's equivalence classes. This yields a precise notion of ecological veridicality for language models and identifies the minimum-complexity zero-excess solution as the quotient partition by training equivalence. We then determine when this fixed-encoding analysis applies to transformer families: frozen dense and frozen Mixture-of-Experts transformers satisfy it, in-context learning does not enlarge the model's separation set, and per-task adaptation breaks the premise. The framework predicts two characteristic failure modes: simplicity pressure preferentially removes low-gain distinctions, and training-optimal models can still incur positive excess on deployment ecologies that refine the training ecology. A conditional dynamic extension shows how inter-model selection and post-training can recover such gap distinctions under explicit heredity, variation, and selection assumptions. Exact finite-ecology checks and controlled microgpt experiments validate the static decomposition, split-merge threshold, off-ecology failure pattern, and two-ecology rescue mechanism in a regime where the relevant quantities are directly observable. The goal is not to model frontier systems at scale, but to use small language models as laboratory organisms for theory about representational selection.
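The decomposition described in this abstract can be checked numerically on a toy ecology. The sketch below is not the authors' code. It assumes a finite set of world states with a prior, one next-token distribution per state, and an encoding given as a list of class labels; the names `decompose`, `prior`, `token_dists`, and `encoding` are hypothetical. It assumes the excess takes the form of a prior-weighted generalized Jensen-Shannon divergence within each encoded class, which is one standard reading of the statement.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in nats, skipping zero-probability tokens."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def decompose(prior, token_dists, encoding):
    """Return (cross_entropy, conditional_entropy, js_excess) in nats.

    prior[w]       : probability of world state w (assumed > 0)
    token_dists[w] : next-token distribution in state w
    encoding[w]    : class label the encoding assigns to state w
    """
    groups = defaultdict(list)
    for w, label in enumerate(encoding):
        groups[label].append(w)

    # Irreducible term: entropy of the next token given the true state.
    cond_entropy = sum(prior[w] * entropy(token_dists[w]) for w in range(len(prior)))

    cross_entropy = 0.0
    js_excess = 0.0
    for members in groups.values():
        mass = sum(prior[w] for w in members)
        vocab = len(token_dists[members[0]])
        # Bayes-optimal prediction given only the class: the posterior mixture.
        mixture = [sum(prior[w] * token_dists[w][t] for w in members) / mass
                   for t in range(vocab)]
        for w in members:
            cross_entropy += prior[w] * -sum(
                q * math.log(m) for q, m in zip(token_dists[w], mixture) if q > 0)
        # Generalized Jensen-Shannon divergence of the class, weighted by its mass.
        js_excess += mass * entropy(mixture) - sum(
            prior[w] * entropy(token_dists[w]) for w in members)
    return cross_entropy, cond_entropy, js_excess

# Toy ecology: states 0 and 1 are equivalent (same next-token distribution).
prior = [0.25, 0.25, 0.5]
token_dists = [[0.9, 0.1], [0.9, 0.1], [0.2, 0.8]]

quotient = decompose(prior, token_dists, [0, 0, 1])   # merges only equivalent states
collapsed = decompose(prior, token_dists, [0, 0, 0])  # merges everything
```

In this toy case the quotient encoding merges only the two equivalent states and should show no excess, while the fully collapsed encoding merges states with different next-token behaviour and should show a positive one.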


Flexible metadata harvesting for ecology using large language models

Lu, Zehao, van der Plas, Thijs L, Rashidi, Parinaz, Kissling, W Daniel, Athanasiadis, Ioannis N

arXiv.org Artificial Intelligence

Large, open datasets can accelerate ecological research, particularly by enabling researchers to develop new insights by reusing datasets from multiple sources. However, to find the most suitable datasets to combine and integrate, researchers must navigate diverse ecological and environmental data provider platforms with varying metadata availability and standards. To overcome this obstacle, we have developed a large language model (LLM)-based metadata harvester that flexibly extracts metadata from any dataset's landing page, and converts these to a user-defined, unified format using existing metadata standards. We validate that our tool is able to extract both structured and unstructured metadata with equal accuracy, aided by our LLM post-processing protocol. Furthermore, we utilise LLMs to identify links between datasets, both by calculating embedding similarity and by unifying the formats of extracted metadata to enable rule-based processing. Our tool, which flexibly links the metadata of different datasets, can therefore be used for ontology creation or graph-based queries, for example, to find relevant ecological and environmental datasets in a virtual research environment.


Simulacra Naturae: Generative Ecosystem driven by Agent-Based Simulations and Brain Organoid Collective Intelligence

Manoudaki, Nefeli, Toka, Mert, Paterakis, Iason, Flatley, Diarmid

arXiv.org Artificial Intelligence

Simulacra Naturae is a data-driven media installation that explores collective care through the entanglement of biological computation, material ecologies, and generative systems. The work translates pre-recorded neural activity from brain organoids, lab-grown three-dimensional clusters of neurons, into a multi-sensory environment composed of generative visuals, spatial audio, living plants, and fabricated clay artifacts. These biosignals, streamed through a real-time system, modulate emergent agent behaviors inspired by natural systems such as termite colonies and slime molds. Rather than using biosignals as direct control inputs, Simulacra Naturae treats organoid activity as a co-creative force, allowing neural rhythms to guide the growth, form, and atmosphere of a generative ecosystem. The installation features computationally fabricated clay prints embedded with solenoids, adding physical sound resonances to the generative surround composition. The spatial environment, filled with live tropical plants and a floor-level projection layer featuring real-time generative AI visuals, invites participants into a sensory field shaped by nonhuman cognition. By grounding abstract data in living materials and embodied experience, Simulacra Naturae reimagines visualization as a practice of care, one that decentralizes human agency and opens new spaces for ethics, empathy, and ecological attunement within hybrid computational systems.


The Robustness of Structural Features in Species Interaction Networks

Fard, Sanaz Hasanzadeh, Dolson, Emily

arXiv.org Artificial Intelligence

Species interaction networks are a powerful tool for describing ecological communities; they typically contain nodes representing species, and edges representing interactions between those species. For the purposes of drawing abstract inferences about groups of similar networks, ecologists often use graph topology metrics to summarize structural features. However, gathering the data that underlie these networks is challenging, which can lead to some interactions being missed. Thus, it is important to understand how much different structural metrics are affected by missing data. To address this question, we analyzed a database of 148 real-world bipartite networks representing four different types of species interactions (pollination, host-parasite, plant-ant, and seed-dispersal). For each network, we measured six different topological properties: number of connected components, variance in node betweenness, variance in node PageRank, largest eigenvalue, the number of non-zero eigenvalues, and community detection as determined by four different algorithms. We then tested how these properties change as additional edges -- representing data that may have been missed -- are added to the networks. We found substantial variation in how robust different properties were to the missing data. For example, the Clauset-Newman-Moore and Louvain community detection algorithms showed much more gradual change as edges were added than the label propagation and Girvan-Newman algorithms did, suggesting that the former are more robust. Robustness also varied for some metrics based on interaction type. These results provide a foundation for selecting network properties to use when analyzing messy ecological network data.
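The edge-addition procedure described in this abstract can be illustrated on a toy bipartite network. The sketch below is an assumption-laden illustration, not the authors' pipeline. It tracks only one of the six properties (number of connected components), uses a hypothetical three-plant, three-pollinator network, and adds the missing edges in a seeded random order. The function names and the node numbering are invented for the example.

```python
import random

def count_components(n_nodes, edges):
    """Count connected components with a simple union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(x) for x in range(n_nodes)})

def components_under_added_edges(n_top, n_bottom, edges, n_extra, seed=0):
    """Add up to n_extra absent bipartite edges one at a time and record
    the component count after each addition (index 0 is the original network).

    Top nodes are numbered 0..n_top-1, bottom nodes n_top..n_top+n_bottom-1.
    """
    rng = random.Random(seed)
    present = set(edges)
    candidates = [(t, n_top + b)
                  for t in range(n_top)
                  for b in range(n_bottom)
                  if (t, n_top + b) not in present]
    rng.shuffle(candidates)

    current = list(edges)
    trajectory = [count_components(n_top + n_bottom, current)]
    for edge in candidates[:n_extra]:
        current.append(edge)
        trajectory.append(count_components(n_top + n_bottom, current))
    return trajectory

# Hypothetical toy network: plants 0..2, pollinators 3..5, one partner each.
toy_edges = [(0, 3), (1, 4), (2, 5)]
trajectory = components_under_added_edges(3, 3, toy_edges, n_extra=6)
```

A property whose trajectory changes gradually as edges are added would count as robust in the sense used above; one that jumps after one or two additions would not.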


"Benefit Game: Alien Seaweed Swarms" -- Real-time Gamification of Digital Seaweed Ecology

Fei, Dan-Lu, Wu, Zi-Wei, Zhang, Kang

arXiv.org Artificial Intelligence

"Benefit Game: Alien Seaweed Swarms" combines artificial life art and an interactive game with an installation to explore the impact of human activity on fragile seaweed ecosystems. The project aims to promote ecological consciousness by creating a balance in digital seaweed ecologies. Inspired by the real species "Laminaria saccharina", the author employs Procedural Content Generation via Machine Learning technology to generate variations of virtual seaweeds and symbiotic fungi. The audience can explore the consequences of human activities through gameplay and observe the ecosystem's feedback on the benefits and risks of seaweed aquaculture. This Benefit Game offers dynamic and real-time responsive artificial seaweed ecosystems for an interactive experience that enhances ecological consciousness.


A survey to measure cognitive biases influencing mobility choices

Adam, Carole

arXiv.org Artificial Intelligence

Mobility is a central issue in the transition to a more sustainable lifestyle. The average daily distance traveled by the French population has increased considerably, from 5 km on average in the 1950s to 45 km on average in 2011 [58], as has the number of personal cars (11.860 million cars in 1970 [7] compared to 38.3 million in 2021 [15, 28]). For example, in Toulouse, cars account for 74% of the distances traveled by the inhabitants and contribute up to 88% of GHG emissions [25]. The evolution of mobility is therefore an essential question, both for the global climate crisis and for public health: the negative impact of a sedentary lifestyle [9], road accidents, and air and sound pollution [44]. Indeed, 40,000 deaths per year are attributable to exposure to fine particles (PM2.5) and 7,000 deaths per year to exposure to nitrogen dioxide (NO2), i.e. 7% and 1% of the total annual mortality [38]; the 2-month lockdown of spring 2020 in France avoided 2,300 deaths by reducing exposure to particles, and 1,200 more by reducing exposure to nitrogen dioxide [38].


The Problem With Dune: Part Two

Slate

I have questions about Denis Villeneuve's Dune: Part Two. If the Fremen have lasers, why don't they just shoot the sand harvesters and run away? Why don't they use their sandworms until the last battle? Wouldn't it make more sense to fight the other great houses on Arrakis itself, where they have sandworms, rather than board ships off-world to go off to war? If Paul (Timothée Chalamet) has to invade the galaxy at the end, why bother marrying the daughter of the emperor he just deposed?


Designing Multispecies Worlds for Robots, Cats, and Humans

Schneiders, Eike, Benford, Steve, Chamberlain, Alan, Mancini, Clara, Castle-Green, Simon, Ngo, Victor, Farr, Ju Row, Adams, Matt, Tandavanitj, Nick, Fischer, Joel

arXiv.org Artificial Intelligence

We reflect on the design of a multispecies world centred around a bespoke enclosure in which three cats and a robot arm coexist for six hours a day during a twelve-day installation as part of an artist-led project. In this paper, we present the project's design process, encompassing various interconnected components, including the cats, the robot and its autonomous systems, the custom end-effectors and robot attachments, the diverse roles of the humans-in-the-loop, and the custom-designed enclosure. Subsequently, we provide a detailed account of key moments during the deployment and discuss the design implications for future multispecies systems. Specifically, we argue that designing the technology and its interactions is not sufficient, but that it is equally important to consider the design of the 'world' in which the technology operates. Finally, we highlight the necessity of human involvement in areas such as breakdown recovery, animal welfare, and their role as audience.


Astrobiologists train an AI to find life on Mars

#artificialintelligence

Artificial intelligence (AI) and machine learning could revolutionize the search for life on other planets. But before these tools can tackle distant locales such as Mars, they need to be tested here on Earth. A team of researchers has successfully trained an AI to map biosignatures -- any feature which provides evidence of past or present life -- in a three-square-kilometre area of Chile's Atacama Desert. The AI substantially reduced the area the team needed to search and boosted the likelihood of finding living organisms in one of the driest places on the planet. The results were reported on 6 March in Nature Astronomy [1].


Simulating the impact of cognitive biases on the mobility transition

Adam, Carole

arXiv.org Artificial Intelligence

In recent decades, the average daily distance traveled by the French population has increased considerably (from 5 km on average in the 1950s to 45 km on average in 2011 [33]), as has the number of personal cars (11.860 million cars in 1970 [5] compared to 38.3 million in 2021 [9, 19]). For example, in Toulouse, cars account for 74% of the distances traveled by the inhabitants and contribute up to 88% of GHG emissions [30]. The evolution of mobility is therefore an essential question, in the context of the climate crisis but also in terms of public health: the negative impact of a sedentary lifestyle [6], road accidents, air pollution and sound pollution [28]. Indeed, 40,000 deaths per year are attributable to exposure to fine particles (PM2.5) and 7,000 deaths per year to exposure to nitrogen dioxide (NO2), i.e. 7% and 1% of the total annual mortality [16]; this report also concludes that the 2-month lockdown of spring 2020 in France made it possible to avoid 2,300 deaths by reducing exposure to particles, and 1,200 more deaths by reducing exposure to nitrogen dioxide. This shows that public policies and individual behaviour changes (modal shift towards cycling, more extensive teleworking) can have an impact on public health.