evolvability
Large Language Models as symbolic DNA of cultural dynamics
Pourdavood, Parham, Jacob, Michael, Deacon, Terrence
Although the recent wave of AI models known as Large Language Models (LLMs) is seamlessly surpassing the Turing Test, this milestone has been overshadowed by their rapid commercialization and the profound ways they are already reshaping society. The pursuit of Artificial General Intelligence (AGI), commonly defined as human-level intelligence, is touted as the next major milestone. Yet whether continued progress within the current framework could ever lead to agency and meaning at the scale of AI itself remains an open and contested question. Critics argue that current LLMs operate through algorithmic mimicry, that is, simulating intelligent behavior without embodying the principles behind it (Jaeger, 2024; Jaeger et al., 2024). Artificial Neural Networks, the main framework behind LLMs, operate on behaviorist assumptions: a framework that focuses exclusively on observable input-output patterns while treating internal states as part of a "black box" to be optimized (Brooks, 1991; Sutton & Barto, 2015). This does not mean that LLMs lack sophisticated engineering, but their structure is designed to optimize internal states based on input-output feedback loops. Even though the logic behind behaviorism is likely one of the key principles supporting an intelligent system, it is likely not sufficient for intelligence and is not what enables agency and intelligence in the first place (Dreyfus, 1992; Searle, 1980). Furthermore, it would be naive to treat outwardly intelligent behavior as evidence of acquired intelligence or sentience, since a good simulation can be powerful and convincing. To address such issues, alternative approaches grounded in organismal intelligence are emerging that instead explain the principles behind intelligence through intrinsic and goal-directed models of the body and its relationship to the environment (Deacon, 2012; Jacob, 2023; Jaeger et al., 2024; Levin, 2019; Roli et al., 2022; Varela et al., 1993; Watson, 2024).
Review for NeurIPS paper: Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability
In this work, a vastly simpler proper halfspace learning algorithm is proposed, as well as algorithms for learning generalized linear models under Massart noise. These algorithms can only guarantee accuracy up to the (maximum) noise rate. This work also presents a connection between learning under Massart noise and (correlational) statistical query algorithms/evolvability, and shows that no distribution-independent algorithm that makes only a polynomial number of queries can match the error rate of the optimal halfspace (which may be lower than the noise rate). Finally, some experiments are presented, demonstrating that different (Massart-style) noise rates on different populations can lead standard classifiers to produce different error rates across populations, but that the algorithm presented here (along with the uninterpretable random forest) is resilient to this effect.
Review for NeurIPS paper: Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability
This paper makes a solid contribution to the literature on efficiently learning halfspaces with noise, extending a paper from last year on learning with Massart noise by proposing a simpler and proper learning algorithm. It also generalizes beyond Massart noise to some extent, and establishes a lower bound for SQ learning. The reviewers are unanimous in their praise of the paper.
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability
In this paper, we revisit the problem of distribution-independently learning halfspaces under Massart noise with rate $\eta$. Recent work resolved a long-standing problem in this model of efficiently learning to error $\eta + \epsilon$ for any $\epsilon > 0$, by giving an improper learner that partitions space into $\text{poly}(d, 1/\epsilon)$ regions. Here we give a much simpler algorithm and settle a number of outstanding open questions: (1) We give the first proper learner for Massart halfspaces that achieves $\eta + \epsilon$. We then zoom out to study generalized linear models and give an efficient algorithm for learning under a challenging new corruption model generalizing Massart noise. Finally we study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties as a byproduct of its strong provable robustness guarantees.
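To make the noise model concrete, the following is a minimal sketch of how labels are corrupted under Massart noise: each point's label is flipped independently with a point-dependent probability that never exceeds the rate eta (with eta below 1/2). It is not the paper's learning algorithm. The names `halfspace_label`, `massart_sample`, `empirical_error`, and the particular flip rule `flip_prob` are illustrative assumptions.

```python
import random


def halfspace_label(w, x):
    """Clean label sign(<w, x>) in {-1, +1}; ties are mapped to +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def massart_sample(w, x, flip_prob, eta, rng):
    """Label for x under Massart noise with rate eta.

    flip_prob is a hypothetical function x -> eta(x). The only constraint
    the model imposes is 0 <= eta(x) <= eta < 1/2.
    """
    p = flip_prob(x)
    if not 0.0 <= p <= eta < 0.5:
        raise ValueError("need 0 <= eta(x) <= eta < 1/2")
    y = halfspace_label(w, x)
    return -y if rng.random() < p else y


def empirical_error(w_hat, data):
    """Fraction of (x, y) pairs that the halfspace w_hat misclassifies."""
    return sum(halfspace_label(w_hat, x) != y for x, y in data) / len(data)


def flip_prob(x):
    """Assumed noise pattern: full rate on one side of an axis, none on the other."""
    return 0.2 if x[0] > 0 else 0.0


rng = random.Random(0)
w_true = [1.0, -2.0]
eta = 0.2
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(2000)]
data = [(x, massart_sample(w_true, x, flip_prob, eta, rng)) for x in points]

# Even the true halfspace has nonzero error on the noisy labels, but its
# error cannot exceed eta in expectation.
err = empirical_error(w_true, data)
print(f"error of the true halfspace on noisy labels: {err:.3f}")
```

Because the flip probability may vary from point to point, the true halfspace's error can sit well below eta, which is why a guarantee of error eta plus epsilon is weaker than matching the optimal halfspace.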
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models
Chen, Yuheng, Cao, Pengfei, Chen, Yubo, Wang, Yining, Liu, Shengping, Liu, Kang, Zhao, Jun
Large language models (LLMs) store extensive factual knowledge, but the underlying mechanisms remain unclear. Previous research suggests that factual knowledge is stored within multi-layer perceptron weights, and some storage units exhibit degeneracy, referred to as Degenerate Knowledge Neurons (DKNs). Despite the novelty and unique properties of this concept, it has not been rigorously defined or systematically studied. We first consider the connection weight patterns of MLP neurons and define DKNs from both structural and functional aspects. Based on this, we introduce the Neurological Topology Clustering method, which allows DKNs to form in any number and structure, leading to more accurate DKN acquisition. Furthermore, inspired by cognitive science, we explore the relationship between DKNs and the robustness, evolvability, and complexity of LLMs. Our execution of 34 experiments under 6 settings demonstrates the connection between DKNs and these three properties. The code will be available soon.
Biomaker CA: a Biome Maker project using Cellular Automata
Randazzo, Ettore, Mordvintsev, Alexander
We introduce Biomaker CA: a Biome Maker project using Cellular Automata (CA). In Biomaker CA, morphogenesis is a first class citizen and small seeds need to grow into plant-like organisms to survive in a nutrient starved environment and eventually reproduce with variation so that a biome survives for long timelines. We simulate complex biomes by means of CA rules in 2D grids and parallelize all of its computation on GPUs through the Python JAX framework. We show how this project allows for several different kinds of environments and laws of 'physics', alongside different model architectures and mutation strategies. We further analyze some configurations to show how plant agents can grow, survive, reproduce, and evolve, forming stable and unstable biomes. We then demonstrate how one can meta-evolve models to survive in a harsh environment either through end-to-end meta-evolution or by a more surgical and efficient approach, called Petri dish meta-evolution. Finally, we show how to perform interactive evolution, where the user decides how to evolve a plant model interactively and then deploys it in a larger environment. We open source Biomaker CA at: https://tinyurl.com/2x8yu34s .
Evolving Complexity is Hard
Wright, Alden H., Laue, Cheyenne L.
Understanding the evolution of complexity is an important topic in a wide variety of academic fields. Implications of better understanding complexity include increased knowledge of major evolutionary transitions and the properties of living and technological systems. Genotype-phenotype (G-P) maps are fundamental to evolution, and biologically-oriented G-P maps have been shown to have interesting and often-universal properties that enable evolution by following phenotype-preserving walks in genotype space. Here we use a digital logic gate circuit G-P map where genotypes are represented by circuits and phenotypes by the functions that the circuits compute. We compare two mathematical definitions of circuit and phenotype complexity and show how these definitions relate to other well-known properties of evolution such as redundancy, robustness, and evolvability. Using both Cartesian and Linear genetic programming implementations, we demonstrate that the logic gate circuit shares many universal properties of biologically derived G-P maps, with the exception of the relationship between one method of computing phenotypic evolvability, robustness, and complexity. Due to the inherent structure of the G-P map, including the predominance of rare phenotypes, large interconnected neutral networks, and the high mutational load of low robustness, complex phenotypes are difficult to discover using evolution. We suggest, based on this evidence, that evolving complexity is hard and we discuss computational strategies for genetic-programming-based evolution to successfully find genotypes that map to complex phenotypes in the search space.
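The quantities named in the abstract above can be illustrated on a toy logic-gate G-P map. The sketch below assumes a small feed-forward circuit encoding and uses common textbook-style definitions: robustness as the fraction of single-mutation neighbors with the same phenotype, and evolvability as the number of distinct other phenotypes among those neighbors. The encoding, the gate set `OPS`, and these definitions are assumptions for illustration and may differ from the ones the authors use.

```python
from itertools import product

# Assumed gate set; each gate maps two bits to one bit.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}
N_INPUTS = 2


def phenotype(circuit):
    """Truth table computed by a feed-forward circuit (the genotype).

    Each gate is (op, i, j), where i and j index earlier nodes
    (circuit inputs come first). The last gate is the output.
    """
    table = []
    for bits in product((0, 1), repeat=N_INPUTS):
        values = list(bits)
        for op, i, j in circuit:
            values.append(OPS[op](values[i], values[j]))
        table.append(values[-1])
    return tuple(table)


def neighbors(circuit):
    """Yield every circuit that differs in exactly one field of one gate."""
    for g, (op, i, j) in enumerate(circuit):
        n_prev = N_INPUTS + g  # nodes this gate may read from
        for new_op in OPS:
            if new_op != op:
                yield circuit[:g] + [(new_op, i, j)] + circuit[g + 1:]
        for new_i in range(n_prev):
            if new_i != i:
                yield circuit[:g] + [(op, new_i, j)] + circuit[g + 1:]
        for new_j in range(n_prev):
            if new_j != j:
                yield circuit[:g] + [(op, i, new_j)] + circuit[g + 1:]


def robustness_and_evolvability(circuit):
    """Return (fraction of neutral mutants, count of distinct new phenotypes)."""
    target = phenotype(circuit)
    mutants = [phenotype(m) for m in neighbors(circuit)]
    robustness = sum(p == target for p in mutants) / len(mutants)
    evolvability = len(set(mutants) - {target})
    return robustness, evolvability


# AND built from two NAND gates: NAND(a, b), then NAND of that node with itself.
and_circuit = [("NAND", 0, 1), ("NAND", 2, 2)]
r, e = robustness_and_evolvability(and_circuit)
print(phenotype(and_circuit), r, e)
```

Enumerating all one-step mutants is only feasible for small circuits; for larger genotypes one would typically sample neighbors instead.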
AI Neural Network Predicts Evolution of Gene Regulation in Yeast
Scientists have developed an artificial intelligence (AI)-based neural network model that accurately predicts gene expression in yeast. The team has validated the ability of its neural network in high-throughput experiments and the work opens doors for a broad spectrum of scientific questions. The model can help design genes with customized levels of expression for the development of gene therapies or industrial applications and clarify evolutionary mechanisms that regulate gene expression. The findings were published in the journal Nature, in an article titled, "The evolution, evolvability, and engineering of gene regulatory DNA." "This work highlights what possibilities open up when we design new kinds of experiments to generate the right data to train models," said Aviv Regev, PhD, a professor of biology at MIT, core member of the Broad Institute of Harvard and MIT, head of Genentech Research and Early Development, and the senior author of the study. The researchers employed two key technologies to predict gene expression in the yeast Saccharomyces cerevisiae.
Quality Evolvability ES: Evolving Individuals With a Distribution of Well Performing and Diverse Offspring
Katona, Adam, Franks, Daniel W., Walker, James Alfred
One of the most important lessons from the success of deep learning is that learned representations tend to perform much better at any task compared to representations we design by hand. Yet evolution of evolvability algorithms, which aim to automatically learn good genetic representations, have received relatively little attention, perhaps because of the large amount of computational power they require. The recent method Evolvability ES allows direct selection for evolvability with little computation. However, it can only be used to solve problems where evolvability and task performance are aligned. We propose Quality Evolvability ES, a method that simultaneously optimizes for task performance and evolvability without this restriction. Our proposed approach, Quality Evolvability, has a similar motivation to Quality Diversity algorithms, but with some important differences. While Quality Diversity aims to find an archive of diverse and well-performing, but potentially genetically distant, individuals, Quality Evolvability aims to find a single individual with a diverse and well-performing distribution of offspring. By doing so, Quality Evolvability is forced to discover more evolvable representations. We demonstrate on robotic locomotion control tasks that Quality Evolvability ES, similarly to Quality Diversity methods, can learn faster than objective-based methods and can handle deceptive problems.
On the Evolvability of Monotone Conjunctions with an Evolutionary Mutation Mechanism
Valiant (2009) introduced a framework for a quantitative approach to evolution, called evolvability. The idea is, roughly, that there is an ideal behavior in every environment and the feedback that the various organisms receive during evolution indicates how close their behavior is to ideal. Ultimately, evolvability aims at modeling and explaining mechanisms that allow near-optimal behavior of organisms while exploiting realistic computational resources. Due to a result by Feldman (2008), evolvability is equivalent to learning in the correlational statistical query (CSQ) model (Bshouty & Feldman, 2002). Thus, evolvability algorithms correspond to a special type of local search learning algorithms that fall under the umbrella of the probably approximately correct (PAC) model of learning (Valiant, 1984).
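The feedback signal described above can be sketched for monotone conjunctions. In the sketch, performance is the correlation E[h(x) f(x)] between a hypothesis h and the ideal function f, both taking values in {-1, +1}, and a mutation is selected if it improves performance by at least a tolerance, otherwise a neutral one is kept. The add/remove/swap neighborhood, the helper names, and the use of the full Boolean cube as an exact stand-in for sampling from the uniform distribution are all assumptions made for illustration, not the mechanism analyzed in the paper.

```python
import random
from itertools import product


def conj(variables, x):
    """Monotone conjunction over the given variable indices, valued in {-1, +1}."""
    return 1 if all(x[i] for i in variables) else -1


def performance(hyp, target, samples):
    """Empirical correlation between hypothesis and target over the samples."""
    return sum(conj(hyp, x) * conj(target, x) for x in samples) / len(samples)


def neighborhood(hyp, n):
    """Hypotheses reachable by adding, removing, or swapping one variable."""
    hyp = frozenset(hyp)
    out = {hyp}
    for i in range(n):
        if i in hyp:
            out.add(hyp - {i})
        else:
            out.add(hyp | {i})
            for j in hyp:
                out.add((hyp - {j}) | {i})
    return out


def evolve_step(hyp, target, samples, n, tolerance, rng):
    """Pick a random beneficial mutation if one exists, else a neutral one."""
    base = performance(hyp, target, samples)
    scored = [(performance(h, target, samples), h) for h in neighborhood(hyp, n)]
    beneficial = [h for p, h in scored if p >= base + tolerance]
    neutral = [h for p, h in scored if abs(p - base) < tolerance]
    pool = beneficial or neutral  # neutral always contains hyp itself
    return rng.choice(sorted(pool, key=sorted))


rng = random.Random(0)
n = 4
target = frozenset({0, 2})
samples = list(product((0, 1), repeat=n))  # whole cube = exact uniform distribution
hyp = frozenset()  # start from the empty conjunction (always +1)
for _ in range(20):
    hyp = evolve_step(hyp, target, samples, n, tolerance=0.05, rng=rng)
print(sorted(hyp), performance(hyp, target, samples))
```

Note that the selection step only ever sees the scalar performance values, never individual labeled examples, which is the sense in which this kind of search corresponds to a correlational statistical query.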