 ablated model




Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

arXiv.org Artificial Intelligence

Agent-Based Models (ABMs) are powerful tools for studying emergent properties in complex systems. In ABMs, agent behaviors are governed by local interactions and stochastic rules. However, these rules are, in general, non-differentiable, limiting the use of gradient-based methods for optimization, and thus integration with real-world data. We propose a novel framework to learn a differentiable surrogate of any ABM by observing its generated data. Our method combines diffusion models to capture behavioral stochasticity and graph neural networks to model agent interactions. Distinct from prior surrogate approaches, our method introduces a fundamental shift: rather than approximating system-level outputs, it models individual agent behavior directly, preserving the decentralized, bottom-up dynamics that define ABMs. We validate our approach on two ABMs (Schelling's segregation model and a Predator-Prey ecosystem) showing that it replicates individual-level patterns and accurately forecasts emergent dynamics beyond training. Our results demonstrate the potential of combining diffusion models and graph learning for data-driven ABM simulation.




b139e104214a08ae3f2ebcce149cdf6e-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their positive and insightful feedback as well as the research ideas for future work (e.g. the LSTM experiments suggested by R4). All minor comments will be addressed in the revised paper. Stimulus is shown in black. Comparing the influence of training and model structure is nontrivial. We will stress this in the revised paper.


A Neural Model for Word Repetition

arXiv.org Artificial Intelligence

It takes several years for the developing brain of a baby to fully master word repetition, the task of hearing a word and repeating it aloud. Repeating a new word, such as one from a new language, can also be challenging for adults. Additionally, brain damage, such as from a stroke, may lead to systematic speech errors whose characteristics depend on the location of the damage. Cognitive science suggests a model with distinct components for the different processing stages involved in word repetition. While some studies have begun to localize the corresponding regions in the brain, the neural mechanisms by which the brain performs word repetition remain largely unknown. We propose to bridge the gap between the cognitive model of word repetition and neural mechanisms in the human brain by modeling the task using deep neural networks. Neural models are fully observable, allowing us to study the detailed mechanisms in their various substructures and make comparisons with human behavior and, ultimately, the brain. Here, we take first steps in this direction by: (1) training a large set of models to simulate the word repetition task; (2) creating a battery of tests to probe the models for known effects from behavioral studies in humans; and (3) simulating brain damage through ablation studies, in which we systematically remove neurons from the model and repeat the behavioral study to examine the resulting speech errors in the "patient" model. Our results show that neural models can mimic several effects known from human research but may diverge in other aspects, highlighting both the potential and the challenges for future research aimed at developing human-like neural models.


Self-Ablating Transformers: More Interpretability, Less Sparsity

arXiv.org Artificial Intelligence

A growing intuition in machine learning suggests a link between sparsity and interpretability. We introduce a novel self-ablation mechanism to investigate this connection ante-hoc in the context of language transformers. Our approach dynamically enforces a k-winner-takes-all constraint, forcing the model to demonstrate selective activation across neuron and attention units. Unlike post-hoc methods that analyze already-trained models, our approach integrates interpretability directly into model training, promoting feature localization from inception. Training small models on the TinyStories dataset and employing interpretability tests, we find that self-ablation leads to more localized circuits, concentrated feature representations, and increased neuron specialization without compromising language modelling performance. Surprisingly, our method also decreased overall sparsity, indicating that self-ablation promotes specialization rather than widespread inactivity. This reveals a complex interplay between sparsity and interpretability, where decreased global sparsity can coexist with increased local specialization, leading to enhanced interpretability. To facilitate reproducibility, we make our code available at https://github.com/keenanpepper/
As machine learning systems are entrusted with increasingly complex tasks, our ability to understand their decision-making processes lags behind their growing capabilities (OpenAI, 2024; Gemini Team, Google, 2024). Much of the current research in interpretability for LLMs focuses on developing post-hoc methods, attempting to explain the behaviour of already-trained models (Ribeiro et al., 2016; Conmy et al., 2023; Foote et al., 2023b; Bills et al., 2023; Huben et al., 2024). While valuable, these approaches often provide only an approximate or incomplete understanding of the underlying mechanisms (Rudin, 2019).
A more fundamental yet less studied approach involves designing models to be inherently more interpretable, an ante-hoc approach, where transparency is woven into the architecture itself (Slavin et al., 2018; Tamkin et al., 2023; Cloud et al., 2024; Liu et al., 2024).
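
To make the k-winner-takes-all idea mentioned in the abstract concrete, here is a minimal sketch of a hard top-k mask over a vector of activations. It is an illustration only: the function name `k_winner_takes_all` and the plain-list representation are assumptions made here, and the paper's mechanism is described as dynamic and integrated into training, which this static sketch does not attempt to reproduce.

```python
from typing import List


def k_winner_takes_all(activations: List[float], k: int) -> List[float]:
    """Keep the k largest activations and set all others to zero.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    n = len(activations)
    if k <= 0:
        return [0.0] * n
    if k >= n:
        return list(activations)
    # Rank indices by activation value, largest first. Python's sort is
    # stable, so among equal values the earlier index is kept.
    ranked = sorted(range(n), key=lambda i: activations[i], reverse=True)
    winners = set(ranked[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]


masked = k_winner_takes_all([0.1, 0.9, -0.3, 0.5], k=2)
```

With `k=2`, only the two largest entries (0.9 and 0.5) survive; the remaining units are silenced, which is the "selective activation" the constraint is meant to force.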


Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding

arXiv.org Artificial Intelligence

Current natural language understanding (NLU) models have been continuously scaling up, both in terms of model size and input context, introducing more hidden and input neurons. While this generally improves performance on average, the extra neurons do not yield a consistent improvement for all instances. This is because some hidden neurons are redundant, and the noise mixed in input neurons tends to distract the model. Previous work mainly focuses on extrinsically reducing low-utility neurons by additional post- or pre-processing, such as network pruning and context selection, to avoid this problem. Beyond that, can we make the model reduce redundant parameters and suppress input noise by intrinsically enhancing the utility of each neuron? If a model can efficiently utilize neurons, no matter which neurons are ablated (disabled), the ablated submodel should perform no better than the original full model. Based on such a comparison principle between models, we propose a cross-model comparative loss for a broad range of tasks. Comparative loss is essentially a ranking loss on top of the task-specific losses of the full and ablated models, with the expectation that the task-specific loss of the full model is minimal. We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks based on 4 widely used pretrained language models, and find it particularly superior for models with few parameters or long input.
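
The comparison principle above (an ablated submodel should do no better than the full model) can be sketched as a hinge-style ranking penalty added to the full model's task loss. This is a simplified illustration under stated assumptions: the function name `comparative_loss`, the hinge form, and the unweighted sum are choices made here, and the exact formulation in the paper may differ.

```python
from typing import List


def comparative_loss(full_loss: float, ablated_losses: List[float]) -> float:
    """Task loss of the full model plus a ranking penalty.

    The penalty is positive whenever an ablated submodel attains a lower
    task loss than the full model, i.e. whenever the expected ordering
    (full model is best) is violated. Hypothetical sketch, not the
    paper's exact loss.
    """
    penalty = sum(max(0.0, full_loss - ablated) for ablated in ablated_losses)
    return full_loss + penalty


# The full model has loss 0.5; one ablated submodel does worse (0.7) and
# one does better (0.4). Only the second violates the ordering.
total = comparative_loss(0.5, [0.7, 0.4])
```

In this example only the submodel with loss 0.4 contributes a penalty (of about 0.1), so the total is roughly 0.6. When every ablated submodel is at least as bad as the full model, the penalty vanishes and the result reduces to the ordinary task loss.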