Goto

Collaborating Authors

 temporal score


Scaling and context steer LLMs along the same computational path as the human brain

arXiv.org Artificial Intelligence

Recent studies suggest that the representations learned by large language models (LLMs) are partially aligned to those of the human brain. However, whether and why this alignment score arises from a similar sequence of computations remains elusive. In this study, we explore this question by examining temporally-resolved brain signals of participants listening to 10 hours of an audiobook. We study these neural dynamics jointly with a benchmark encompassing 22 LLMs varying in size and architecture type. Our analyses confirm that LLMs and the brain generate representations in a similar order: specifically, activations in the initial layers of LLMs tend to best align with early brain responses, while the deeper layers of LLMs tend to best align with later brain responses. This brain-LLM alignment is consistent across transformers and recurrent architectures. However, its emergence depends on both model size and context length. Overall, this study sheds light on the sequential nature of computations and the factors underlying the partial convergence between biological and artificial neural networks.


Disentangling the Factors of Convergence between Brains and Computer Vision Models

arXiv.org Artificial Intelligence

Many AI models trained on natural images develop representations that resemble those of the human brain. However, the factors that drive this brain-model similarity remain poorly understood. To disentangle how the model, training and data independently lead a neural network to develop brain-like representations, we trained a family of self-supervised vision transformers (DINOv3) that systematically varied these different factors. We compare their representations of images to those of the human brain recorded with both fMRI and MEG, providing high resolution in spatial and temporal analyses. We assess the brain-model similarity with three complementary metrics focusing on overall representational similarity, topographical organization, and temporal dynamics. We show that all three factors - model size, training amount, and image type - independently and interactively impact each of these brain similarity metrics. In particular, the largest DINOv3 models trained with the most human-centric images reach the highest brain-similarity. This emergence of brain-like representations in AI models follows a specific chronology during training: models first align with the early representations of the sensory cortices, and only align with the late and prefrontal representations of the brain with considerably more training. Finally, this developmental trajectory is indexed by both structural and functional properties of the human cortex: the representations that are acquired last by the models specifically align with the cortical areas with the largest developmental expansion, thickness, least myelination, and slowest timescales. Overall, these findings disentangle the interplay between architecture and experience in shaping how artificial neural networks come to see the world as humans do, thus offering a promising framework to understand how the human brain comes to represent its visual world.


Spatio-Temporal Consistency as a Means to Identify Unlabeled Objects in a Continuous Data Field

AAAI Conferences

Mesoscale ocean eddies are a critical component of the Earth System as they dominate the ocean's kinetic energy and impact the global distribution of oceanic heat, salinity, momentum, and nutrients. Therefore, accurately representing these dynamic features is critical for our planet's sustainability. The majority of methods that identify eddies from satellite observations analyze the data in a frame-by-frame basis despite the fact that eddies are dynamic objects that propagate across space and time. We introduce the notion of spatio-temporal consistency to identify eddies in a continuous spatio-temporal field, to simultaneously ensure that the features detected are both spatially and temporally consistent. Our spatio-temporal consistency approach allows us to remove most of the expert criteria used in traditional methods to reduce false negatives. The removal of arbitrary heuristics enables us to render more complete eddy dynamics by identifying smaller and longer lived eddies compared to existing methods.