pereira
Sutton's predictions v Football Manager, the game
Is this the beginning of the rise of the machines? Last week saw AI triumph outright for the first time in our Premier League predictions battle, although BBC Sport's own expert human, Chris Sutton, points out he is still top of the overall league this season. This time, as well as AI, Sutton faces a different hi-tech challenger. His guest opponent for this weekend's fixtures is a computer game: Football Manager 26, the latest edition of the long-running management simulation series. The FM26 game engine has played out this weekend's matches and you can see its results, including goalscorers and red cards, below. Miles Jacobson, the studio director of Sports Interactive, the company behind Football Manager, also joined in.
Modeling the language cortex with form-independent and enriched representations of sentence meaning reveals remarkable semantic abstractness
Saha, Shreya, Li, Shurui, Tuckute, Greta, Li, Yuanning, Zhang, Ru-Yuan, Wehbe, Leila, Fedorenko, Evelina, Khosla, Meenakshi
The human language system represents both linguistic forms and meanings, but the abstractness of the meaning representations remains debated. Here, we searched for abstract representations of meaning in the language cortex by modeling neural responses to sentences using representations from vision and language models. When we generate images corresponding to sentences and extract vision model embeddings, we find that aggregating across multiple generated images yields increasingly accurate predictions of language cortex responses--sometimes rivaling large language models. Similarly, averaging embeddings across multiple paraphrases of a sentence improves prediction accuracy compared to any single paraphrase. Enriching paraphrases with contextual details that may be implicit (e.g., augmenting "I had a pancake" to include details like "maple syrup") further increases prediction accuracy, even surpassing predictions based on the embedding of the original sentence, suggesting that the language system maintains richer and broader semantic representations than language models. Together, these results demonstrate the existence of highly abstract, form-independent meaning representations within the language cortex.
Landscape analysis of an improved power method for tensor decomposition
In this work, we consider the optimization formulation for symmetric tensor decomposition recently introduced in the Subspace Power Method (SPM) of Kileel and Pereira. Unlike popular alternative functionals for tensor decomposition, the SPM objective function has the desirable properties that its maximal value is known in advance, and its global optima are exactly the rank-1 components of the tensor when the input is sufficiently low-rank. We analyze the non-convex optimization landscape associated with the SPM objective. We derive quantitative bounds such that any second-order critical point with SPM objective value exceeding the bound must equal a tensor component in the noiseless case, and must approximate a tensor component in the noisy case. For decomposing tensors of size $D^{\times m}$, we obtain a near-global guarantee up to rank $\widetilde{o}(D^{\lfloor m/2 \rfloor})$ under a random tensor model, and a global guarantee up to rank $\mathcal{O}(D)$ assuming deterministic frame conditions.
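For orientation, here is a minimal sketch of the classical symmetric tensor power iteration that power-method approaches build on. It is not the SPM algorithm or its objective; the helper names (`rank1_cube`, `contract`, `power_iteration`) and the toy order-3 tensor are assumptions made purely for illustration.

```python
import math

def rank1_cube(v, weight):
    """Return the order-3 symmetric rank-1 tensor weight * v (x) v (x) v as nested lists."""
    n = len(v)
    return [[[weight * v[i] * v[j] * v[k] for k in range(n)]
             for j in range(n)] for i in range(n)]

def add_tensors(a, b):
    """Element-wise sum of two order-3 tensors of the same shape."""
    n = len(a)
    return [[[a[i][j][k] + b[i][j][k] for k in range(n)]
             for j in range(n)] for i in range(n)]

def contract(T, x):
    """Compute the vector T(., x, x), i.e. sum over j, k of T[i][j][k] * x[j] * x[k]."""
    n = len(x)
    return [sum(T[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
            for i in range(n)]

def normalize(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def power_iteration(T, x0, steps=20):
    """Repeat x <- T(., x, x) / ||T(., x, x)|| starting from x0."""
    x = normalize(x0)
    for _ in range(steps):
        x = normalize(contract(T, x))
    return x

# Toy input (illustrative only): two orthogonal components with weights 2 and 1.
e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
T = add_tensors(rank1_cube(e1, 2.0), rank1_cube(e2, 1.0))
x = power_iteration(T, [0.6, 0.5, 0.4])
```

On this orthogonally decomposable toy tensor the iterate converges to the first component `e1`; which component is reached in general depends on the starting point.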
The Hottest Startups in Lisbon in 2024
Two years ago, Jon Fath moved with his family to Portugal from the Netherlands with the sole purpose of launching a fintech startup there. "This country is brimming with talent and ambition," Fath says. "I thank Lisbon for welcoming me, along with so many other expats and entrepreneurs, so warmly." Indeed, it's no surprise that the European Commission named Lisbon as 2023's European Capital of Innovation, while the Financial Times, in partnership with Statista, ranked two Portuguese startup hubs in Europe's top ten startup hubs--including the Unicorn Factory Lisboa, which launched in 2022 and has already supported more than 820 startups and helped raise more than €1 billion ($1.1 billion). "Portugal offers unique advantages, such as its climate, safety, and cost of living, which make it an attractive choice over countries in central or northern Europe," says Nuno Pereira, CEO of Paynest.
What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores
Feghhi, Ebrahim, Hadidi, Nima, Song, Bryan, Blank, Idan A., Kao, Jonathan C.
Given the remarkable capabilities of large language models (LLMs), there has been a growing interest in evaluating their similarity to the human brain. One approach towards quantifying this similarity is by measuring how well a model predicts neural signals, also called "brain score". Internal representations from LLMs achieve state-of-the-art brain scores, leading to speculation that they share computational principles with human language processing. This inference is only valid if the subset of neural activity predicted by LLMs reflects core elements of language processing. Here, we question this assumption by analyzing three neural datasets used in an impactful study on LLM-to-brain mappings, with a particular focus on an fMRI dataset where participants read short passages. We first find that when using shuffled train-test splits, as done in previous studies with these datasets, a trivial feature that encodes temporal autocorrelation not only outperforms LLMs but also accounts for the majority of neural variance that LLMs explain. We therefore use contiguous splits moving forward. Second, we explain the surprisingly high brain scores of untrained LLMs by showing they do not account for additional neural variance beyond two simple features: sentence length and sentence position. This undermines evidence used to claim that the transformer architecture biases computations to be more brain-like. Third, we find that brain scores of trained LLMs on this dataset can largely be explained by sentence length, position, and pronoun-dereferenced static word embeddings; a small, additional amount is explained by sense-specific embeddings and contextual representations of sentence structure. We conclude that over-reliance on brain scores can lead to over-interpretations of similarity between LLMs and brains, and emphasize the importance of deconstructing what LLMs are mapping to in neural signals.
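The difference between shuffled and contiguous train-test splits can be sketched as follows. This is a minimal illustration, assuming samples are indexed in recording order; the function names and the 20% test fraction are hypothetical and not taken from the paper. `pearson` stands in for the correlation that a brain score would typically report between predicted and held-out responses.

```python
import random
from statistics import mean

def contiguous_split(n_samples, test_fraction=0.2):
    """Hold out the final block so test samples are not interleaved with training samples."""
    n_test = max(1, int(round(n_samples * test_fraction)))
    cut = n_samples - n_test
    return list(range(cut)), list(range(cut, n_samples))

def shuffled_split(n_samples, test_fraction=0.2, seed=0):
    """Hold out a random subset, so test samples sit between training samples in time."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(n_samples * test_fraction)))
    return sorted(idx[n_test:]), sorted(idx[:n_test])

def adjacent_fraction(train, test):
    """Fraction of test samples with a temporally adjacent training sample."""
    train_set = set(train)
    hits = sum(1 for t in test if (t - 1) in train_set or (t + 1) in train_set)
    return hits / len(test)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

contig_train, contig_test = contiguous_split(100)
shuf_train, shuf_test = shuffled_split(100)
contig_leak = adjacent_fraction(contig_train, contig_test)
shuf_leak = adjacent_fraction(shuf_train, shuf_test)
```

With a contiguous split only the first held-out sample borders the training data, whereas a shuffled split leaves most held-out samples next to a training sample, which is where a feature that merely encodes temporal autocorrelation can score well.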
Goal Recognition via Linear Programming
Meneguzzi, Felipe, Santos, Luísa R. de A., Pereira, Ramon Fraga, Pereira, André G.
Goal Recognition is the task by which an observer aims to discern the goals that correspond to plans that comply with the perceived behavior of subject agents given as a sequence of observations. Research on Goal Recognition as Planning encompasses reasoning about the model of a planning task, the observations, and the goals using planning techniques, resulting in very efficient recognition approaches. In this article, we design novel recognition approaches that rely on the Operator-Counting framework, proposing new constraints, and analyze their constraints' properties both theoretically and empirically. The Operator-Counting framework is a technique that efficiently computes heuristic estimates of cost-to-goal using Integer/Linear Programming (IP/LP). In the realm of theory, we prove that the new constraints provide lower bounds on the cost of plans that comply with observations. We also provide an extensive empirical evaluation to assess how the new constraints improve the quality of the solution, and we found that they are especially informed in deciding which goals are unlikely to be part of the solution. Our novel recognition approaches have two pivotal advantages: first, they employ new IP/LP constraints for efficiently recognizing goals; second, we show how the new IP/LP constraints can improve the recognition of goals under both partial and noisy observability.
Policy-Space Search: Equivalences, Improvements, and Compression
Messa, Frederico, Pereira, André Grahl
Fully-observable non-deterministic (FOND) planning is at the core of artificial intelligence planning with uncertainty. It models uncertainty through actions with non-deterministic effects. A* with Non-Determinism (AND*) (Messa and Pereira, 2023) is a FOND planner that generalizes A* (Hart et al., 1968) for FOND planning. It searches for a solution policy by performing an explicit heuristic search on the policy space of the FOND task. In this paper, we study and improve the performance of the policy-space search performed by AND*. We present a polynomial-time procedure that constructs a solution policy given just the set of states that should be mapped. This procedure, together with a better understanding of the structure of FOND policies, allows us to present three concepts of equivalences between policies. We use policy equivalences to prune part of the policy search space, making AND* substantially more effective in solving FOND tasks. We also study the impact of taking into account structural state-space symmetries to strengthen the detection of equivalence policies and the impact of performing the search with satisficing techniques. We apply a recent technique from the group theory literature to better compute structural state-space symmetries. Finally, we present a solution compressor that, given a policy defined over complete states, finds a policy that unambiguously represents it using the minimum number of partial states. AND* with the introduced techniques generates, on average, two orders of magnitude fewer policies to solve FOND tasks. These techniques allow explicit policy-space search to be competitive in terms of both coverage and solution compactness with other state-of-the-art FOND planners.
Query Augmentation by Decoding Semantics from Brain Signals
Ye, Ziyi, Zhan, Jingtao, Ai, Qingyao, Liu, Yiqun, de Rijke, Maarten, Lioma, Christina, Ruotsalo, Tuukka
Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, then the effectiveness of query augmentation would be limited as well. We propose Brain-Aug, which enhances a query by incorporating semantic information decoded from brain signals. Brain-Aug generates the continuation of the original query with a prompt constructed with brain signal information and a ranking-oriented inference approach. Experimental results on fMRI (functional magnetic resonance imaging) datasets show that Brain-Aug produces semantically more accurate queries, leading to improved document ranking performance. Such improvement brought by brain signals is particularly notable for ambiguous queries.
Language Generation from Brain Recordings
Ye, Ziyi, Ai, Qingyao, Liu, Yiqun, Zhang, Min, Lioma, Christina, Ruotsalo, Tuukka
Generating human language through non-invasive brain-computer interfaces (BCIs) has the potential to unlock many applications, such as serving disabled patients and improving communication. To date, however, generating language via BCIs has been successful only within a classification setup for selecting pre-generated sentence continuation candidates with the most likely cortical semantic representation. Inspired by recent research that revealed associations between the brain and large computational language models, we propose a generative language BCI that utilizes the capacity of a large language model (LLM) jointly with a semantic brain decoder to directly generate language from functional magnetic resonance imaging (fMRI) input. The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli perceived, without prior knowledge of any pre-generated candidates. We compare the language generated from the presented model with a random control, a pre-generated language selection approach, and a standard LLM, which generates common coherent text solely based on the next word likelihood according to statistical language training data. The proposed model is found to generate language that is more aligned with the semantic content of the stimulus in response to which the brain input is sampled. Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
Artificial Intelligence tools shed light on millions of proteins
In recent years, AlphaFold has revolutionised protein science. This Artificial Intelligence (AI) tool was trained on protein data collected by life scientists for over 50 years, and is able to predict the 3D shape of proteins with high accuracy. Its success prompted the modelling of an astounding 215 million proteins last year, providing insights into the shapes of almost any protein. This is particularly interesting for proteins that have not been studied experimentally, a complex and time-consuming process. "There are now many sources of protein information, containing valuable insights into how proteins evolve and work," says Joana Pereira, the leader of the study.