AllSim: Simulating and Benchmarking Resource Allocation Policies in Multi-User Systems
Numerous real-world systems, ranging from healthcare to energy grids, involve users competing for finite and potentially scarce resources. Designing policies for repeated resource allocation in such real-world systems is challenging for many reasons, including the changing nature of user types and their (possibly urgent) need for resources. Researchers have developed numerous machine learning solutions for determining repeated resource allocation policies in these challenging settings. However, a key limitation has been the absence of good methods and test-beds for benchmarking these policies; almost all resource allocation policies are benchmarked in environments which are either completely synthetic or do not allow any deviation from historical data. In this paper we introduce AllSim, a benchmarking environment for realistically simulating the impact and utility of policies for resource allocation in systems in which users compete for such scarce resources. Building such a benchmarking environment is challenging because it needs to take into account the entire collective of potential users and the impact a resource allocation policy has on all the other users in the system. AllSim's benchmarking environment is modular (each component being parameterized individually), learnable (informed by historical data), and customizable (adaptable to changing conditions). These components, when interacting with an allocation policy, produce a dataset of simulated outcomes for evaluation and comparison of such policies. We believe AllSim is an essential step towards a more systematic evaluation of policies for scarce resource allocation compared to current approaches for benchmarking such methods.
PROTES: Probabilistic Optimization with Tensor Sampling
We developed a new method, PROTES, for black-box optimization, which is based on probabilistic sampling from a probability density function given in the low-parametric tensor train format. We tested it on complex multidimensional arrays and discretized multivariable functions taken, among others, from real-world applications, including unconstrained binary optimization and optimal control problems, for which the possible number of elements is up to 2^1000. In numerical experiments, both on analytic model functions and on complex problems, PROTES outperforms popular discrete optimization methods (Particle Swarm Optimization, Covariance Matrix Adaptation, Differential Evolution, and others).
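The sample, select, update loop behind this kind of probabilistic black-box optimization can be sketched as follows. This is a deliberately simplified illustration and not the PROTES implementation: the tensor train density is replaced by a fully factorized (rank-1) distribution, and the update is a cross-entropy-style move toward the best samples. All function names and parameters (`optimize`, `batch`, `top_k`, `lr`) are assumptions made for this example.

```python
import random


def sample_index(probs, rng):
    """Draw one multi-index, one coordinate per mode, from independent categoricals."""
    return [rng.choices(range(len(p)), weights=p)[0] for p in probs]


def optimize(f, dims, n_iters=100, batch=50, top_k=5, lr=0.2, seed=0):
    """Minimize a black-box function f over a discrete grid with shape `dims`.

    Assumption: a rank-1 (fully factorized) distribution stands in for the
    tensor train density described in the abstract.
    """
    rng = random.Random(seed)
    # Start from the uniform distribution on every mode.
    probs = [[1.0 / n] * n for n in dims]
    best_x, best_y = None, float("inf")
    for _ in range(n_iters):
        samples = [sample_index(probs, rng) for _ in range(batch)]
        elite = sorted(samples, key=f)[:top_k]  # the k best samples of this batch
        y = f(elite[0])
        if y < best_y:
            best_x, best_y = elite[0], y
        # Shift each mode's probabilities toward the elite frequencies.
        for mode, p in enumerate(probs):
            counts = [0] * len(p)
            for x in elite:
                counts[x[mode]] += 1
            for i in range(len(p)):
                p[i] = (1.0 - lr) * p[i] + lr * counts[i] / top_k
    return best_x, best_y


if __name__ == "__main__":
    # Toy binary problem: minimize the number of ones in a 20-bit vector.
    x, y = optimize(sum, dims=[2] * 20)
    print(x, y)
```

A rank-1 distribution cannot represent correlations between coordinates, which is the kind of structure a tensor train with higher ranks can capture; the sketch only shows the outer loop.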
What does the data tell us about immigration in Wales? Search for your area
Like many countries, Wales sees a steady flow of people arriving and leaving for other countries each year. The difference between those arriving and those leaving is known as net migration. Focusing on people moving from abroad, latest estimates say Wales' population - which was 3.2 million in June 2024 - had increased by about 23,000 over the previous year as a result of net international migration. A recent YouGov poll found a quarter of people surveyed in Wales believed that immigration, alongside the economy, should be among the issues prioritised by the Welsh government, even though immigration is controlled by the UK government.
Who will win title? The big prediction special
With five games to go, Manchester City and Arsenal are only separated on goals scored at the top of the Premier League table. It's a new league now, says Gunners boss Mikel Arteta, whose side had been top of the table for 209 days until Wednesday. Manchester City's 2-1 win over Arsenal on Sunday boosted their hopes - and a 1-0 victory at Burnley on Wednesday sent them top. Who is going to win the title now?
An AI agent takes over a store and orders too many candles
Andon Market in San Francisco represents a vision, however flawed, of a future when more sophisticated AI agents take over work traditionally done by humans. In San Francisco's upscale Cow Hollow district, the introduction of a boutique selling coffee table games, tote bags and other household items would be pretty unremarkable. However, Andon Market has one key differentiator: It's run by AI. At this store, an artificial intelligence agent named Luna effectively acts as the chief executive officer of the operation. It decides what products to offer and how much to charge for them.
Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback
We present a novel actor-critic algorithm for an environment with delayed feedback, which addresses the state-space explosion problem of conventional approaches. Conventional approaches use an augmented state constructed from the last observed state and the actions executed since visiting the last observed state. Using the augmented state space, the correct Markov decision process for delayed environments can be constructed; however, this causes the state space to explode as the number of delayed timesteps increases, leading to slow convergence. Our proposed algorithm, called Belief-Projection-Based Q-learning (BPQL), addresses the state-space explosion problem by evaluating the values of the critic for which the input state size is equal to the original state-space size rather than that of the augmented one. We compare BPQL to traditional approaches in continuous control tasks and demonstrate that it significantly outperforms other algorithms in terms of asymptotic performance and sample efficiency. We also show that BPQL solves long-delayed environments, which conventional approaches are unable to do.
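To make the state-space growth concrete, the augmented state used by conventional approaches can be sketched as below. This shows only the augmentation described in the abstract, not BPQL itself; the class name `AugmentedState`, its methods, and the zero-action padding at the start of an episode are assumptions made for illustration.

```python
from collections import deque


class AugmentedState:
    """Last observed state plus the actions executed since it was observed."""

    def __init__(self, delay, action_dim):
        self.delay = delay
        # Assumption: before any action is taken, the buffer is padded with zeros.
        self.pending = deque(
            [[0.0] * action_dim for _ in range(delay)], maxlen=delay
        )

    def record_action(self, action):
        """Append the newest action; the oldest one drops out automatically."""
        self.pending.append(list(action))

    def build(self, last_observed_state):
        """Concatenate the delayed observation with all pending actions."""
        flat = list(last_observed_state)
        for action in self.pending:
            flat.extend(action)
        return flat


if __name__ == "__main__":
    state_dim, action_dim = 4, 2
    for delay in (1, 5, 20):
        aug = AugmentedState(delay, action_dim)
        size = len(aug.build([0.0] * state_dim))
        print(f"delay={delay:2d} -> augmented input size {size}")
```

The input size is `state_dim + delay * action_dim`, so it grows linearly with the delay, while the number of distinct augmented states grows exponentially in the delay for discrete actions. A critic that takes only the original state, as the abstract describes for BPQL, keeps its input at `state_dim`.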
Large language models transition from integrating across position-yoked, exponential windows to structure-yoked, power-law windows
Modern language models excel at integrating across long temporal scales needed to encode linguistic meaning and show non-trivial similarities to biological neural systems. Prior work suggests that human brain responses to language exhibit hierarchically organized "integration windows" that substantially constrain the overall influence of an input token (e.g., a word) on the neural response. However, little prior work has attempted to use integration windows to characterize computations in large language models (LLMs). We developed a simple word-swap procedure for estimating integration windows from black-box language models that does not depend on access to gradients or knowledge of the model architecture (e.g., attention weights). Using this method, we show that trained LLMs exhibit stereotyped integration windows that are well-fit by a convex combination of an exponential and a power-law function, with a partial transition from exponential to power-law dynamics across network layers. We then introduce a metric for quantifying the extent to which these integration windows vary with structural boundaries (e.g., sentence boundaries), and using this metric, we show that integration windows become increasingly yoked to structure at later network layers. None of these findings were observed in an untrained model, which as expected integrated uniformly across its input. These results suggest that LLMs learn to integrate information in natural language using a stereotyped pattern: integrating across position-yoked, exponential windows at early layers, followed by structure-yoked, power-law windows at later layers. The methods we describe in this paper provide a general-purpose toolkit for understanding temporal integration in language models, facilitating cross-disciplinary research at the intersection of biological and artificial intelligence.
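As a rough illustration of what a convex combination of an exponential and a power-law window looks like, consider the sketch below. The abstract does not give the exact parameterization, so the functional forms, the normalization to 1 at distance 0, and the parameter names (`mix`, `tau`, `alpha`) are all assumptions made for this example.

```python
import math


def integration_window(distance, mix, tau, alpha):
    """Influence of a token `distance` positions back (hypothetical form).

    mix   : weight on the exponential component, in [0, 1]
    tau   : decay constant of the exponential component
    alpha : exponent of the power-law component
    """
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must lie in [0, 1] for a convex combination")
    exponential = math.exp(-distance / tau)
    power_law = (1.0 + distance) ** (-alpha)
    return mix * exponential + (1.0 - mix) * power_law


if __name__ == "__main__":
    # Hypothetical settings: an exponential-dominated window versus a
    # power-law-dominated one.
    for d in (0, 5, 50):
        early = integration_window(d, mix=0.9, tau=5.0, alpha=1.0)
        late = integration_window(d, mix=0.1, tau=5.0, alpha=1.0)
        print(f"distance={d:3d}  early={early:.4f}  late={late:.4f}")
```

With these illustrative values the power-law-dominated window retains far more influence at long distances, which is the qualitative difference between the two regimes.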
China's DeepSeek unveils latest models a year after upending global tech
China's DeepSeek has unveiled the latest versions of its signature artificial intelligence-powered chatbot, a year after its flagship model sent shockwaves through the global tech scene. The Chinese start-up launched preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash on Friday as it touted its ability to go toe-to-toe with US rivals such as OpenAI and Google. The "flash" model has similar reasoning abilities to the "pro" version, while offering faster response times and more cost-effective pricing, the Hangzhou-based startup said. Like DeepSeek's previous chatbots, V4-Pro and V4-Flash follow an open-source model, meaning developers are free to use and modify them at will. The release comes after DeepSeek-R1 stunned the tech sector upon its launch in January last year with capabilities broadly comparable with those of ChatGPT and Gemini.