More holidaymakers using AI to plan trips

BBC News

More holidaymakers are turning to AI when planning or booking their trips, according to travel association ABTA. The body found that 8% of travellers were using AI - up from 4% last year - with younger holidaymakers more likely to use the technology when planning their trips. However, AI still lagged a long way behind more established methods - such as general internet searches and asking family and friends. Overall, the number of people taking a holiday continued a recent trend of climbing back towards pre-pandemic levels, ABTA said. The travel body described the increase in customers using AI as both a challenge and an opportunity.


MrBeast says AI advance is scary for YouTube creators

BBC News

The world's biggest YouTuber, MrBeast, says the rapid advance of generative artificial intelligence (AI) is scary for the millions of creators currently making content for a living. AI tools that can create fully-formed videos from simple text prompts by users have made rapid advances in recent years. On social media, MrBeast, real name Jimmy Donaldson, asked what would happen to people like him when AI videos are just as good as normal videos. Fears about the impact AI will have on the jobs market are widespread - but particularly acute in the creative industries. In the film and video game industries, there has been extensive industrial action over the use of AI.


Russia-Ukraine war: List of key events, day 1,321

Al Jazeera

The UN's International Atomic Energy Agency (IAEA) said that "two rounds of shelling struck around 1.25 km" [less than a mile] from the perimeter of Ukraine's Zaporizhzhia Nuclear Power Plant on Monday afternoon. IAEA chief Rafael Grossi warned the attacks came as the plant has been running on emergency diesel generators for almost two weeks after losing its external power source.


On Structured State-Space Duality

arXiv.org Machine Learning

Structured State-Space Duality (SSD) [Dao & Gu, ICML 2024] is an equivalence between a simple Structured State-Space Model (SSM) and a masked attention mechanism. In particular, a state-space model with a scalar-times-identity state matrix is equivalent to a masked self-attention with a $1$-semiseparable causal mask. Consequently, the same sequence transformation (model) has two algorithmic realizations: as a linear-time $O(T)$ recurrence or as a quadratic-time $O(T^2)$ attention. In this note, we formalize and generalize this duality: (i) we extend SSD from the scalar-identity case to general diagonal SSMs (diagonal state matrices); (ii) we show that these diagonal SSMs match the scalar case's training complexity lower bounds while supporting richer dynamics; (iii) we establish a necessary and sufficient condition under which an SSM is equivalent to $1$-semiseparable masked attention; and (iv) we show that such duality fails to extend to standard softmax attention due to rank explosion. Together, these results tighten the bridge between recurrent SSMs and Transformers, and widen the design space for expressive yet efficient sequence models.
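
The scalar-identity case described above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a scalar input channel, a recurrence of the form $h_t = a_t h_{t-1} + B_t x_t$, $y_t = C_t^\top h_t$, and a causal mask with entries $L_{ts} = a_{s+1} \cdots a_t$ for $s \le t$. The function names `ssm_recurrence` and `masked_attention` are illustrative only.

```python
def ssm_recurrence(a, B, C, x):
    """Linear-time form: h_t = a_t * h_{t-1} + B_t * x_t, y_t = <C_t, h_t>."""
    n = len(B[0])          # state dimension
    h = [0.0] * n          # initial state h_{-1} = 0
    y = []
    for t in range(len(x)):
        # Scalar-times-identity state matrix: every state coordinate decays by a_t.
        h = [a[t] * h[i] + B[t][i] * x[t] for i in range(n)]
        y.append(sum(C[t][i] * h[i] for i in range(n)))
    return y


def masked_attention(a, B, C, x):
    """Quadratic-time form: y_t = sum_{s<=t} L[t][s] * <C_t, B_s> * x_s."""
    n = len(B[0])
    y = []
    for t in range(len(x)):
        total = 0.0
        decay = 1.0        # L[t][t] is the empty product, i.e. 1
        for s in range(t, -1, -1):
            score = sum(C[t][i] * B[s][i] for i in range(n))  # "query-key" product
            total += decay * score * x[s]
            decay *= a[s]  # now equals a_s * ... * a_t, which is L[t][s-1]
        y.append(total)
    return y
```

Under these assumptions both functions compute the same outputs up to floating-point error; the first uses $O(T)$ state updates, the second $O(T^2)$ pairwise scores.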


Divergence Phase Index: A Riesz-Transform Framework for Multidimensional Phase Difference Analysis

arXiv.org Machine Learning

We introduce the Divergence Phase Index (DPI), a novel framework for quantifying phase differences in one-dimensional and multidimensional signals, grounded in harmonic analysis via the Riesz transform. Building on classical Hilbert transform phase measures, the DPI extends these principles to higher dimensions, offering a geometry-aware metric that is invariant to intensity scaling and sensitive to structural changes. We applied this method to both synthetic and real-world datasets, including intracranial EEG (iEEG) recordings during epileptic seizures, high-resolution microscopy images, and paintings. In the 1D case, the DPI robustly detects hypersynchronization associated with generalized epilepsy, while in 2D, it reveals subtle, imperceptible changes in images and artworks. Additionally, it can detect rotational variations in highly isotropic microscopy images. The DPI's robustness to amplitude variations and its adaptability across domains enable its use in diverse applications, from nonlinear dynamics and complex systems analysis to multidimensional signal processing.


Score-based generative emulation of impact-relevant Earth system model outputs

arXiv.org Machine Learning

Policy targets evolve faster than the Coupled Model Intercomparison Project cycles, complicating adaptation and mitigation planning that must often contend with outdated projections. Climate model output emulators address this gap by offering inexpensive surrogates that can rapidly explore alternative futures while staying close to Earth System Model (ESM) behavior. We focus on emulators designed to provide inputs to impact models. Using monthly ESM fields of near-surface temperature, precipitation, relative humidity, and wind speed, we show that deep generative models have the potential to model jointly the distribution of variables relevant for impacts. The specific model we propose uses score-based diffusion on a spherical mesh and runs on a single mid-range graphics processing unit. We introduce a thorough suite of diagnostics to compare emulator outputs with their parent ESMs, including their probability densities, cross-variable correlations, time of emergence, and tail behavior. We evaluate performance across three distinct ESMs in both pre-industrial and forced regimes. The results show that the emulator produces distributions that closely match the ESM outputs and captures key forced responses. They also reveal important failure cases, notably for variables with a strong regime shift in the seasonal cycle. Although not a perfect match to the ESM, the inaccuracies of the emulator are small relative to the scale of internal variability in ESM projections. We therefore argue that it shows potential to be useful in supporting impact assessment. We discuss priorities for future development toward daily resolution, finer spatial scales, and bias-aware training. Code is made available at https://github.com/shahineb/climemu.


Don't Pass$\mathtt{@}k$: A Bayesian Framework for Large Language Model Evaluation

arXiv.org Machine Learning

Pass$@k$ is widely used to report performance for LLM reasoning, but it often yields unstable, misleading rankings, especially when the number of trials (samples) is limited and compute is constrained. We present a principled Bayesian evaluation framework that replaces Pass$@k$ and average accuracy over $N$ trials (avg$@N$) with posterior estimates of a model's underlying success probability and credible intervals, yielding stable rankings and a transparent decision rule for differences. Evaluation outcomes are modeled as categorical (not just 0/1) with a Dirichlet prior, giving closed-form expressions for the posterior mean and uncertainty of any weighted rubric and enabling the use of prior evidence when appropriate. Theoretically, under a uniform prior, the Bayesian posterior mean is order-equivalent to average accuracy (Pass$@1$), explaining its empirical robustness while adding principled uncertainty. Empirically, in simulations with known ground-truth success rates and on AIME'24/'25, HMMT'25, and BrUMO'25, the Bayesian/avg procedure achieves faster convergence and greater rank stability than Pass$@k$ and recent variants, enabling reliable comparisons at far smaller sample counts. The framework clarifies when observed gaps are statistically meaningful (non-overlapping credible intervals) versus noise, and it naturally extends to graded, rubric-based evaluations. Together, these results recommend replacing Pass$@k$ for LLM evaluation and ranking with a posterior-based, compute-efficient protocol that unifies binary and non-binary evaluation while making uncertainty explicit. Code is available at https://mohsenhariri.github.io/bayes-kit
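
The closed-form posterior summary mentioned above can be illustrated with standard Dirichlet-multinomial conjugacy. This is a sketch under stated assumptions, not the paper's implementation: the function name `dirichlet_posterior_summary`, the default uniform prior, and the example counts are all hypothetical. With prior $\alpha$ and observed category counts $n$, the posterior is $\mathrm{Dirichlet}(\alpha + n)$, and a weighted rubric $w^\top p$ has mean $\sum_k w_k m_k$ and variance $\left(\sum_k w_k^2 m_k - (\sum_k w_k m_k)^2\right) / (A + 1)$, where $A = \sum_k (\alpha_k + n_k)$ and $m_k = (\alpha_k + n_k)/A$.

```python
def dirichlet_posterior_summary(counts, weights, prior=None):
    """Posterior mean and variance of sum_k weights[k] * p_k under a Dirichlet posterior.

    counts  : observed number of trials falling in each outcome category
    weights : rubric score assigned to each category (e.g. [0, 1] for fail/pass)
    prior   : Dirichlet concentration parameters; uniform (all ones) if omitted
    """
    if prior is None:
        prior = [1.0] * len(counts)             # assumption: uniform prior by default
    alpha = [p + c for p, c in zip(prior, counts)]  # conjugate posterior parameters
    total = sum(alpha)
    mean_p = [a / total for a in alpha]         # posterior mean of each category probability
    mean = sum(w * m for w, m in zip(weights, mean_p))
    second = sum(w * w * m for w, m in zip(weights, mean_p))
    var = (second - mean ** 2) / (total + 1.0)  # variance of a linear function of a Dirichlet vector
    return mean, var


# Hypothetical example: 3 failures and 7 successes on a binary rubric.
mean, var = dirichlet_posterior_summary(counts=[3, 7], weights=[0.0, 1.0])
```

In the binary case this reduces to a Beta posterior, so the result can be cross-checked against the familiar Beta mean and variance formulas.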


Self-Speculative Masked Diffusions

arXiv.org Machine Learning

We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data that require significantly fewer function evaluations to generate samples. Standard masked diffusion models predict factorized logits over currently masked positions. A number of masked positions are then sampled; however, the factorization approximation means that sampling too many positions in one go leads to poor sample quality. As a result, many simulation steps and therefore neural network function evaluations are required to generate high-quality data. We reduce the computational burden by generating non-factorized predictions over masked positions. This is achieved by modifying the final transformer attention mask from non-causal to causal, enabling draft token generation and parallel validation via a novel, model-integrated speculative sampling mechanism. This results in a non-factorized predictive distribution over masked positions in a single forward pass. We apply our method to GPT2-scale text modelling and protein sequence generation, finding that we can achieve a ~2x reduction in the required number of network forward passes relative to standard masked diffusion models.


Cost Efficient Fairness Audit Under Partial Feedback

arXiv.org Machine Learning

We study the problem of auditing the fairness of a given classifier under partial feedback, where true labels are available only for positively classified individuals (e.g., loan repayment outcomes are observed only for approved applicants). We introduce a novel cost model for acquiring additional labeled data, designed to more accurately reflect real-world costs such as credit assessment, loan processing, and potential defaults. Our goal is to find optimal fairness audit algorithms that are more cost-effective than random exploration and natural baselines. In our work, we consider two audit settings: a black-box model with no assumptions on the data distribution, and a mixture model, where features and true labels follow a mixture of exponential family distributions. In the black-box setting, we propose a near-optimal auditing algorithm under mild assumptions and show that a natural baseline can be strictly suboptimal. In the mixture model setting, we design a novel algorithm that achieves significantly lower audit cost than the black-box case. Our approach leverages prior work on learning from truncated samples and maximum-a-posteriori oracles, and extends known results on spherical Gaussian mixtures to handle exponential family mixtures, which may be of independent interest. Moreover, our algorithms apply to popular fairness metrics including demographic parity, equal opportunity, and equalized odds. Empirically, we demonstrate strong performance of our algorithms on real-world fair classification datasets like Adult Income and Law School, consistently outperforming natural baselines by around 50% in terms of audit cost.
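
Two of the fairness metrics named above can be written down directly from their usual definitions. The sketch below is illustrative only and is not the paper's audit algorithm: it assumes two groups coded 0 and 1 and, unlike the partial-feedback setting, assumes true labels are available for every individual. The function names `rate` and `fairness_gaps` are hypothetical.

```python
def rate(values):
    """Fraction of ones in a list of 0/1 values (NaN if the list is empty)."""
    return sum(values) / len(values) if values else float("nan")


def fairness_gaps(y_true, y_pred, group):
    """Return (demographic parity gap, equal opportunity gap) between groups 0 and 1.

    Demographic parity compares P(pred = 1 | group).
    Equal opportunity compares P(pred = 1 | true = 1, group), i.e. true positive rates.
    """
    positive_rate = {}
    true_positive_rate = {}
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        preds_on_positives = [
            p for p, t, a in zip(y_pred, y_true, group) if a == g and t == 1
        ]
        positive_rate[g] = rate(preds)
        true_positive_rate[g] = rate(preds_on_positives)
    dp_gap = abs(positive_rate[0] - positive_rate[1])
    eo_gap = abs(true_positive_rate[0] - true_positive_rate[1])
    return dp_gap, eo_gap
```

Under partial feedback, the true labels needed for the equal opportunity term are missing for negatively classified individuals, which is why an audit has to pay to acquire them.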


From Moments to Models: Graphon Mixture-Aware Mixup and Contrastive Learning

arXiv.org Machine Learning

Real-world graph datasets often consist of mixtures of populations, where graphs are generated from multiple distinct underlying distributions. However, modern representation learning approaches, such as graph contrastive learning (GCL) and augmentation methods like Mixup, typically overlook this mixture structure. In this work, we propose a unified framework that explicitly models data as a mixture of underlying probabilistic graph generative models represented by graphons. To characterize these graphons, we leverage graph moments (motif densities) to cluster graphs arising from the same model. This enables us to disentangle the mixture components and identify their distinct generative mechanisms. This model-aware partitioning benefits two key graph learning tasks. First, it enables a graphon-mixture-aware mixup (GMAM), a data augmentation technique that interpolates in a semantically valid space guided by the estimated graphons, instead of assuming a single graphon per class. Second, by introducing a new model-aware objective, our proposed approach (termed MGCL) improves negative sampling by restricting negatives to graphs from other models. We establish a key theoretical guarantee: a novel, tighter bound showing that graphs sampled from graphons with small cut distance will have similar motif densities with high probability. Extensive experiments on benchmark datasets demonstrate strong empirical performance. In unsupervised learning, MGCL achieves state-of-the-art results, obtaining the top average rank across eight datasets. In supervised learning, GMAM consistently outperforms existing strategies, achieving new state-of-the-art accuracy in 6 out of 7 datasets.