eagle


Bald eagle 'massaging' its mate? AI deepfakes collide with the laws of the wild

Los Angeles Times

AI-generated videos of Big Bear's celebrity bald eagles, Jackie and Shadow, are racking up millions of views, tricking fans with realistic but invented behaviors like eagle "massages." They're part of a wave of deepfake wildlife videos taking over social media that experts warn may create a false sense of safety around predators and erode the perceived urgency of conservation efforts.


Informative Perturbation Selection for Uncertainty-Aware Post-hoc Explanations

Chugh, Sumedha, Prasad, Ranjitha, Shah, Nazreen

arXiv.org Machine Learning

Trust and ethical concerns arising from the widespread deployment of opaque machine learning (ML) models motivate the need for reliable model explanations. Post-hoc model-agnostic explanation methods address this challenge by learning a surrogate model that approximates the behavior of the deployed black-box ML model in the locality of a sample of interest. In post-hoc scenarios, neither the underlying model parameters nor the training data are available; hence, this local neighborhood must be constructed by generating perturbed inputs around the sample of interest and recording the corresponding model predictions. We propose \emph{Expected Active Gain for Local Explanations} (\texttt{EAGLE}), a post-hoc model-agnostic explanation framework that formulates perturbation selection as an information-theoretic active learning problem. By adaptively sampling perturbations that maximize the expected information gain, \texttt{EAGLE} efficiently learns a linear surrogate explainable model while producing feature importance scores along with uncertainty/confidence estimates. Theoretically, we establish that the cumulative information gain scales as $\mathcal{O}(d \log t)$, where $d$ is the feature dimension and $t$ is the number of samples, and that the sample complexity grows linearly with $d$ and logarithmically with the confidence parameter $1/\delta$. Empirical results on tabular and image datasets corroborate our theoretical findings and demonstrate that \texttt{EAGLE} improves explanation reproducibility across runs, achieves higher neighborhood stability, and improves perturbation sample quality compared to state-of-the-art baselines such as Tilia, US-LIME, GLIME and BayesLIME.
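As a rough sketch of the information-gain-driven perturbation selection described above (our illustration, not the authors' implementation), consider a Bayesian linear surrogate with Gaussian noise, where the expected information gain of querying at a candidate perturbation $x$ has the closed form $\frac{1}{2}\log(1 + x^\top \Sigma x / \sigma^2)$; all names and the noise model are assumptions.

```python
import numpy as np

def info_gain(x, Sigma, noise_var):
    """Expected information gain of querying the black box at perturbation x,
    under a Bayesian linear surrogate with posterior covariance Sigma:
    0.5 * log(1 + x^T Sigma x / sigma^2)."""
    return 0.5 * np.log1p(x @ Sigma @ x / noise_var)

def select_and_update(candidates, mu, Sigma, black_box, noise_var=0.1):
    """Pick the candidate with maximal expected gain, query the model,
    and apply the conjugate rank-1 posterior update."""
    gains = [info_gain(x, Sigma, noise_var) for x in candidates]
    x = candidates[int(np.argmax(gains))]
    y = black_box(x)                       # black-box model prediction
    Sx = Sigma @ x
    denom = noise_var + x @ Sx
    mu = mu + Sx * (y - x @ mu) / denom    # Kalman-style mean update
    Sigma = Sigma - np.outer(Sx, Sx) / denom
    return x, mu, Sigma
```

Iterating select_and_update grows the perturbation set one maximally informative query at a time; the posterior mean mu then plays the role of the feature-importance vector, and diag(Sigma) supplies the uncertainty estimates the abstract mentions.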




Scaling can lead to compositional generalization

Redhardt, Florian, Akram, Yassir, Schug, Simon

arXiv.org Artificial Intelligence

Can neural networks systematically capture discrete, compositional task structure despite their continuous, distributed nature? The impressive capabilities of large-scale neural networks suggest that the answer to this question is yes. However, even for the most capable models, there are still frequent failure cases that raise doubts about their compositionality. Here, we seek to understand what it takes for a standard neural network to generalize over tasks that share compositional structure. We find that simply scaling data and model size leads to compositional generalization. We show that this holds across different task encodings as long as the training distribution sufficiently covers the task space. In line with this finding, we prove that standard multilayer perceptrons can approximate a general class of compositional task families to arbitrary precision using only a linear number of neurons with respect to the number of task modules. Finally, we uncover that if networks successfully compositionally generalize, the constituents of a task can be linearly decoded from their hidden activations. We show that this metric correlates with failures of text-to-image generation models to compose known concepts.
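The linear-decodability metric from the last two sentences can be probed with a standard linear readout. Below is a minimal sketch (our illustration, with assumed array shapes, not the authors' code) that fits one logistic probe per task constituent and reports mean held-out accuracy:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def linear_decodability(hidden, module_labels, train_frac=0.8):
    """Mean held-out accuracy of one linear probe per task constituent.
    hidden: (n_samples, d) activations; module_labels: (n_samples, k)
    integer ids of the module filling each task slot."""
    split = int(train_frac * len(hidden))
    accs = []
    for k in range(module_labels.shape[1]):
        probe = LogisticRegression(max_iter=1000)
        probe.fit(hidden[:split], module_labels[:split, k])
        accs.append(probe.score(hidden[split:], module_labels[split:, k]))
    return float(np.mean(accs))
```

High probe accuracy indicates that the task constituents are linearly decodable from the hidden activations, the property the abstract correlates with successful compositional generalization.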


Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs

Liu, Hongyi, Huang, Jiaji, Jia, Zhen, Park, Youngsuk, Wang, Yu-Xiang

arXiv.org Artificial Intelligence

Speculative decoding is widely used to accelerate large language model (LLM) inference. In this work, we focus on the online draft-model selection problem in speculative decoding. We design an algorithm that provably competes with the best draft model in hindsight for each query in terms of either the token acceptance probability or the expected acceptance length. In particular, we show that we can accurately evaluate all draft models, instead of only the chosen one, without incurring additional queries to the target model, which allows us to improve exponentially over the existing bandit-based approach as the number of draft models increases. Our approach is generically applicable to any speculative decoding method (single-draft, multi-draft, and draft-tree). Moreover, we design system-efficient versions of the online learners and demonstrate that the overhead in computation and latency can be substantially reduced. We conduct extensive experiments on open-source LLMs and diverse datasets, demonstrating that our methods substantially outperform the state-of-the-art EAGLE3 and the BanditSpec baseline in a variety of domains where specialized domain-expert drafters are available, especially when long reasoning chains are required.
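Observing a loss for every drafter on every query turns the selection problem from bandit feedback into full-information online learning, where exponential weights (Hedge) is the textbook no-regret algorithm. A minimal sketch, with an assumed per-query loss (e.g., one minus the acceptance rate) and an illustrative learning rate:

```python
import numpy as np

def hedge_drafter_selection(loss_stream, n_drafters, eta=0.1, seed=0):
    """Full-information exponential weights (Hedge) over drafters.
    loss_stream yields one loss per drafter per query (e.g. one minus
    its acceptance rate), observable for ALL drafters because every
    drafter's tokens can be scored against the same verified
    target-model outputs."""
    rng = np.random.default_rng(seed)
    cum_loss = np.zeros(n_drafters)
    for losses in loss_stream:
        w = np.exp(-eta * (cum_loss - cum_loss.min()))  # stabilized weights
        yield int(rng.choice(n_drafters, p=w / w.sum()))
        cum_loss += losses            # full loss vector, not one arm
```

Under full information, Hedge's regret grows only logarithmically in the number of drafters, whereas bandit-feedback algorithms pay polynomially in it; this is the kind of gap the abstract's exponential improvement refers to.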


Linear Transformers Implicitly Discover Unified Numerical Algorithms

Lutz, Patrick, Gangrade, Aditya, Daneshmand, Hadi, Saligrama, Venkatesh

arXiv.org Artificial Intelligence

We train a linear attention transformer on millions of masked-block matrix completion tasks: each prompt is a masked low-rank matrix whose missing block may be (i) a scalar prediction target or (ii) an unseen kernel slice for Nyström extrapolation. The model sees only input-output pairs and a mean-squared loss; it is given no normal equations, no handcrafted iterations, and no hint that the tasks are related. Surprisingly, after training, algebraic unrolling reveals the same parameter-free update rule across three distinct computational regimes (full visibility, rank-limited updates, and distributed computation). We prove that this rule achieves second-order convergence on full-batch problems, cuts distributed iteration complexity, and remains accurate with rank-limited attention. Thus, a transformer trained solely to patch missing blocks implicitly discovers a unified, resource-adaptive iterative solver spanning prediction, estimation, and Nyström extrapolation, highlighting a powerful capability of in-context learning.
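For reference, the closed-form target encoded by the masked-block tasks is the classical Nyström completion; the snippet below is textbook linear algebra, not the parameter-free iterative rule the paper reports recovering by unrolling the trained model:

```python
import numpy as np

def nystrom_complete(K11, K12, K21):
    """Classical Nystrom completion: for a low-rank PSD kernel matrix
    [[K11, K12], [K21, K22]], the hidden block satisfies
    K22 = K21 @ pinv(K11) @ K12 exactly when rank(K) == rank(K11)."""
    return K21 @ np.linalg.pinv(K11) @ K12
```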


Cross-Attention Speculative Decoding

Zhong, Wei, Bharadwaj, Manasa, Wang, Yixiao, Verma, Nikhil, Ji, Yipeng, Lee, Chul

arXiv.org Artificial Intelligence

Speculative decoding (SD) is a widely adopted approach for accelerating inference in large language models (LLMs), particularly when the draft and target models are well aligned. However, state-of-the-art SD methods typically rely on tightly coupled, self-attention-based Transformer decoders, often augmented with auxiliary pooling or fusion layers. This coupling makes them increasingly complex and harder to generalize across different models. We present Budget EAGLE (Beagle), the first, to our knowledge, cross-attention-based Transformer decoder SD model that achieves performance on par with leading self-attention SD models (EAGLE-v2) while eliminating the need for pooling or auxiliary components, simplifying the architecture, improving training efficiency, and maintaining stable memory usage during training-time simulation. To enable effective training of this novel architecture, we propose Two-Stage Block-Attention Training, a new method that achieves training stability and convergence efficiency in block-level attention scenarios. Extensive experiments across multiple LLMs and datasets show that Beagle achieves competitive inference speedups and higher training efficiency than EAGLE-v2, offering a strong alternative for architectures in speculative decoding.
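For context, the verification step shared by these SD variants is the standard speculative-sampling accept/reject test; the sketch below is the generic procedure, not Beagle's cross-attention drafter, and all names are illustrative:

```python
import numpy as np

def verify_draft(p_target, p_draft, draft_tokens, rng):
    """Standard speculative-sampling verification: accept draft token t
    with probability min(1, p_target(t) / p_draft(t)); on the first
    rejection, resample from the normalized residual distribution.
    p_target, p_draft: per-position probability vectors over the vocab."""
    out = []
    for i, t in enumerate(draft_tokens):
        if rng.random() < min(1.0, p_target[i][t] / p_draft[i][t]):
            out.append(int(t))
            continue
        resid = np.maximum(p_target[i] - p_draft[i], 0.0)
        out.append(int(rng.choice(len(resid), p=resid / resid.sum())))
        break
    return out
```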


The Next Layer: Augmenting Foundation Models with Structure-Preserving and Attention-Guided Learning for Local Patches to Global Context Awareness in Computational Pathology

Waqas, Muhammad, Bandyopadhyay, Rukhmini, Showkatian, Eman, Muneer, Amgad, Zafar, Anas, Alvarez, Frank Rojas, Marin, Maricel Corredor, Li, Wentao, Jaffray, David, Haymaker, Cara, Heymach, John, Vokes, Natalie I, Soto, Luisa Maren Solis, Zhang, Jianjun, Wu, Jia

arXiv.org Machine Learning

Foundation models have recently emerged as powerful feature extractors in computational pathology, yet they typically omit mechanisms for leveraging the global spatial structure of tissues and the local contextual relationships among diagnostically relevant regions -- key elements for understanding the tumor microenvironment. Multiple instance learning (MIL) remains an essential next step following foundation models, providing a framework to aggregate patch-level features into slide-level predictions. We present EAGLE-Net, a structure-preserving, attention-guided MIL architecture designed to augment prediction and interpretability. EAGLE-Net integrates multi-scale absolute spatial encoding to capture global tissue architecture, a top-K neighborhood-aware loss to focus attention on local microenvironments, and a background suppression loss to minimize false positives. We benchmarked EAGLE-Net on large pan-cancer datasets, including three cancer types for classification (10,260 slides) and seven cancer types for survival prediction (4,172 slides), using three distinct histology foundation backbones (REMEDIES, Uni-V1, Uni2-h). Across tasks, EAGLE-Net achieved up to 3% higher classification accuracy and the top concordance indices in 6 of 7 cancer types, producing smooth, biologically coherent attention maps that aligned with expert annotations and highlighted invasive fronts, necrosis, and immune infiltration. These results position EAGLE-Net as a generalizable, interpretable framework that complements foundation models, enabling improved biomarker discovery, prognostic modeling, and clinical decision support.
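The attention-guided MIL aggregation at the core of such architectures follows the standard attention pooling of Ilse et al. (2018), sketched below; this is the generic building block only, not EAGLE-Net's spatial encoding or neighborhood-aware losses, and all dimensions are assumed:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Generic attention-based MIL pooling (Ilse et al., 2018):
    patch features -> attention weights -> weighted slide embedding."""
    def __init__(self, d_in=1024, d_attn=256, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(d_in, d_attn), nn.Tanh(), nn.Linear(d_attn, 1))
        self.head = nn.Linear(d_in, n_classes)

    def forward(self, patches):                        # (num_patches, d_in)
        a = torch.softmax(self.attn(patches), dim=0)   # (num_patches, 1)
        slide = (a * patches).sum(dim=0)               # (d_in,)
        return self.head(slide), a                     # logits + attention map
```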


UK border officials to use AI to verify ages of child asylum seekers

The Guardian

Officials are to start using artificial intelligence to help estimate the age of asylum seekers who say they are children. Angela Eagle, the immigration minister, said on Tuesday the government would test technology that judges a person's age based on their facial features. It is the latest example of Labour ministers turning to AI to help solve problems with public services without spending significant amounts of money. The decision was announced on the same day that David Bolt, the chief inspector of borders and immigration, published a highly critical report into the haphazard way in which officials estimated the age of new arrivals. Eagle said in a written statement to parliament: "We have concluded that the most cost-effective option to pursue is likely to be facial age estimation, whereby AI technology – trained on millions of images where an individual's age is verifiable – is able to produce an age estimate with a known degree of accuracy for an individual whose age is unknown or disputed."