spade
SPADE: Spatial Transcriptomics and Pathology Alignment Using a Mixture of Data Experts for an Expressive Latent Space

Redekop, Ekaterina, Pleasure, Mara, Wang, Zichen, Flores, Kimberly, Sisk, Anthony, Speier, William, Arnold, Corey W.

arXiv.org Artificial Intelligence

The rapid growth of digital pathology and advances in self-supervised deep learning have enabled the development of foundation models for various pathology tasks across diverse diseases. While multimodal approaches integrating diverse data sources have emerged, a critical gap remains in the comprehensive integration of whole-slide images (WSIs) with spatial transcriptomics (ST), which captures molecular heterogeneity beyond standard hematoxylin & eosin (H&E) staining. We introduce SPADE, a foundation model that integrates histopathology with ST data to guide image representation learning within a unified framework, in effect creating an ST-informed latent space. Pre-trained on the comprehensive HEST-1k dataset, SPADE is evaluated on 20 downstream tasks, demonstrating significantly superior few-shot performance compared to baseline models and highlighting the benefits of integrating morphological and molecular information into one latent space.

Introduction

High-resolution whole-slide images (WSIs) have propelled the development of powerful deep learning foundation models in computational pathology, demonstrating robust performance across diverse tissue types and tasks [1, 2, 3, 4]. These models are typically trained using self-supervision, enabling learning from large unlabeled datasets and producing embeddings robust to institutional variations, including differences in staining procedures and other image-quality factors [5, 6, 7, 8]. By visually capturing cellular arrangement, WSIs enable the study of the spatial organization and disorganization of cells in tissues, characterizations that are especially relevant in cancer research [9, 10]. In clinical settings, WSIs are commonly stained with hematoxylin & eosin (H&E), a two-color stain that highlights nuclei and cytoplasm but offers a limited view of molecular-level heterogeneity [11].
As tumor tissues are known to exhibit high variability within and across patients, deciphering the heterogeneity at the molecular level is critical for improving deep learning applications that can more precisely inform diagnosis, treatment, and prognosis [12, 13]. While H&E provides crucial morphological insights, its inability to capture molecular heterogeneity limits its utility in fully characterizing tissue complexity. Spatial transcriptomics addresses this gap by providing spatially resolved gene expression data, allowing for additional molecular context for a given tissue specimen. Although both ST and H&E data have independently proven useful in various applications, their combined potential for creating a more comprehensive representation learning framework remains unexplored. To this end, we introduce SPADE, a vision-ST foundation model that uses a mixture of experts, each trained via contrastive learning, to unify ST data and H&E images to produce slide representations that encompass both modalities.
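As a concrete, heavily simplified illustration of the contrastive-alignment idea behind a vision-ST latent space, the sketch below pairs image-patch embeddings with spot-level gene-expression embeddings via a symmetric InfoNCE objective. All names, dimensions, and the temperature here are our own assumptions for illustration; SPADE's actual experts and training recipe are described in the paper.

```python
import numpy as np

def info_nce(img_emb, st_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched (image patch, ST spot) pairs are
    pulled together, mismatched pairs within the batch are pushed apart."""
    # L2-normalize both sets of embeddings so logits are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    st = st_emb / np.linalg.norm(st_emb, axis=1, keepdims=True)
    logits = img @ st.T / temperature

    def xent(l):
        # cross-entropy with the matched pair on the diagonal of each row
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # average the image-to-ST and ST-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 32))
aligned = info_nce(z, z + 0.01 * rng.normal(size=(8, 32)))   # paired modalities
random_pairs = info_nce(z, rng.normal(size=(8, 32)))          # unrelated modalities
print(aligned < random_pairs)  # well-aligned pairs yield a lower loss
```

A mixture-of-data-experts model would apply such an objective per expert; this sketch shows only the shared loss shape.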


Response to Reviewer

Neural Information Processing Systems

"...Provide some evidence that the parameterizations relate to structural properties of failure graphs arising from real..." There are many application scenarios where our parameterizations are reasonable. "Analyze the complexity of these problems in the more common scenarios 1, 2, and 3. Do the problems remain NP-hard..." We will add a specific remark about the above into the final version. "...the paper said that it would use a spade to mark statements with omitted proofs, but they were not actually marked with..." We apologize for this confusion. We will fix this in the final version. "It might be helpful to add a chart to the introduction, indicating the map from parameter to tractable/intractable." Thank you for this suggestion.


Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry

Hindupur, Sai Sumedh R., Lubana, Ekdeep Singh, Fel, Thomas, Ba, Demba

arXiv.org Artificial Intelligence

Sparse Autoencoders (SAEs) are widely used to interpret neural networks by identifying meaningful concepts from their representations. However, do SAEs truly uncover all concepts a model relies on, or are they inherently biased toward certain kinds of concepts? We introduce a unified framework that recasts SAEs as solutions to a bilevel optimization problem, revealing a fundamental challenge: each SAE imposes structural assumptions about how concepts are encoded in model representations, which in turn shapes what it can and cannot detect. This means different SAEs are not interchangeable -- switching architectures can expose entirely new concepts or obscure existing ones. To systematically probe this effect, we evaluate SAEs across a spectrum of settings: from controlled toy models that isolate key variables, to semi-synthetic experiments on real model activations, and finally to large-scale, naturalistic datasets. Across this progression, we examine two fundamental properties that real-world concepts often exhibit: heterogeneity in intrinsic dimensionality (some concepts are inherently low-dimensional, others are not) and nonlinear separability. We show that SAEs fail to recover concepts when these properties are ignored, and we design a new SAE that explicitly incorporates both, enabling the discovery of previously hidden concepts and reinforcing our theoretical insights. Our findings challenge the idea of a universal SAE and underscore the need for architecture-specific choices in model interpretability. Overall, we argue that an SAE does not just reveal concepts -- it determines what can be seen at all.
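The structural-assumption point can be made concrete with a toy numpy sketch (our own illustration, not the paper's architecture): a ReLU SAE assumes each concept is a fixed linear direction, and when the data actually satisfies that assumption, the SAE's codes are one-hot and recover the concepts exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 4 orthonormal "concept" directions in a 16-dim activation space
# (rows of an orthogonal matrix are orthonormal).
D = np.linalg.qr(rng.normal(size=(16, 16)))[0][:4]   # 4 x 16 dictionary

def sae_encode(x, D, bias=0.5):
    # ReLU encoder: a unit fires only when its concept direction is present
    return np.maximum(x @ D.T - bias, 0.0)

def sae_decode(z, D):
    # linear decoder: reconstruct as a sparse combination of dictionary rows
    return z @ D

# Each sample contains exactly one concept with strength 2
labels = rng.integers(0, 4, size=64)
X = 2.0 * D[labels]

Z = sae_encode(X, D)
X_hat = sae_decode(Z, D)

# Codes are one-hot and identify the generating concept; the bias shrinks
# the reconstruction by a constant factor (2 - 0.5 = 1.5, i.e. 0.75 * X).
print((Z > 0).sum(axis=1))
```

Concepts that are multi-dimensional or nonlinearly separable violate the "one direction per concept" assumption, which is exactly the failure mode the abstract describes.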


♠ SPADE ♠ Split Peak Attention DEcomposition

Wolff, Malcolm, Olivares, Kin G., Oreshkin, Boris, Ruan, Sunny, Yang, Sitan, Katoch, Abhinav, Ramasubramanian, Shankar, Zhang, Youxin, Mahoney, Michael W., Efimov, Dmitry, Quenneville-Bélair, Vincent

arXiv.org Machine Learning

Demand forecasting faces challenges induced by Peak Events (PEs), special periods such as promotions and holidays. Peak events create significant spikes in demand followed by demand ramp-down periods. Neural networks like MQCNN [14, 7] and MQT [3] overreact to demand peaks by carrying the elevated PE demand over into subsequent Post-Peak-Event (PPE) periods, resulting in significantly over-biased forecasts. To tackle this challenge, we introduce a neural forecasting model called Split Peak Attention DEcomposition (SPADE). This model reduces the impact of PEs on subsequent forecasts by splitting forecasting into two separate tasks: one for PEs and one for the rest. Its architecture uses masked convolution filters and a specialized Peak Attention module. We evaluate SPADE on a worldwide retail dataset with hundreds of millions of products. Our results reveal an overall PPE improvement of 4.5%, a 30% improvement for the forecasts most affected after promotions and holidays, and a 3.9% improvement in PE accuracy, relative to current production models.
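The core split idea can be sketched in a few lines (this is our minimal illustration with a moving-average base forecaster, not SPADE's masked-convolution/Peak-Attention architecture): if peak-event timesteps are masked out of the base series, the base forecaster never "sees" the spike and therefore cannot carry it into post-peak periods.

```python
import numpy as np

def split_forecast(y, pe_mask, window=3):
    """Split sketch: hide peak-event timesteps from a simple moving-average
    base forecaster, and model the peak lift separately from PE history."""
    y = np.asarray(y, dtype=float)
    base = y.copy()
    base[pe_mask] = np.nan                     # mask peaks out of the base series
    recent = base[~np.isnan(base)][-window:]   # last `window` non-peak points
    base_next = recent.mean()                  # base forecast for the next step
    # peak component: how much a PE lifts demand above the base level
    peak_lift = y[pe_mask].mean() - base_next if pe_mask.any() else 0.0
    return base_next, peak_lift

y = np.array([10, 11, 9, 50, 10, 10])          # spike at t=3 (e.g. a promotion)
pe = np.array([False, False, False, True, False, False])

naive_next = y[-3:].mean()                     # plain moving average: ~23.3, over-biased
base_next, peak_lift = split_forecast(y, pe)   # base: ~9.7, unaffected by the spike
```

The naive forecaster's window still contains the spike, so its post-peak forecast is inflated; the split forecaster's is not, which is the over-bias the abstract quantifies.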


Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary

Tao, Meiling, Liang, Xuechen, Tao, Yiling, Shi, Tianyu

arXiv.org Artificial Intelligence

Recent advancements in large language models (LLMs) have unlocked the potential for generating high-quality game commentary. However, producing insightful and engaging commentary for complex games with incomplete information remains a significant challenge. In this paper, we introduce a novel commentary method that combines Reinforcement Learning (RL) and LLMs, tailored specifically for the Chinese card game Guandan. Our system leverages RL to generate intricate card-playing scenarios and employs LLMs to generate corresponding commentary text, effectively emulating the strategic analysis and narrative prowess of professional commentators. The framework comprises a state commentary guide, a Theory of Mind (ToM)-based strategy analyzer, and a style retrieval module, which seamlessly collaborate to deliver detailed and context-relevant game commentary in the Chinese language environment. We empower LLMs with ToM capabilities and refine both retrieval and information filtering mechanisms. This facilitates the generation of personalized commentary content. Our experimental results showcase the substantial enhancement in performance achieved by the proposed commentary framework when applied to open-source LLMs, surpassing the performance of GPT-4 across multiple evaluation metrics.


Synthesizing Traffic Datasets using Graph Neural Networks

Rodriguez-Criado, Daniel, Chli, Maria, Manso, Luis J., Vogiatzis, George

arXiv.org Artificial Intelligence

Traffic congestion in urban areas presents significant challenges, and Intelligent Transportation Systems (ITS) have sought to address these via automated and adaptive controls. However, these systems often struggle to transfer simulated experiences to real-world scenarios. This paper introduces a novel methodology for bridging this 'sim-real' gap by creating photorealistic images from 2D traffic simulations and recorded junction footage. We propose a novel image generation approach, integrating a Conditional Generative Adversarial Network with a Graph Neural Network (GNN) to facilitate the creation of realistic urban traffic images. We harness GNNs' ability to process information at different levels of abstraction alongside segmented images for preserving locality data. The presented architecture leverages the power of SPADE and Graph ATtention (GAT) network models to create images based on simulated traffic scenarios. These images are conditioned by factors such as entity positions, colors, and time of day. The uniqueness of our approach lies in its ability to effectively translate structured and human-readable conditions, encoded as graphs, into realistic images. This advancement contributes to applications requiring rich traffic image datasets, from data augmentation to urban traffic solutions. We further provide an application to test the model's capabilities, including generating images with manually defined positions for various entities.
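For readers unfamiliar with the SPADE building block this paper leverages (spatially-adaptive denormalization from GAN-based image synthesis), here is a minimal numpy sketch: activations are instance-normalized, then re-scaled and re-shifted with per-pixel gamma/beta maps predicted from the segmentation map. Real implementations predict gamma and beta with small convnets; we use a per-pixel linear map over one-hot class channels purely for illustration.

```python
import numpy as np

def spade_norm(x, segmap, W_gamma, W_beta, eps=1e-5):
    """Minimal SPADE layer.
    x:      (C, H, W) activations
    segmap: (K, H, W) one-hot semantic class map
    W_*:    (C, K) per-class modulation weights (stand-in for a convnet)"""
    # instance-normalize each channel over its spatial extent
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / np.sqrt(var + eps)
    # per-pixel scale and shift driven by the semantic class at that pixel
    gamma = np.einsum('ck,khw->chw', W_gamma, segmap)
    beta = np.einsum('ck,khw->chw', W_beta, segmap)
    return (1 + gamma) * x_norm + beta

rng = np.random.default_rng(3)
C, K, H, W = 3, 2, 4, 4
x = rng.normal(size=(C, H, W))
segmap = np.zeros((K, H, W))
segmap[0, :, :2] = 1.0      # left half of the image: class 0 (e.g. road)
segmap[1, :, 2:] = 1.0      # right half: class 1 (e.g. vehicle)
out = spade_norm(x, segmap, rng.normal(size=(C, K)), rng.normal(size=(C, K)))
```

Because gamma and beta vary per pixel with the segmentation map, layout information (entity positions in this paper's case) survives normalization instead of being washed out.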


SPADE: Sparsity-Guided Debugging for Deep Neural Networks

Moakhar, Arshia Soltani, Iofinova, Eugenia, Alistarh, Dan

arXiv.org Artificial Intelligence

Interpretability, broadly defined as the set of mechanisms for understanding why and how machine learning models reach their decisions, is one of the key open goals at the intersection of deep learning theory and practice. Towards this goal, multiple tools have been proposed to aid a human examiner in reasoning about a network's behavior, either in general or on a set of instances. However, the outputs of these tools, such as input saliency maps or neuron visualizations, are frequently difficult for a human to interpret, or even misleading, in particular because neurons can be multifaceted: a single neuron can be associated with multiple distinct feature combinations. In this paper, we present a new general approach to address this problem, called SPADE, which, given a trained model and a target sample, uses sample-targeted pruning to provide a "trace" of the network's execution on the sample, reducing the network to the connections that are most relevant to the specific prediction. We demonstrate that preprocessing with SPADE significantly increases both the accuracy of image saliency maps across several interpretability methods and the usefulness of neuron visualizations, aiding humans in reasoning about network behavior. Our findings show that sample-specific pruning of connections can disentangle multifaceted neurons, leading to consistently improved interpretability.
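To see why pruning should be sample-targeted, consider this toy numpy sketch (a simplification of the idea, not the paper's exact procedure): scoring each first-layer weight by its contribution on the specific input, |w_ij * x_j|, and keeping the top half provably discards less of this sample's computation than scoring by weight magnitude |w_ij| alone.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical first layer of a tiny network and one target sample
W1 = rng.normal(size=(8, 4))
x = rng.normal(size=4)

# Per-sample contribution of each connection to its unit's pre-activation
contrib = np.abs(W1 * x)          # |w_ij * x_j|, broadcast over rows
k = contrib.size // 2             # prune half of the connections

def removed_mass(scores, k_keep):
    """Total per-sample contribution lost when keeping the k_keep
    highest-scored connections (under whatever scoring rule `scores` is)."""
    order = np.argsort(scores, axis=None)[::-1]
    keep = np.zeros(scores.size, dtype=bool)
    keep[order[:k_keep]] = True
    return contrib.reshape(-1)[~keep].sum()

lost_sample_targeted = removed_mass(contrib, k)      # rank by |w * x|
lost_magnitude_only = removed_mass(np.abs(W1), k)    # rank by |w| alone
```

Ranking by |w * x| keeps exactly the top-k contributions for this input, so the removed mass is minimal by construction; magnitude-only pruning can discard connections that happen to matter for this particular sample.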


Belief Revision from Probability

Goodman, Jeremy, Salow, Bernhard

arXiv.org Artificial Intelligence

In previous work ("Knowledge from Probability", TARK 2021) we develop a question-relative, probabilistic account of belief. On this account, what someone believes relative to a given question is (i) closed under entailment, (ii) sufficiently probable given their evidence, and (iii) sensitive to the relative probabilities of the answers to the question. Here we explore the implications of this account for the dynamics of belief. We show that the principles it validates are much weaker than those of orthodox theories of belief revision like AGM, but still stronger than those valid according to the popular Lockean theory of belief, which equates belief with high subjective probability. We then consider a restricted class of models, suitable for many but not all applications, and identify some further natural principles valid on this class. We conclude by arguing that the present framework compares favorably to the rival probabilistic accounts of belief developed by Leitgeb and by Lin and Kelly.
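The contrast with the Lockean theory can be illustrated with the classic lottery case. The sketch below is our own simplified operationalization (the threshold-ratio rule is an assumption, not the paper's formal definition): ruling out an answer only when it is much less probable than the most probable answer leaves belief closed under entailment, whereas a bare probability threshold does not.

```python
# Lottery with 5 tickets; the question "which ticket wins?" has 5 answers.
answers = {f"ticket {i} wins": 0.2 for i in range(5)}

# Lockean belief: believe any proposition with probability above a threshold.
t = 0.7
each_loses_believed = all((1 - p) > t for p in answers.values())  # 0.8 > 0.7
all_lose_believed = 0.0 > t   # but "no ticket wins" has probability 0
# So the Lockean believes each conjunct yet not their conjunction.

def live_answers(probs, ratio=0.5):
    """Question-relative sketch: rule out an answer only if it is far less
    probable than the most probable answer (ratio is our assumption)."""
    top = max(probs.values())
    return {a for a, p in probs.items() if p >= ratio * top}

# All answers are equally probable, so none is ruled out: the agent believes
# only the disjunction "some ticket wins", avoiding the lottery paradox.
live = live_answers(answers)
```

Since what is believed is always a disjunction of live answers, closure under entailment holds by construction, which is clause (i) of the account.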


Towards Computationally Efficient Responsibility Attribution in Decentralized Partially Observable MDPs

Triantafyllou, Stelios, Radanovic, Goran

arXiv.org Artificial Intelligence

Responsibility attribution is a key concept of accountable multi-agent decision making. Given a sequence of actions, responsibility attribution mechanisms quantify the impact of each participating agent on the final outcome. One such popular mechanism is based on actual causality, and it assigns (causal) responsibility based on the actions that were found to be pivotal for the considered outcome. However, the inherent problem of pinpointing actual causes, and consequently determining the exact responsibility assignment, has been shown to be computationally intractable. In this paper, we aim to provide a practical algorithmic solution to the problem of responsibility attribution under a computational budget. We first formalize the problem in the framework of Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) augmented by a specific class of Structural Causal Models (SCMs). Under this framework, we introduce a Monte Carlo Tree Search (MCTS) type of method which efficiently approximates the agents' degrees of responsibility. This method utilizes the structure of a novel search tree and a pruning technique, both tailored to the problem of responsibility attribution. Other novel components of our method are (a) a child selection policy based on linear scalarization and (b) a backpropagation procedure that accounts for a minimality condition that is typically used to define actual causality. We experimentally evaluate the efficacy of our algorithm through a simulation-based test-bed, which includes three team-based card games.
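The pivotality idea at the heart of actual-cause responsibility can be shown with a plain Monte Carlo sketch (far simpler than the paper's MCTS over Dec-POMDP trajectories; the outcome function, shock model, and scoring rule are all our own toy assumptions): an agent's degree of responsibility is approximated as the fraction of sampled environment states in which unilaterally switching that agent's action would have averted the failure.

```python
import random

random.seed(0)

def outcome(actions, shock):
    """Toy team outcome: failure (True) if the number of agents playing
    action 0, plus a random environment shock, reaches a threshold."""
    return sum(a == 0 for a in actions) + shock >= 2

actual = [0, 0, 1]   # the joint action that actually occurred

def mc_responsibility(actual, n=5000):
    """Monte Carlo pivotality estimate: for each agent, sample environment
    shocks and count how often flipping only that agent's action turns a
    failure into a non-failure."""
    scores = []
    for i, a in enumerate(actual):
        pivotal = 0
        for _ in range(n):
            shock = random.choice([0, 1])       # unobserved environment state
            if not outcome(actual, shock):
                continue                         # no failure, nothing to avert
            cf = list(actual)
            cf[i] = 1 - a                        # counterfactual: flip agent i
            if not outcome(cf, shock):
                pivotal += 1
        scores.append(pivotal / n)
    return scores

r = mc_responsibility(actual)
# Agents 0 and 1 are pivotal whenever the shock is low; agent 2 never is.
```

The paper's method replaces this brute-force sampling with a search tree, pruning, and a minimality-aware backpropagation, but the quantity being approximated is of this pivotality kind.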