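is as powerful as CWL with the generalised update rule $\mathrm{HASH}\big(c^t_\sigma, c^t_{\mathcal{B}}(\sigma), c^t_{\mathcal{C}}(\sigma), c^t_{\downarrow}(\sigma), c^t_{\uparrow}(\sigma)\big)$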

Neural Information Processing Systems

A.1 Cellular WL Results. In this section, we assume basic familiarity with the WL test and its higher-order variants. For an introduction to these topics, we refer the reader to the survey of Sato [62]. We begin by introducing a few useful concepts. A cellular colouring is a map $c$ that maps a cell complex $X$ and one of its cells $\sigma$ to a colour $c^X_\sigma$ from a fixed colour palette. Let $X, Y$ be two regular cell complexes and $c$ a cellular colouring. We say that $X, Y$ are $c$-similar, denoted by $c^X = c^Y$, if the number of cells in $X$ coloured with a given colour equals the number of cells in $Y$ with the same colour. Otherwise, we have $c^X \neq c^Y$. We emphasise that in this paper we are interested only in colourings $c$ with the property that any two isomorphic cell complexes are $c$-similar. A cellular colouring $c$ refines a cellular colouring $d$, denoted by $c \sqsubseteq d$, if for all cell complexes $X$ and $Y$ and all $\sigma \in P_X$ and $\tau \in P_Y$, $c^X_\sigma = c^Y_\tau$ implies $d^X_\sigma = d^Y_\tau$. Additionally, if $d \sqsubseteq c$, we say the two colourings are equivalent and we represent it by $c \equiv d$. We state the following result from Bodnar et al. [8] about simplicial colourings, which we translate here directly to cell complexes. The proof is, however, identical, and we refer the reader to their work for that. Let $X, Y$ be any regular cell complexes with $A \subseteq P_X$ and $B \subseteq P_Y$. Consider two cellular colourings $c, d$ such that $c \sqsubseteq d$.
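The definition of c-similarity above amounts to comparing colour histograms. The following is a minimal sketch of that check, assuming a colouring is stored as a plain dictionary from cell identifiers to colours. The function names and the toy complexes are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


def colour_histogram(colouring):
    """Count how many cells carry each colour.

    `colouring` maps a cell identifier to its colour (any hashable value).
    """
    return Counter(colouring.values())


def are_c_similar(colouring_x, colouring_y):
    """Return True when both complexes have identical colour histograms."""
    return colour_histogram(colouring_x) == colour_histogram(colouring_y)


# Hypothetical toy data: every cell is coloured by its dimension.
triangle = {"v0": 0, "v1": 0, "v2": 0, "e01": 1, "e12": 1, "e02": 1, "f": 2}
relabelled = {"a": 0, "b": 0, "c": 0, "ab": 1, "bc": 1, "ac": 1, "abc": 2}
hollow_triangle = {"v0": 0, "v1": 0, "v2": 0, "e01": 1, "e12": 1, "e02": 1}

print(are_c_similar(triangle, relabelled))       # True
print(are_c_similar(triangle, hollow_triangle))  # False
```

The relabelled triangle is isomorphic to the first one, so a colouring of the kind considered here must report the two as similar. The hollow triangle lacks the 2-cell, and the histograms differ.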


Weisfeiler and Lehman Go Cellular: CW Networks

Neural Information Processing Systems

Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models can be severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs.


QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model

Neural Information Processing Systems

Recent advancements in State Space Models, notably Mamba, have demonstrated superior performance over the dominant Transformer models, particularly in reducing the computational complexity from quadratic to linear. Yet, difficulties in adapting Mamba from language to vision tasks arise due to the distinct characteristics of visual data, such as the spatial locality and adjacency within images and large variations in information granularity across visual tokens. Existing vision Mamba approaches either flatten tokens into sequences in a raster scan fashion, which breaks the local adjacency of images, or manually partition tokens into windows, which limits their long-range modeling and generalization capabilities. To address these limitations, we present a new vision Mamba model, coined QuadMamba, that effectively captures local dependencies of varying granularities via quadtree-based image partition and scan. Concretely, our lightweight quadtree-based scan module learns to preserve the 2D locality of spatial regions within learned window quadrants.
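The claim that raster flattening breaks local adjacency can be illustrated with a small sketch. It compares a row-major scan with a fixed-window scan on a 4 x 4 token grid and measures how far apart two neighbouring tokens land in the flattened sequence. This is only an illustration of the two baseline scans mentioned above, not the learned quadtree scan of QuadMamba. All function names are assumptions made for the example.

```python
def raster_order(height, width):
    """Row-major flattening of a height x width token grid."""
    return [(r, c) for r in range(height) for c in range(width)]


def window_order(height, width, window):
    """Visit tokens window by window, row-major inside each window."""
    order = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            for r in range(top, min(top + window, height)):
                for c in range(left, min(left + window, width)):
                    order.append((r, c))
    return order


def sequence_gap(order, a, b):
    """Distance between two tokens in the flattened sequence."""
    return abs(order.index(a) - order.index(b))


raster = raster_order(4, 4)
windows = window_order(4, 4, 2)

# (0, 0) and (1, 0) are vertical neighbours in the image.
print(sequence_gap(raster, (0, 0), (1, 0)))   # 4
print(sequence_gap(windows, (0, 0), (1, 0)))  # 2
```

Fixed windows keep neighbours inside a window close together, but tokens that sit next to each other across a window border, such as (0, 1) and (0, 2), end up further apart than in the raster scan.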






Constraint- and Score-Based Nonlinear Granger Causality Discovery with Kernels

arXiv.org Machine Learning

Granger causality (GC) [15] is a time series causal discovery framework that uses predictive modeling to identify the underlying causal structure of a time series system. Relying on the assumption that cause precedes effect, GC assesses whether including the lagged information from one time series in the autoregressive model of a second time series enhances its predictions. This improvement indicates a predictive relationship between the time series variables, where one time series provides supplemental information about the future of another time series, thereby signifying the presence of a (Granger) causal relationship. GC requires only observational data, and has been used for time series causal discovery across diverse domains, including climate science [33], political and social sciences [17], econometrics [4], and biological systems studies [13]. The original formulation of GC requires several assumptions to be satisfied for causal identifiability. With regard to the candidate time series system, it is assumed that the time series variables are stationary, and that all variables are observed (absence of latent confounders). GC was initially proposed for bivariate time series systems, but was generalised for the multivariate setting to accommodate the assumption that all relevant variables are included in the analysis [15]. Additional assumptions are made with regard to the types of causal relationships that can be identified within the time series system. GC cannot estimate a causal relationship between time series at an instantaneous time point, relying on the relationship between the lags and predicted values to determine a GC relationship.
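The comparison described above, an autoregressive model with and without the lags of a second series, can be sketched for the classical linear, bivariate case. The sketch below is a simplification under stated assumptions: a single lag, zero-mean series, no intercept, and a plain ratio of residual sums of squares in place of a formal F-test. It is not the kernel-based nonlinear method of the paper. The synthetic data and all function names are hypothetical.

```python
import random


def rss_one_regressor(target, u):
    """Residual sum of squares for the least-squares fit target ~ a*u."""
    a = sum(t * p for t, p in zip(target, u)) / sum(p * p for p in u)
    return sum((t - a * p) ** 2 for t, p in zip(target, u))


def rss_two_regressors(target, u, v):
    """Residual sum of squares for target ~ a*u + b*v via normal equations."""
    s11 = sum(p * p for p in u)
    s22 = sum(q * q for q in v)
    s12 = sum(p * q for p, q in zip(u, v))
    t1 = sum(t * p for t, p in zip(target, u))
    t2 = sum(t * q for t, q in zip(target, v))
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return sum((t - a * p - b * q) ** 2 for t, p, q in zip(target, u, v))


def granger_rss_ratio(cause, effect):
    """RSS(full) / RSS(restricted) for lag-1 models of `effect`.

    Values well below 1 mean the lag of `cause` improves the prediction.
    """
    target, own_lag, other_lag = effect[1:], effect[:-1], cause[:-1]
    restricted = rss_one_regressor(target, own_lag)
    full = rss_two_regressors(target, own_lag, other_lag)
    return full / restricted


# Synthetic example (an assumption): x drives y with one step of delay.
rng = random.Random(0)
n = 300
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1))

forward = granger_rss_ratio(x, y)  # does x help predict y?
reverse = granger_rss_ratio(y, x)  # does y help predict x?
print(f"x -> y ratio: {forward:.3f}")
print(f"y -> x ratio: {reverse:.3f}")
```

On this data the forward ratio should be small, while the reverse ratio should stay close to 1, which matches the asymmetric dependence built into the example. A practical analysis would use more lags, an intercept, and a significance test.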


Sparse Bayesian Message Passing under Structural Uncertainty

arXiv.org Machine Learning

Semi-supervised learning on real-world graphs is frequently challenged by heterophily, where the observed graph is unreliable or label-disassortative. Many existing graph neural networks either rely on a fixed adjacency structure or attempt to handle structural noise through regularization. In this work, we explicitly capture structural uncertainty by modeling a posterior distribution over signed adjacency matrices, allowing each edge to be positive, negative, or absent. We propose a sparse signed message passing network that is naturally robust to edge noise and heterophily, which can be interpreted from a Bayesian perspective. By combining (i) posterior marginalization over signed graph structures with (ii) sparse signed message aggregation, our approach offers a principled way to handle both edge noise and heterophily. Experimental results demonstrate that our method outperforms strong baseline models on heterophilic benchmarks under both synthetic and real-world structural noise. We provide an anonymous repository at: https://anonymous.4open.science/r/SpaM-F2C8


InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

arXiv.org Artificial Intelligence

As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question. We present InvisibleInk, a highly scalable long-form text generation framework satisfying rigorous differential privacy guarantees with respect to the sensitive reference texts. It interprets sampling from the LLM's next-token-distribution as the exponential mechanism over the LLM logits with two innovations. First, we reduce the privacy cost by isolating and clipping only the sensitive information in the model logits (relative to the public logits). Second, we improve text quality by sampling without any privacy cost from a small superset of the top-$k$ private tokens. Empirical evaluations demonstrate a consistent $8\times$ (or more) reduction in computation cost over state-of-the-art baselines to generate long-form private text of the same utility across privacy levels. InvisibleInk is able to generate, for the first time, high-quality private long-form text at less than $4$-$8\times$ the computation cost of non-private generation, paving the way for its practical use. We open-source a pip-installable Python package (invink) for InvisibleInk at https://github.com/cerai-iitm/invisibleink.
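The idea of clipping only the private-minus-public part of the logits and then sampling from a softmax can be sketched as follows. This is a rough illustration under assumptions, not the invink API and not the authors' algorithm: the element-wise clipping, the parameter names `clip_norm` and `temperature`, and the toy logits are all hypothetical, and the sketch carries no privacy guarantee by itself. Sampling with probability proportional to `exp(score / temperature)` is the general form used by the exponential mechanism.

```python
import math
import random


def clip_relative_to_public(private_logits, public_logits, clip_norm):
    """Clip only the private-minus-public difference, element-wise."""
    return [
        pub + max(-clip_norm, min(clip_norm, priv - pub))
        for priv, pub in zip(private_logits, public_logits)
    ]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw a token index with probability proportional to exp(logit / T)."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a three-token vocabulary.
private_logits = [5.0, 0.0, -3.0]  # conditioned on sensitive references
public_logits = [1.0, 0.5, 0.0]    # conditioned on public context only

clipped = clip_relative_to_public(private_logits, public_logits, clip_norm=1.0)
print(clipped)  # [2.0, 0.0, -1.0]

rng = random.Random(0)
token = sample_token(clipped, temperature=1.0, rng=rng)
print(token)
```

Clipping the difference instead of the raw logits bounds how far the sensitive references can move any score away from its public value, which is the quantity a privacy analysis would need to control.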