Goto

Collaborating Authors

 codeword




GT-SNT: A Linear-Time Transformer for Large-Scale Graphs via Spiking Node Tokenization

Zhang, Huizhe, Li, Jintang, Zhu, Yuchang, Zhong, Huazhen, Chen, Liang

arXiv.org Artificial Intelligence

Graph Transformers (GTs), which integrate message passing and self-attention mechanisms simultaneously, have achieved promising empirical results in graph prediction tasks. However, the design of scalable and topology-aware node tok-enization has lagged behind other modalities. This gap becomes critical as the quadratic complexity of full attention renders them impractical on large-scale graphs. Recently, Spiking Neural Networks (SNNs), as brain-inspired models, provided an energy-saving scheme to convert input intensity into discrete spike-based representations through event-driven spiking neurons. Inspired by these characteristics, we propose a linear-time Graph Transformer with Spiking Node Tokenization (GT -SNT) for node classification. By integrating multi-step feature propagation with SNNs, spiking node tokenization generates compact, locality-aware spike count embeddings as node tokens to avoid predefined code-books and their utilization issues. The codebook guided self-attention leverages these tokens to perform node-to-token attention for linear-time global context aggregation. In experiments, we compare GT -SNT with other state-of-the-art baselines on node classification datasets ranging from small to large. Experimental results show that GT -SNT achieves comparable performances on most datasets and reaches up to 130 faster inference speed compared to other GTs.


Consistency Flow Model Achieves One-step Denoising Error Correction Codes

Lei, Haoyu, Lau, Chin Wa, Zhou, Kaiwen, Guo, Nian, Farnia, Farzan

arXiv.org Artificial Intelligence

Error Correction Codes (ECC) are fundamental to reliable digital communication, yet designing neural decoders that are both accurate and computationally efficient remains challenging. Recent denoising diffusion decoders with transformer backbones achieve state-of-the-art performance, but their iterative sampling limits practicality in low-latency settings. We introduce the Error Correction Consistency Flow Model (ECCFM), an architecture-agnostic training framework for high-fidelity one-step decoding. By casting the reverse denoising process as a Probability Flow Ordinary Differential Equation (PF-ODE) and enforcing smoothness through a differential time regularization, ECCFM learns to map noisy signals along the decoding trajectory directly to the original codeword in a single inference step. Across multiple decoding benchmarks, ECCFM attains lower bit-error rates (BER) than autoregressive and diffusion-based baselines, with notable improvements on longer codes, while delivering inference speeds up from 30x to 100x faster than denoising diffusion decoders.


The Coding Limits of Robust Watermarking for Generative Models

Francati, Danilo, Goonatilake, Yevin Nikhel, Pawar, Shubham, Venturi, Daniele, Ateniese, Giuseppe

arXiv.org Artificial Intelligence

We ask a basic question about cryptographic watermarking for generative models: to what extent can a watermark remain reliable when an adversary is allowed to corrupt the encoded signal? To study this question, we introduce a minimal coding abstraction that we call a zero-bit tamper-detection code. This is a secret-key procedure that samples a pseudorandom codeword and, given a candidate word, decides whether it should be treated as unmarked content or as the result of tampering with a valid codeword. It captures the two core requirements of robust watermarking: soundness and tamper detection. Within this abstraction we prove a sharp unconditional limit on robustness to independent symbol corruption. For an alphabet of size $q$, there is a critical corruption rate of $1 - 1/q$ such that no scheme with soundness, even relaxed to allow a fixed constant false positive probability on random content, can reliably detect tampering once an adversary can change more than this fraction of symbols. In particular, in the binary case no cryptographic watermark can remain robust if more than half of the encoded bits are modified. We also show that this threshold is tight by giving simple information-theoretic constructions that achieve soundness and tamper detection for all strictly smaller corruption rates. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact.


Large Margin Discriminant Dimensionality Reduction in Prediction Space

Mohammad Saberian, Jose Costa Pereira, Nuno Nvasconcelos, Can Xu

Neural Information Processing Systems

In this paper we establish a duality between boosting and SVM, and use this to derive a novel discriminant dimensionality reduction algorithm. In particular, using the multiclass formulation of boosting and SVM we note that both use a combination of mapping and linear classification to maximize the multiclass margin.





Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning

Deproost, Senne, Steckelmacher, Dennis, Nowé, Ann

arXiv.org Artificial Intelligence

Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network, that lacks transparency, which poses challenges when the controller has to meet regulations, or foster trust. To alleviate this, one could transfer the learned behaviour into a model that is human-readable by design using knowledge distilla- tion. Often this is done with a single model which mimics the original model on average but could struggle in more dynamic situations. A key challenge is that this simpler model should have the right balance be- tween flexibility and complexity or right balance between balance bias and accuracy. We propose a new model-agnostic method to divide the state space into regions where a simplified, human-understandable model can operate in. In this paper, we use Voronoi partitioning to find regions where linear models can achieve similar performance to the original con- troller. We evaluate our approach on a gridworld environment and a classic control task. We observe that our proposed distillation to locally- specialized linear models produces policies that are explainable and show that the distillation matches or even slightly outperforms the black-box policy they are distilled from.