neighbour
- Asia > Afghanistan > Parwan Province > Charikar (0.05)
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
- Asia > Middle East > Jordan (0.04)
Retrieval & Fine-Tuning for In-Context Tabular Models
Tabular data is a pervasive modality spanning a wide range of domains, and this inherent diversity poses a considerable challenge for deep learning. Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex tabular datasets, but have struggled to scale to larger and more complex ones. To address this limitation, we propose a combination of retrieval and fine-tuning: we can adapt the transformer to a local subset of the data by collecting nearest neighbours, and then perform task-specific fine-tuning with this retrieved set of neighbours in context. Using TabPFN as the base model -- currently the best tabular in-context learner -- and applying our retrieval and fine-tuning scheme on top results in what we call a locally-calibrated PFN, or LoCalPFN. We conduct extensive evaluation on 95 datasets curated by TabZilla from OpenML, upon which we establish a new state-of-the-art with LoCalPFN -- even with respect to tuned tree-based models. Notably, we show a significant boost in performance compared to the base in-context model, demonstrating the efficacy of our approach and advancing the frontier of deep learning in tabular data.
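The retrieval step described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance over raw features and stands in a majority vote for the in-context model; it does not call TabPFN and does not perform the fine-tuning stage. The names `retrieve_context` and `predict_with_local_context` and the toy data are hypothetical.

```python
import math
from collections import Counter

def retrieve_context(X_train, y_train, x_query, k):
    """Return the k training rows closest to x_query (Euclidean distance)."""
    order = sorted(range(len(X_train)),
                   key=lambda i: math.dist(X_train[i], x_query))
    idx = order[:k]
    return [X_train[i] for i in idx], [y_train[i] for i in idx]

def predict_with_local_context(X_train, y_train, x_query, k=3):
    # Placeholder for the in-context model (assumption): a majority vote
    # over the retrieved labels. A real system would instead pass the
    # retrieved rows to the transformer as its context.
    _, ctx_y = retrieve_context(X_train, y_train, x_query, k)
    return Counter(ctx_y).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated clusters.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
           [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
y_train = [0, 0, 0, 1, 1, 1]

print(predict_with_local_context(X_train, y_train, [0.05, 0.05]))
print(predict_with_local_context(X_train, y_train, [5.0, 5.1]))
```

The point of the sketch is only that each query sees a context built from its own neighbourhood rather than from the full training set.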
Heterophily-informed Message Passing
Wang, Haishan, Solin, Arno, Garg, Vikas
Graph neural networks (GNNs) are known to be vulnerable to oversmoothing due to their implicit homophily assumption. We mitigate this problem with a novel scheme that regulates the aggregation of messages, modulating the type and extent of message passing locally thereby preserving both the low and high-frequency components of information. Our approach relies solely on learnt embeddings, obviating the need for auxiliary labels, thus extending the benefits of heterophily-aware embeddings to broader applications, e.g., generative modelling. Our experiments, conducted across various data sets and GNN architectures, demonstrate performance enhancements and reveal heterophily patterns across standard classification benchmarks. Furthermore, application to molecular generation showcases notable performance improvements on chemoinformatics benchmarks.
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas (0.04)
- Europe > Finland (0.04)
Retrieval-Augmented Memory for Online Learning
Retrieval-augmented models couple parametric predictors with non-parametric memories, but their use in streaming supervised learning with concept drift is not well understood. We study online classification in non-stationary environments and propose Retrieval-Augmented Memory for Online Learning (RAM-OL), a simple extension of stochastic gradient descent that maintains a small buffer of past examples. At each time step, RAM-OL retrieves a few nearest neighbours of the current input in the hidden representation space and updates the model jointly on the current example and the retrieved neighbours. We compare a naive replay variant with a gated replay variant that constrains neighbours using a time window, similarity thresholds, and gradient reweighting, in order to balance fast reuse of relevant past data against robustness to outdated regimes. From a theoretical perspective, we interpret RAM-OL under a bounded drift model and discuss how retrieval can reduce adaptation cost and improve regret constants when patterns recur over time. Empirically, we instantiate RAM-OL on a simple online multilayer perceptron and evaluate it on three real-world data streams derived from electricity pricing, electricity load, and airline delay data. On strongly and periodically drifting streams, RAM-OL improves prequential accuracy by up to about seven percentage points and greatly reduces variance across random seeds, while on a noisy airline stream the gated variant closely matches the purely online baseline. These results show that retrieval-augmented memory is a practical and robust tool for online learning under concept drift.
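The naive replay variant described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses online logistic regression instead of a multilayer perceptron, retrieves neighbours in raw input space rather than in a hidden representation space, and omits the gating (time window, similarity thresholds, gradient reweighting). The class name `ReplayOnlineLogReg` and the toy stream are hypothetical.

```python
import math
from collections import deque

class ReplayOnlineLogReg:
    """Online logistic regression with nearest-neighbour replay from a small buffer."""

    def __init__(self, dim, lr=0.1, buffer_size=50, k=2):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr
        self.k = k
        # Bounded memory: the oldest examples are dropped automatically.
        self.buffer = deque(maxlen=buffer_size)

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def _sgd_step(self, x, y):
        # Gradient of the log loss for a single example.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

    def retrieve(self, x):
        # Assumption: Euclidean distance on raw inputs, not hidden features.
        ranked = sorted(self.buffer, key=lambda ex: math.dist(ex[0], x))
        return ranked[: self.k]

    def update(self, x, y):
        neighbours = self.retrieve(x)
        self._sgd_step(x, y)           # step on the current example
        for nx, ny in neighbours:      # replay the retrieved neighbours
            self._sgd_step(nx, ny)
        self.buffer.append((x, y))

# Toy stream (hypothetical): the label is 1 when the first feature is positive.
stream = [([1.0, 0.5], 1), ([-1.0, 0.5], 0)] * 20
model = ReplayOnlineLogReg(dim=2)
for x, y in stream:
    model.update(x, y)

print(model.predict_proba([1.0, 0.5]), model.predict_proba([-1.0, 0.5]))
```

A gated variant would filter the result of `retrieve` before replay, for example by discarding neighbours that are too old or too dissimilar.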
- Oceania > Australia > New South Wales (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > California > Orange County > Irvine (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Education > Educational Setting > Online (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
Improved Linear-Time Construction of Minimal Dominating Set via Mobile Agents
Chand, Prabhat Kumar, Molla, Anisur Rahaman
The use of autonomous agents to solve graph problems has recently attracted significant attention. Such agents, representing entities like self-driving cars, drones, robots, or distributed processes, combine two defining capabilities: they can perform local computations under strict memory constraints, and they can traverse networks, moving between nodes while retaining only limited information. A crucial observation in this model is that local computation cost is essentially negligible compared to movement, as in real-world scenarios where the cost of physical traversal (for example, a self-driving car travelling between multiple cities) far outweighs local processing. Consequently, research in this area has focused on minimising movement while still enabling efficient solutions to classical graph problems. Several fundamental graph problems, such as computing minimal dominating sets and independent sets, leader election, spanning tree construction, and community detection, have been extensively studied both in the classical distributed model and, more recently, in the mobile-agent model. For instance, dominating set construction has been investigated in the mobile-agent setting [2] and refined in subsequent works [3, 4, 5], while the closely related maximal independent set (MIS) problem has also been explored [6]. The same framework has produced algorithms for spanning structures, including BFS trees [7, 8], MSTs [3, 5], and general spanning trees [9]. These developments have further led to increasingly efficient approaches for leader election.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Asia > China > Heilongjiang Province > Daqing (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Information Technology (0.68)
- Consumer Products & Services (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science > Data Mining (0.69)
- North America > Canada > British Columbia (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Vision (0.67)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine (0.68)
- Information Technology (0.46)
- Banking & Finance (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)