tabpfn
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine (0.68)
- Information Technology (0.46)
- Banking & Finance (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- North America > United States > Wisconsin (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > Maryland (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Overview (0.67)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Government (0.92)
- (2 more...)
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > Wisconsin (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Indonesia (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Banking & Finance (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science > Data Quality (0.92)
- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > Wisconsin (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area (1.00)
- Banking & Finance (0.68)
A principled framework for uncertainty decomposition in TabPFN
Fortini, Sandra, Ng, Kenyon, Petrone, Sonia, Rousseau, Judith, Wei, Susan
TabPFN is a transformer that achieves state-of-the-art performance on supervised tabular tasks by amortizing Bayesian prediction into a single forward pass. However, there is currently no method for uncertainty decomposition in TabPFN. Because it behaves, in an idealised limit, as a Bayesian in-context learner, we cast the decomposition challenge as a Bayesian predictive inference (BPI) problem. The main computational tool in BPI, predictive Monte Carlo, is challenging to apply here as it requires simulating unmodeled covariates. We therefore pursue the asymptotic alternative, filling a gap in the theory for supervised settings by proving a predictive CLT under quasi-martingale conditions. We derive variance estimators determined by the volatility of predictive updates along the context. The resulting credible bands are fast to compute, target epistemic uncertainty, and achieve near-nominal frequentist coverage. For classification, we further obtain an entropy-based uncertainty decomposition.
- Oceania > Australia (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Michigan (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Predicting Mycotoxin Contamination in Irish Oats Using Deep and Transfer Learning
Inglis, Alan, Doohan, Fiona, Natarajan, Subramani, McNulty, Breige, Elliott, Chris, Nugent, Anne, Meneely, Julie, Greer, Brett, Kildea, Stephen, Bucur, Diana, Danaher, Martin, Di Rocco, Melissa, Black, Lisa, Gauley, Adam, McKenna, Naoise, Parnell, Andrew
Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, containing a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models; TabPFN, TabNet, and FT-Transformer. Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and on average. Additionally, permutation-based variable importance analysis was conducted to identify the most influential predictors across both prediction tasks. The transfer learning approach TabPFN provided the overall best performance, followed by the baseline MLP. Our variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.
- Europe > Austria > Vienna (0.14)
- Europe > Italy (0.14)
- North America > United States > Virginia (0.04)
- (3 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
- Materials > Chemicals > Commodity Chemicals (0.47)
- Food & Agriculture > Agriculture > Pest Control (0.47)
Amortized Causal Discovery with Prior-Fitted Networks
Sypniewski, Mateusz, Olko, Mateusz, Gajewski, Mateusz, Miłoś, Piotr
In recent years, differentiable penalized likelihood methods have gained popularity, optimizing the causal structure by maximizing its likelihood with respect to the data. However, recent research has shown that errors in likelihood estimation, even on relatively large sample sizes, disallow the discovery of proper structures. We propose a new approach to amortized causal discovery that addresses the limitations of likelihood estimator accuracy. Our method leverages Prior-Fitted Networks (PFNs) to amortize data-dependent likelihood estimation, yielding more reliable scores for structure learning. Experiments on synthetic, simulated, and real-world datasets show significant gains in structure recovery compared to standard baselines. Furthermore, we demonstrate directly that PFNs provide more accurate likelihood estimates than conventional neural network-based approaches.
- Europe > Austria > Vienna (0.14)
- Europe > Poland > Masovia Province > Warsaw (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (4 more...)
State-Space Models for Tabular Prior-Data Fitted Networks
Koch, Felix, Wever, Marcel, Raisch, Fabian, Tischler, Benjamin
Recent advancements in foundation models for tabular data, such as TabPFN, demonstrated that pretrained Transformer architectures can approximate Bayesian inference with high predictive performance. However, Transformers suffer from quadratic complexity with respect to sequence length, motivating the exploration of more efficient sequence models. In this work, we investigate the potential of using Hydra, a bidirectional linear-time structured state space model (SSM), as an alternative to Transformers in TabPFN. A key challenge lies in SSM's inherent sensitivity to the order of input tokens - an undesirable property for tabular datasets where the row order is semantically meaningless. We investigate to what extent a bidirectional approach can preserve efficiency and enable symmetric context aggregation. Our experiments show that this approach reduces the order-dependence, achieving predictive performance competitive to the original TabPFN model.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > Canada (0.04)
- Europe > Germany > Lower Saxony > Hanover (0.04)
Robust Tabular Foundation Models
Peroni, Matthew, Le, Franck, Sheinin, Vadim
The development of tabular foundation models (TFMs) has accelerated in recent years, showing strong potential to outperform traditional ML methods for structured data. A key finding is that TFMs can be pretrained entirely on synthetic datasets, opening opportunities to design data generators that encourage desirable model properties. Prior work has mainly focused on crafting high-quality priors over generators to improve overall pretraining performance. Our insight is that parameterizing the generator distribution enables an adversarial robustness perspective: during training, we can adapt the generator to emphasize datasets that are particularly challenging for the model. We formalize this by introducing an optimality gap measure, given by the difference between TFM performance and the best achievable performance as estimated by strong baselines such as XGBoost, CatBoost, and Random Forests. Building on this idea, we propose Robust Tabular Foundation Models (RTFM), a model-agnostic adversarial training framework. Applied to the TabPFN V2 classifier, RTFM improves benchmark performance, with up to a 6% increase in mean normalized AUC over the original TabPFN and other baseline algorithms, while requiring less than 100k additional synthetic datasets. These results highlight a promising new direction for targeted adversarial training and fine-tuning of TFMs using synthetic data alone.
- North America > United States > Wisconsin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
From Tables to Signals: Revealing Spectral Adaptivity in TabPFN
Zheng, Jianqiao, Gordon, Cameron, Ji, Yiping, Saratchandran, Hemanth, Lucey, Simon
Task-agnostic tabular foundation models such as TabPFN have achieved impressive performance on tabular learning tasks, yet the origins of their inductive biases remain poorly understood. In this work, we study TabPFN through the lens of signal reconstruction and provide the first frequency-based analysis of its in-context learning behavior. We show that TabPFN possesses a broader effective frequency capacity than standard ReLU-MLPs, even without hyperparameter tuning. Moreover, unlike MLPs whose spectra evolve primarily over training epochs, we find that TabPFN's spectral capacity adapts directly to the number of samples provided in-context, a phenomenon we term Spectral Adaptivity. We further demonstrate that positional encoding modulates TabPFN's frequency response, mirroring classical results in implicit neural representations. Finally, we show that these properties enable TabPFN to perform training-free and hyperparameter-free image denoising, illustrating its potential as a task-agnostic implicit model. Our analysis provides new insight into the structure and inductive biases of tabular foundation models and highlights their promise for broader signal reconstruction tasks.
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland (0.04)