Hybrid Autoencoders for Tabular Data: Leveraging Model-Based Augmentation in Low-Label Settings

Jun-19-2026, 10:56:07 GMT – Neural Information Processing Systems

Deep neural networks often underperform on tabular data due to sensitivity to irrelevant features and a spectral bias toward smooth, low-frequency functions, limiting their ability to capture sharp, high-frequency signals in low-label regimes. While self-supervised learning (SSL) holds promise in such settings, it remains challenging in tabular domains due to the limited availability of effective data augmentations. We introduce TANDEM (Tree-And-Neural Dual Encoder Model), a hybrid autoencoder that trains a neural encoder alongside an oblivious soft decision tree (OSDT) encoder, both guided by dedicated stochastic gating networks for sample-specific feature selection. The encoders share a decoder and are coupled via alignment losses, encouraging complementary yet consistent representations. The training-only use of the tree operates as model-based augmentation, nudging representations toward tabular-relevant features while preserving a lean inference path (only the neural encoder is deployed). Spectral analysis highlights distinct yet complementary inductive biases across encoders, and experiments on classification and regression benchmarks in low-label settings show consistent gains over strong deep, tree-based, and SSL baselines.
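The components named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract gives no equations, so the clipped-Gaussian gate, the per-level linear soft split, the leaf-value embedding, and the mean-squared alignment term are all assumptions chosen for illustration, and every function and parameter name below is hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def stochastic_gate(mu, sigma=0.5, rng=None):
    """Assumed gate form: clip(mu + Gaussian noise, 0, 1) per feature."""
    rng = rng or random
    return [min(1.0, max(0.0, m + rng.gauss(0.0, sigma))) for m in mu]


def soft_oblivious_tree_leaf_probs(x, weights, biases, temperature=1.0):
    """Leaf distribution of a soft oblivious tree (assumed parameterization).

    "Oblivious" means one split per depth level, shared by all nodes at
    that level; "soft" means each split routes right with a probability.
    """
    go_right = [
        sigmoid((sum(w_i * x_i for w_i, x_i in zip(w, x)) - b) / temperature)
        for w, b in zip(weights, biases)
    ]
    depth = len(go_right)
    probs = []
    for leaf in range(2 ** depth):
        p = 1.0
        for level in range(depth):
            bit = (leaf >> level) & 1  # 1 = routed right at this level
            p *= go_right[level] if bit else 1.0 - go_right[level]
        probs.append(p)
    return probs


def tree_embedding(leaf_probs, leaf_values):
    """Embedding as the probability-weighted average of leaf vectors."""
    dim = len(leaf_values[0])
    return [sum(p * v[j] for p, v in zip(leaf_probs, leaf_values)) for j in range(dim)]


def alignment_loss(z_a, z_b):
    """Assumed alignment term: mean squared difference of two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(z_a, z_b)) / len(z_a)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, -2.0, 0.5]
    gates = stochastic_gate([0.8, 0.1, 0.6], rng=rng)
    gated_x = [g * x_i for g, x_i in zip(gates, x)]  # sample-specific feature mask
    probs = soft_oblivious_tree_leaf_probs(
        gated_x, weights=[[0.5, 0.1, -0.3], [1.0, 0.0, 0.2]], biases=[0.0, 0.1]
    )
    z_tree = tree_embedding(probs, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
    z_neural = [0.4, 0.6]  # stand-in for a neural encoder output
    print(alignment_loss(z_tree, z_neural))
```

In a full model, both embeddings would also pass through the shared decoder and contribute reconstruction losses, and only the neural encoder would be kept at inference time, as the abstract states.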

artificial intelligence, deep learning, machine learning

Genre:
- Research Report
  - New Finding (0.68)
  - Experimental Study (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
