ft-transformer
- Europe > Russia (0.05)
- Asia > Russia (0.05)
- North America > United States > California (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Predicting Mycotoxin Contamination in Irish Oats Using Deep and Transfer Learning
Inglis, Alan, Doohan, Fiona, Natarajan, Subramani, McNulty, Breige, Elliott, Chris, Nugent, Anne, Meneely, Julie, Greer, Brett, Kildea, Stephen, Bucur, Diana, Danaher, Martin, Di Rocco, Melissa, Black, Lisa, Gauley, Adam, McKenna, Naoise, Parnell, Andrew
Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, containing a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models: TabPFN, TabNet, and FT-Transformer. Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and on average. Additionally, permutation-based variable importance analysis was conducted to identify the most influential predictors across both prediction tasks. The transfer learning approach TabPFN provided the best overall performance, followed by the baseline MLP. Our variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.
- Europe > Austria > Vienna (0.14)
- Europe > Italy (0.14)
- North America > United States > Virginia (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
- Materials > Chemicals > Commodity Chemicals (0.47)
- Food & Agriculture > Agriculture > Pest Control (0.47)
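The permutation-based variable importance mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the `predict` function, and the helper names `rmse` and `permutation_importance` are all assumptions made for illustration, and the sketch covers a single regression response rather than the multi-response setting of the paper. The idea is to shuffle one predictor at a time and record how much the error (here RMSE) increases.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each column of X is shuffled (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the response
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Toy data (assumed): the response depends strongly on column 0,
# weakly on column 1, and not at all on column 2.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]

# Stand-in for a fitted model: any callable mapping X to predictions works.
predict = lambda X_in: 3.0 * X_in[:, 0] + 0.5 * X_in[:, 1]

scores = permutation_importance(predict, X, y)
print(scores)
```

In this toy setup the largest score belongs to the predictor the response depends on most, and the unused predictor scores zero. For a classification task, the same loop would apply with the RMSE replaced by a drop in a metric such as AUC.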
Application of Tabular Transformer Architectures for Operating System Fingerprinting
Pérez-Jove, Rubén, Munteanu, Cristian R., Pazos, Alejandro, Vázquez-Naya, Jose
Operating System (OS) fingerprinting is essential for network management and cybersecurity, enabling accurate device identification based on network traffic analysis. Traditional rule-based tools such as Nmap and p0f face challenges in dynamic environments due to frequent OS updates and obfuscation techniques. While Machine Learning (ML) approaches have been explored, Deep Learning (DL) models, particularly Transformer architectures, remain unexploited in this domain. This study investigates the application of Tabular Transformer architectures, specifically TabTransformer and FT-Transformer, for OS fingerprinting, leveraging structured network data from three publicly available datasets. Our experiments demonstrate that FT-Transformer generally outperforms traditional ML models, previous approaches and TabTransformer across multiple classification levels (OS family, major, and minor versions). The results establish a strong foundation for DL-based OS fingerprinting, improving accuracy and adaptability in complex network environments. Furthermore, we ensure the reproducibility of our research by providing an open-source implementation.
- Europe > Spain > Galicia > A Coruña Province > A Coruña (0.04)
- Europe > Switzerland (0.04)
- Europe > Spain > Basque Country (0.04)
- Telecommunications (1.00)
- Information Technology > Security & Privacy (1.00)
On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning
Thielmann, Anton Frederik, Samiee, Soheila
Recent advancements in tabular deep learning (DL) have led to substantial performance improvements, surpassing the capabilities of traditional models. With the adoption of techniques from natural language processing (NLP), such as language model-based approaches, DL models for tabular data have also grown in complexity and size. Although tabular datasets do not typically pose scalability issues, the escalating size of these models has raised efficiency concerns. Despite its importance, efficiency has been relatively underexplored in tabular DL research. This paper critically examines the latest innovations in tabular DL, with a dual focus on performance and computational efficiency.
- North America > Canada (0.05)
- North America > United States > California (0.04)
- Europe > Germany (0.04)
Mambular: A Sequential Model for Tabular Deep Learning
Thielmann, Anton Frederik, Kumar, Manish, Weisser, Christoph, Reuter, Arik, Säfken, Benjamin, Samiee, Soheila
The analysis of tabular data has traditionally been dominated by gradient-boosted decision trees (GBDTs), known for their proficiency with mixed categorical and numerical features. However, recent deep learning innovations are challenging this dominance. We introduce Mambular, an adaptation of the Mamba architecture optimized for tabular data. We extensively benchmark Mambular against state-of-the-art models, including neural networks and tree-based methods, and demonstrate its competitive performance across diverse datasets. Additionally, we explore various adaptations of Mambular to understand its effectiveness for tabular data. We investigate different pooling strategies, feature interaction mechanisms, and bi-directional processing. Our analysis shows that interpreting features as a sequence and passing them through Mamba layers results in surprisingly performant models.
- North America > United States > California (0.05)
- North America > Canada (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
PTaRL: Prototype-based Tabular Representation Learning via Space Calibration
Ye, Hangting, Fan, Wei, Song, Xiaozhuang, Zheng, Shun, Zhao, He, Guo, Dandan, Chang, Yi
Tabular data have been playing an important role in diverse real-world fields, such as healthcare, engineering, and finance. With the recent success of deep learning, many tabular machine learning (ML) methods based on deep networks (e.g., Transformer, ResNet) have achieved competitive performance on tabular benchmarks. However, existing deep tabular ML methods suffer from representation entanglement and localization, which largely hinders their prediction performance and leads to performance inconsistency on tabular tasks. To overcome these problems, we explore a novel direction of applying prototype learning for tabular ML and propose a prototype-based tabular representation learning framework, PTaRL, for tabular prediction tasks. The core idea of PTaRL is to construct a prototype-based projection space (P-Space) and learn the disentangled representation around global data prototypes. Specifically, PTaRL mainly involves two stages: (i) Prototype Generation, which constructs global prototypes as the basis vectors of P-Space for representation, and (ii) Prototype Projection, which projects the data samples into P-Space and keeps the core global data information via Optimal Transport. Then, to further acquire the disentangled representations, we constrain PTaRL with two strategies: (i) to diversify the coordinates towards global prototypes of different representations within P-Space, we introduce a diversification constraint for representation calibration; (ii) to avoid prototype entanglement in P-Space, we introduce a matrix orthogonalization constraint to ensure the independence of global prototypes. Finally, we conduct extensive experiments with PTaRL coupled with state-of-the-art deep tabular ML models on various tabular benchmarks, and the results show its consistent superiority.
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Tabular Data: Is Attention All You Need?
Zabërgja, Guri, Kadra, Arlind, Grabocka, Josif
Deep Learning has revolutionized the field of AI and led to remarkable achievements in applications involving image and text data. Unfortunately, there is inconclusive evidence on the merits of neural networks for structured tabular data. In this paper, we introduce a large-scale empirical study comparing neural networks against gradient-boosted decision trees on tabular data, but also transformer-based architectures against traditional multi-layer perceptrons (MLP) with residual connections. In contrast to prior work, our empirical findings indicate that neural networks are competitive against decision trees. Furthermore, we assess that transformer-based architectures do not outperform simpler variants of traditional MLP architectures on tabular datasets. As a result, this paper helps the research and practitioner communities make informed choices on deploying neural networks on future tabular data applications.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data
Pleiter, Bart, Tajalli, Behrad, Koffas, Stefanos, Abad, Gorka, Xu, Jing, Larson, Martha, Picek, Stjepan
Deep Neural Networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions. More recently, DNNs for tabular data have gained increasing attention due to the rise of transformer models. Our research presents a comprehensive analysis of backdoor attacks on tabular data using DNNs, particularly focusing on transformers. Given the inherent complexities of tabular data, we explore the challenges of embedding backdoors. Through systematic experimentation across benchmark datasets, we uncover that transformer-based DNNs for tabular data are highly susceptible to backdoor attacks, even with minimal feature value alterations. We also verify that our attack can be generalized to other models, like XGBoost and DeepFM. Our results indicate nearly perfect attack success rates (approximately 100%) by introducing novel backdoor attack strategies to tabular data. Furthermore, we evaluate several defenses against these attacks, identifying Spectral Signatures as the most effective one. Our findings highlight the urgency of addressing such vulnerabilities and provide insights into potential countermeasures for securing DNN models against backdoors in tabular data.
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
Unveiling the Power of Self-Attention for Shipping Cost Prediction: The Rate Card Transformer
Sreekar, P Aditya, Verma, Sahil, Madhavan, Varun, Persad, Abhishek
Amazon ships billions of packages to its customers annually within the United States. The shipping costs of these packages are used on the day of shipping (day 0) to estimate the profitability of sales. Downstream systems utilize these day 0 profitability estimates to make financial decisions, such as pricing strategies and delisting loss-making products. However, obtaining accurate shipping cost estimates on day 0 is complex for reasons like delays in carrier invoicing or fixed cost components being recorded at a monthly cadence. Inaccurate shipping cost estimates can lead to bad decisions, such as pricing items too low or too high, or promoting the wrong product to customers. Current solutions for estimating shipping costs on day 0 rely on tree-based models that require extensive manual engineering effort. In this study, we propose a novel architecture called the Rate Card Transformer (RCT) that uses self-attention to encode all package shipping information, such as package attributes, carrier information and route plan. Unlike other transformer-based tabular models, RCT has the ability to encode a variable list of one-to-many relations of a shipment, allowing it to capture more information about a shipment. For example, RCT can encode properties of all products in a package. Our results demonstrate that cost predictions made by the RCT have 28.82% less error compared to a tree-based GBDT model. Moreover, the RCT outperforms the state-of-the-art transformer-based tabular model, FTTransformer, by 6.08%. We also illustrate that the RCT learns a generalized manifold of the rate card that can improve the performance of tree-based models.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Asia > India > West Bengal > Kharagpur (0.04)
- Transportation > Marine (1.00)
- Transportation > Freight & Logistics Services > Shipping (1.00)
Revisiting Deep Learning Models for Tabular Data
Gorishniy, Yury, Rubachev, Ivan, Khrulkov, Valentin, Babenko, Artem
The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.
- North America > United States > California (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)