AITopics | ft-transformer

Collaborating Authors

ft-transformer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

9d86d83f925f2149e9edb0ac3b49229c-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 06:23:56 GMT

As a result, it is unclear for both researchers and practitioners what models perform best.

artificial intelligence, deep learning, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Europe > Russia (0.05)
Asia > Russia (0.05)
North America > United States > California (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Predicting Mycotoxin Contamination in Irish Oats Using Deep and Transfer Learning

Inglis, Alan, Doohan, Fiona, Natarajan, Subramani, McNulty, Breige, Elliott, Chris, Nugent, Anne, Meneely, Julie, Greer, Brett, Kildea, Stephen, Bucur, Diana, Danaher, Martin, Di Rocco, Melissa, Black, Lisa, Gauley, Adam, McKenna, Naoise, Parnell, Andrew

arXiv.org Machine LearningDec-30-2025

Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, containing a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models; TabPFN, TabNet, and FT-Transformer. Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and on average. Additionally, permutation-based variable importance analysis was conducted to identify the most influential predictors across both prediction tasks. The transfer learning approach TabPFN provided the overall best performance, followed by the baseline MLP. Our variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

2512.22243

Country: Europe > Austria (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Materials > Chemicals > Commodity Chemicals (0.47)
Food & Agriculture > Agriculture > Pest Control (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Application of Tabular Transformer Architectures for Operating System Fingerprinting

Pérez-Jove, Rubén, Munteanu, Cristian R., Pazos, Alejandro, Vázquez-Naya, Jose

arXiv.org Artificial IntelligenceFeb-13-2025

Operating System (OS) fingerprinting is essential for network management and cybersecurity, enabling accurate device identification based on network traffic analysis. Traditional rule-based tools such as Nmap and p0f face challenges in dynamic environments due to frequent OS updates and obfuscation techniques. While Machine Learning (ML) approaches have been explored, Deep Learning (DL) models, particularly Transformer architectures, remain unexploited in this domain. This study investigates the application of Tabular Transformer architectures-specifically TabTransformer and FT-Transformer-for OS fingerprinting, leveraging structured network data from three publicly available datasets. Our experiments demonstrate that FT-Transformer generally outperforms traditional ML models, previous approaches and TabTransformer across multiple classification levels (OS family, major, and minor versions). The results establish a strong foundation for DL-based OS fingerprinting, improving accuracy and adaptability in complex network environments. Furthermore, we ensure the reproducibility of our research by providing an open-source implementation.

classification, large language model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.09084

Country: Europe > Spain (0.46)

Genre: Research Report (1.00)

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning

Thielmann, Anton Frederik, Samiee, Soheila

arXiv.org Artificial IntelligenceNov-26-2024

Recent advancements in tabular deep learning (DL) have led to substantial performance improvements, surpassing the capabilities of traditional models. With the adoption of techniques from natural language processing (NLP), such as language model-based approaches, DL models for tabular data have also grown in complexity and size. Although tabular datasets do not typically pose scalability issues, the escalating size of these models has raised efficiency concerns. Despite its importance, efficiency has been relatively underexplored in tabular DL research. This paper critically examines the latest innovations in tabular DL, with a dual focus on performance and computational efficiency.

architecture, arxiv preprint arxiv, efficiency, (13 more...)

arXiv.org Artificial Intelligence

2411.17207

Country:

North America > Canada (0.05)
North America > United States > California (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mambular: A Sequential Model for Tabular Deep Learning

Thielmann, Anton Frederik, Kumar, Manish, Weisser, Christoph, Reuter, Arik, Säfken, Benjamin, Samiee, Soheila

arXiv.org Artificial IntelligenceAug-12-2024

The analysis of tabular data has traditionally been dominated by gradient-boosted decision trees (GBDTs), known for their proficiency with mixed categorical and numerical features. However, recent deep learning innovations are challenging this dominance. We introduce Mambular, an adaptation of the Mamba architecture optimized for tabular data. We extensively benchmark Mambular against state-of-the-art models, including neural networks and tree-based methods, and demonstrate its competitive performance across diverse datasets. Additionally, we explore various adaptations of Mambular to understand its effectiveness for tabular data. We investigate different pooling strategies, feature interaction mechanisms, and bi-directional processing. Our analysis shows that interpreting features as a sequence and passing them through Mamba layers results in surprisingly performant models.

architecture, dataset, mambular, (14 more...)

arXiv.org Artificial Intelligence

2408.06291

Country:

North America > United States > California (0.05)
North America > Canada (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PTaRL: Prototype-based Tabular Representation Learning via Space Calibration

Ye, Hangting, Fan, Wei, Song, Xiaozhuang, Zheng, Shun, Zhao, He, Guo, Dandan, Chang, Yi

arXiv.org Artificial IntelligenceJul-15-2024

Tabular data have been playing a mostly important role in diverse real-world fields, such as healthcare, engineering, finance, etc. With the recent success of deep learning, many tabular machine learning (ML) methods based on deep networks (e.g., Transformer, ResNet) have achieved competitive performance on tabular benchmarks. However, existing deep tabular ML methods suffer from the representation entanglement and localization, which largely hinders their prediction performance and leads to performance inconsistency on tabular tasks. To overcome these problems, we explore a novel direction of applying prototype learning for tabular ML and propose a prototype-based tabular representation learning framework, PTaRL, for tabular prediction tasks. The core idea of PTaRL is to construct prototype-based projection space (P-Space) and learn the disentangled representation around global data prototypes. Specifically, PTaRL mainly involves two stages: (i) Prototype Generation, that constructs global prototypes as the basis vectors of P-Space for representation, and (ii) Prototype Projection, that projects the data samples into P-Space and keeps the core global data information via Optimal Transport. Then, to further acquire the disentangled representations, we constrain PTaRL with two strategies: (i) to diversify the coordinates towards global prototypes of different representations within P-Space, we bring up a diversification constraint for representation calibration; (ii) to avoid prototype entanglement in P-Space, we introduce a matrix orthogonalization constraint to ensure the independence of global prototypes. Finally, we conduct extensive experiments in PTaRL coupled with state-of-the-art deep tabular ML models on various tabular benchmarks and the results have shown our consistent superiority.

p-space, prototype, representation, (14 more...)

arXiv.org Artificial Intelligence

2407.05364

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Tabular Data: Is Attention All You Need?

Zabërgja, Guri, Kadra, Arlind, Grabocka, Josif

arXiv.org Artificial IntelligenceFeb-6-2024

Deep Learning has revolutionized the field of AI and led to remarkable achievements in applications involving image and text data. Unfortunately, there is inconclusive evidence on the merits of neural networks for structured tabular data. In this paper, we introduce a large-scale empirical study comparing neural networks against gradient-boosted decision trees on tabular data, but also transformer-based architectures against traditional multi-layer perceptrons (MLP) with residual connections. In contrast to prior work, our empirical findings indicate that neural networks are competitive against decision trees. Furthermore, we assess that transformer-based architectures do not outperform simpler variants of traditional MLP architectures on tabular datasets. As a result, this paper helps the research and practitioner communities make informed choices on deploying neural networks on future tabular data applications.

architecture, dataset, hyperparameter, (15 more...)

arXiv.org Artificial Intelligence

2402.0397

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data

Pleiter, Bart, Tajalli, Behrad, Koffas, Stefanos, Abad, Gorka, Xu, Jing, Larson, Martha, Picek, Stjepan

arXiv.org Artificial IntelligenceJan-29-2024

Deep Neural Networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions.More recently, DNNs for tabular data have gained increasing attention due to the rise of transformer models. Our research presents a comprehensive analysis of backdoor attacks on tabular data using DNNs, particularly focusing on transformers. Given the inherent complexities of tabular data, we explore the challenges of embedding backdoors. Through systematic experimentation across benchmark datasets, we uncover that transformer-based DNNs for tabular data are highly susceptible to backdoor attacks, even with minimal feature value alterations. We also verify that our attack can be generalized to other models, like XGBoost and DeepFM. Our results indicate nearly perfect attack success rates (approximately 100%) by introducing novel backdoor attack strategies to tabular data. Furthermore, we evaluate several defenses against these attacks, identifying Spectral Signatures as the most effective one. Our findings highlight the urgency of addressing such vulnerabilities and provide insights into potential countermeasures for securing DNN models against backdoors in tabular data.

backdoor attack, dataset, tabular data, (14 more...)

arXiv.org Artificial Intelligence

2311.0755

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unveiling the Power of Self-Attention for Shipping Cost Prediction: The Rate Card Transformer

Sreekar, P Aditya, Verma, Sahil, Madhavan, Varun, Persad, Abhishek

arXiv.org Machine LearningNov-20-2023

Amazon ships billions of packages to its customers annually within the United States. Shipping cost of these packages are used on the day of shipping (day 0) to estimate profitability of sales. Downstream systems utilize these days 0 profitability estimates to make financial decisions, such as pricing strategies and delisting loss-making products. However, obtaining accurate shipping cost estimates on day 0 is complex for reasons like delay in carrier invoicing or fixed cost components getting recorded at monthly cadence. Inaccurate shipping cost estimates can lead to bad decision, such as pricing items too low or high, or promoting the wrong product to the customers. Current solutions for estimating shipping costs on day 0 rely on tree-based models that require extensive manual engineering efforts. In this study, we propose a novel architecture called the Rate Card Transformer (RCT) that uses self-attention to encode all package shipping information such as package attributes, carrier information and route plan. Unlike other transformer-based tabular models, RCT has the ability to encode a variable list of one-to-many relations of a shipment, allowing it to capture more information about a shipment. For example, RCT can encode properties of all products in a package. Our results demonstrate that cost predictions made by the RCT have 28.82% less error compared to tree-based GBDT model. Moreover, the RCT outperforms the state-of-the-art transformer-based tabular model, FTTransformer, by 6.08%. We also illustrate that the RCT learns a generalized manifold of the rate card that can improve the performance of tree-based models.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2311.11694

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)
Asia > India > West Bengal > Kharagpur (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Marine (1.00)
Transportation > Freight & Logistics Services > Shipping (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Revisiting Deep Learning Models for Tabular Data

Gorishniy, Yury, Rubachev, Ivan, Khrulkov, Valentin, Babenko, Artem

arXiv.org Artificial IntelligenceOct-26-2023

The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.

dataset, ft-transformer, latexit sha1, (16 more...)

arXiv.org Artificial Intelligence

2106.11959

Country:

North America > United States > California (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback