 saint


Denoising diffusion networks for normative modeling in neuroimaging

arXiv.org Machine Learning

Normative modeling estimates reference distributions of biological measures conditional on covariates, enabling centiles and clinically interpretable deviation scores to be derived. Most neuroimaging pipelines fit one model per imaging-derived phenotype (IDP), which scales well but discards multivariate dependence that may encode coordinated patterns. We propose denoising diffusion probabilistic models (DDPMs) as a unified conditional density estimator for tabular IDPs, from which univariate centiles and deviation scores are derived by sampling. We utilise two denoiser backbones: (i) a feature-wise linear modulation (FiLM) conditioned multilayer perceptron (MLP) and (ii) a tabular transformer with feature self-attention and intersample attention (SAINT), conditioning covariates through learned embeddings. We evaluate on a synthetic benchmark with heteroscedastic and multimodal age effects and on UK Biobank FreeSurfer phenotypes, scaling from 2 to 200 dimensions. Our evaluation suite includes centile calibration (absolute centile error, empirical coverage, and the probability integral transform), distributional fidelity (Kolmogorov-Smirnov tests), multivariate dependence diagnostics, and nearest-neighbour memorisation analysis. For low dimensions, diffusion models deliver well-calibrated per-IDP outputs comparable to traditional baselines while jointly modeling realistic dependence structure. At higher dimensions, the transformer backbone remains substantially better calibrated than the MLP and better preserves higher-order dependence, enabling scalable joint normative models that remain compatible with standard per-IDP pipelines. These results support diffusion-based normative modeling as a practical route to calibrated multivariate deviation profiles in neuroimaging.
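
The idea of deriving univariate centiles and deviation scores by sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: `centile_and_deviation` is a hypothetical helper, the samples stand in for draws from a trained conditional model for one subject's covariates, and both the mid-rank convention and the mapping to a z-like score through the inverse normal CDF are assumptions.

```python
from statistics import NormalDist


def centile_and_deviation(samples, observed):
    """Empirical centile (0-100) and z-like deviation score of `observed`
    relative to samples drawn from a conditional model (assumed given)."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    below = sum(1 for s in samples if s < observed)
    ties = sum(1 for s in samples if s == observed)
    # Mid-rank estimate, shifted so the probability stays strictly in (0, 1)
    # and the inverse normal CDF is always defined.
    p = (below + 0.5 * ties + 0.5) / (n + 1)
    # Assumption: report the deviation as the standard-normal quantile of p.
    z = NormalDist().inv_cdf(p)
    return 100.0 * p, z


# Stand-in for model samples of one IDP given a subject's covariates.
model_samples = [float(v) for v in range(1, 100)]
centile, z = centile_and_deviation(model_samples, observed=50.0)
```

An observed value far above every sample gets a centile close to, but never exactly, 100, so the deviation score stays finite.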


Hot AI Jesus Is Huge on Facebook

The Atlantic - Technology

Jesus is punching the devil on Facebook. The two are in a boxing ring. Jesus is wearing a pair of white boxing shorts with his name embroidered on the waistband. He is ripped beyond belief; not only does he have six-pack abs, but every muscle on his body is bulging. Jesus is hitting the devil directly on the chin, a knockout blow.


Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data

arXiv.org Artificial Intelligence

Deep Neural Networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions. More recently, DNNs for tabular data have gained increasing attention due to the rise of transformer models. Our research presents a comprehensive analysis of backdoor attacks on tabular data using DNNs, particularly focusing on transformers. Given the inherent complexities of tabular data, we explore the challenges of embedding backdoors. Through systematic experimentation across benchmark datasets, we uncover that transformer-based DNNs for tabular data are highly susceptible to backdoor attacks, even with minimal feature value alterations. We also verify that our attack can be generalized to other models, like XGBoost and DeepFM. Our results indicate nearly perfect attack success rates (approximately 100%) by introducing novel backdoor attack strategies to tabular data. Furthermore, we evaluate several defenses against these attacks, identifying Spectral Signatures as the most effective one. Our findings highlight the urgency of addressing such vulnerabilities and provide insights into potential countermeasures for securing DNN models against backdoors in tabular data.


Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking

arXiv.org Artificial Intelligence

Despite groundbreaking success in image and text learning, deep learning has not achieved significant improvements against traditional machine learning (ML) when it comes to tabular data. This performance gap underscores the need for data-centric treatment and benchmarking of learning algorithms. Recently, attention and contrastive learning breakthroughs have shifted computer vision and natural language processing paradigms. However, the effectiveness of these advanced deep models on tabular data is sparsely studied using a few data sets with very large sample sizes, reporting mixed findings after benchmarking against a limited number of baselines. We argue that the heterogeneity of tabular data sets and selective baselines in the literature can bias the benchmarking outcomes. This article extensively evaluates state-of-the-art attention and contrastive learning methods on a wide selection of 28 tabular data sets (14 easy and 14 hard-to-classify) against traditional deep and machine learning. Our data-centric benchmarking demonstrates when traditional ML is preferred over deep learning and vice versa because no best learning method exists for all tabular data sets. Combining between-sample and between-feature attentions conquers the invincible traditional ML on tabular data sets by a significant margin but fails on high dimensional data, where contrastive learning takes a robust lead. While a hybrid attention-contrastive learning strategy mostly wins on hard-to-classify data sets, traditional methods are frequently superior on easy-to-classify data sets with presumably simpler decision boundaries. To the best of our knowledge, this is the first benchmarking paper with statistical analyses of attention and contrastive learning performances on a diverse selection of tabular data sets against traditional deep and machine learning baselines to facilitate further advances in this field.


Tabular Machine Learning Methods for Predicting Gas Turbine Emissions

arXiv.org Artificial Intelligence

Predicting emissions for gas turbines is critical for monitoring harmful pollutants being released into the atmosphere. In this study, we evaluate the performance of machine learning models for predicting emissions for gas turbines. We compare an existing predictive emissions model, a first principles-based Chemical Kinetics model, against two machine learning models we developed based on SAINT and XGBoost, to demonstrate improved predictive performance of nitrogen oxides (NOx) and carbon monoxide (CO) using machine learning techniques. Our analysis utilises a Siemens Energy gas turbine test bed tabular dataset to train and validate the machine learning models. Additionally, we explore the trade-off between incorporating more features to enhance the model complexity, and the resulting presence of increased missing values in the dataset.


SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

arXiv.org Machine Learning

Classical approaches to solving tabular problems, such as gradient boosting and random forests, are widely used by practitioners. However, recent deep learning methods have achieved a degree of performance competitive with popular techniques. We devise a hybrid deep learning approach to solving tabular data problems. Our method, SAINT, performs attention over both rows and columns, and it includes an enhanced embedding method. We also study a new contrastive self-supervised pre-training method for use when labels are scarce. SAINT consistently improves performance over previous deep learning methods, and it even outperforms gradient boosting methods, including XGBoost, CatBoost, and LightGBM, on average over a variety of benchmark tasks.
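
Attention over both columns and rows, as described above, can be illustrated with a stripped-down sketch. It assumes NumPy is available and deliberately omits the learned query/key/value projections, multiple heads, residual connections, and feed-forward layers of a real model; the function names are hypothetical and this is not the SAINT reference implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    """Plain scaled dot-product attention (no learned projections)."""
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def column_attention(x):
    # x: (batch, features, dim). Each row attends over its own
    # feature embeddings, independently of the other rows.
    return attention(x, x, x)


def row_attention(x):
    # Flatten every row to a single vector, attend across the rows of
    # the batch, then restore the (batch, features, dim) shape.
    b, n, d = x.shape
    flat = x.reshape(b, n * d)
    return attention(flat, flat, flat).reshape(b, n, d)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 8))  # 4 rows, 3 features, 8-dim embeddings
out = row_attention(column_attention(x))
```

In this sketch only `row_attention` mixes information between samples, so its output for a given row depends on which other rows share the batch.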


MUSE: Multi-Scale Temporal Features Evolution for Knowledge Tracing

arXiv.org Artificial Intelligence

Transformer-based knowledge tracing is an extensively studied problem in the field of computer-aided education. By integrating temporal features into the encoder-decoder structure, transformers can process the exercise information and student response information in a natural way. However, current state-of-the-art transformer-based variants still share two limitations. First, extremely long temporal features cannot be handled well, as the complexity of the self-attention mechanism is O(n^2). Second, existing approaches track knowledge drift under a fixed window size, without considering different temporal ranges. To overcome these problems, we propose MUSE, which is equipped with a multi-scale temporal sensor unit that takes either local or global temporal features into consideration. The proposed model is able to capture the dynamic changes in users' knowledge states at different temporal ranges, and provides an efficient and powerful way to combine local and global features to make predictions. Our method won 5th place among 3,395 teams in the Riiid AIEd Challenge 2020.


SAINT+: Integrating Temporal Features for EdNet Correctness Prediction

arXiv.org Artificial Intelligence

We propose SAINT+, a successor of SAINT, a Transformer-based knowledge tracing model that separately processes exercise information and student response information. Following the architecture of SAINT, SAINT+ has an encoder-decoder structure where the encoder applies self-attention layers to a stream of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to streams of response embeddings and encoder output. Moreover, SAINT+ incorporates two temporal feature embeddings into the response embeddings: elapsed time, the time taken for a student to answer, and lag time, the time interval between adjacent learning activities. We empirically evaluate the effectiveness of SAINT+ on EdNet, the largest publicly available benchmark dataset in the education domain. Experimental results show that SAINT+ achieves state-of-the-art performance in knowledge tracing, with an improvement of 1.25% in area under the receiver operating characteristic curve compared to SAINT, the current state-of-the-art model on the EdNet dataset.
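
The two temporal features named above could be computed from an interaction log roughly as follows. This is a sketch under stated assumptions rather than the paper's preprocessing: `temporal_features` is a hypothetical helper, each interaction is assumed to carry a start and an answer timestamp in seconds, lag time is measured from the previous answer to the current start, and the first interaction is given a lag of zero.

```python
def temporal_features(interactions):
    """Return (elapsed_time, lag_time) per interaction.

    interactions: list of (start_time, answer_time) pairs in seconds,
    in chronological order for a single student (assumed format).
    """
    features = []
    prev_answer = None
    for start, answer in interactions:
        # Elapsed time: how long the student took to answer.
        elapsed = answer - start
        # Lag time: gap since the previous activity ended.
        # Assumption: no previous activity means a lag of zero.
        lag = 0.0 if prev_answer is None else start - prev_answer
        features.append((elapsed, lag))
        prev_answer = answer
    return features


log = [(0, 12), (30, 45), (100, 130)]
feats = temporal_features(log)
```

In a model these raw values would typically be bucketed or scaled before being embedded; that step is left out here.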


Saints' Michael Burton active for game against Lions after false positive COVID-19 test

FOX News

New Orleans Saints fullback Michael Burton will be active for Sunday's game against the Detroit Lions just one day after receiving a false positive COVID-19 test result. Burton tested positive on Saturday night, signaling trouble for a league already dealing with an outbreak and several other isolated cases among teams, but a re-test on Sunday morning came back negative, The Athletic reported. Burton and other Saints players also underwent rapid testing, which all came back negative, giving them a green light to carry on with the Lions game as scheduled. The NFL has been forced to postpone two games and adjust team schedules after the Tennessee Titans had around 20 people - 10 players and 10 personnel - test positive this past week. The Titans-Pittsburgh Steelers game, originally scheduled for Sunday, was postponed until Oct. 25 -- during Tennessee's bye.


What makes a remake? Not even developers really know.

Washington Post - Technology News

However, the level of rebuilding undertaken by the developers exceeded the usual expectations of a remaster. While "Saints Row: The Third Remastered" largely kept the game's mechanics the same, the scale of the graphical overhaul was akin to that of a new game. A traditional remaster is usually relatively straightforward, and involves cleaning up, improving, and tweaking existing assets. In "Saints Row: The Third Remastered," however, a lot of the original assets were thrown out, with new (but identical-looking) ones built in their place. For example, each vehicle had new individual undercarriages, engines, interiors, lights, grills, and bumpers redesigned from scratch for a new engine, designed to look like a more polished version of the original model.