Klein, Samuel
Enhancing generalization in high energy physics using white-box adversarial attacks
Rothen, Franck, Klein, Samuel, Leigh, Matthew, Golling, Tobias
Machine learning is becoming increasingly popular in the context of particle physics. Supervised learning, which uses labeled Monte Carlo (MC) simulations, remains one of the most widely used methods for discriminating signals beyond the Standard Model. However, this paper suggests that supervised models may depend excessively on artifacts and approximations from Monte Carlo simulations, potentially limiting their ability to generalize well to real data. This study aims to enhance the generalization properties of supervised models by reducing the sharpness of local minima. It reviews the application of four distinct white-box adversarial attacks in the context of classifying Higgs boson decay signals. The attacks are divided into weight-space attacks and feature-space attacks. To study and quantify the sharpness of different local minima, this paper presents two analysis methods: gradient ascent and reduced Hessian eigenvalue analysis. The results show that white-box adversarial attacks significantly improve generalization performance, albeit with increased computational complexity.
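The gradient-ascent sharpness probe mentioned in the abstract can be illustrated on a toy quadratic loss. This is a minimal sketch with hypothetical step sizes and curvatures, not the paper's actual setup:

```python
import numpy as np

def sharpness_by_gradient_ascent(loss, grad, w0, step=0.01, n_steps=20):
    """Probe local sharpness: take gradient-ascent steps away from a
    minimum w0 and report how much the loss rises (larger = sharper)."""
    w = w0.copy()
    for _ in range(n_steps):
        w = w + step * grad(w)  # ascend the loss surface
    return loss(w) - loss(w0)

# Toy quadratic losses whose Hessian eigenvalues set the curvature.
H_sharp = np.diag([10.0, 10.0])
H_flat = np.diag([0.1, 0.1])
loss_sharp = lambda w: 0.5 * w @ H_sharp @ w
grad_sharp = lambda w: H_sharp @ w
loss_flat = lambda w: 0.5 * w @ H_flat @ w
grad_flat = lambda w: H_flat @ w

w0 = np.array([1e-3, 1e-3])  # a point near the minimum at the origin
rise_sharp = sharpness_by_gradient_ascent(loss_sharp, grad_sharp, w0)
rise_flat = sharpness_by_gradient_ascent(loss_flat, grad_flat, w0)
```

In this toy, the reduced Hessian eigenvalue analysis the abstract pairs with gradient ascent amounts to reading off the diagonal entries of `H`: the sharp minimum climbs much faster under ascent.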
Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models
Heinrich, Lukas, Golling, Tobias, Kagan, Michael, Klein, Samuel, Leigh, Matthew, Osadchy, Margarita, Raine, John Andrew
While Artificial Intelligence (AI) and Machine Learning (ML) are already playing a major role in the analysis of high energy physics (HEP) data, the HEP community has yet to benefit from the self-supervised learning (SSL) based approaches to building large foundation models (FM) [1] that have been pioneered in natural language processing (NLP) [2-5] and computer vision (CV) [6-8]. These modern approaches use SSL to pre-train models on vast data sets in order to learn generic representations of the data. Such models can then be efficiently fine-tuned with small datasets for a variety of downstream tasks. The self-supervised pre-training of a FM produces a model that is also referred to as the "backbone", as it can serve as the information extraction component for downstream models. This concept significantly expands the possibilities for learning robust and meaningful data representations. These models also represent a scale in both model size and data size that has not been addressed in HEP. In this work, we aim to take the first steps towards building such a HEP foundation model, focusing on developing HEP-specific SSL strategies, whilst keeping an eye on how well such strategies may scale in the future. We propose a masked particle modeling (MPM) scheme, akin to masked language modeling (MLM) in NLP, for self-supervised learning on unlabeled data consisting of sets of particles in a collider physics environment. In doing so, we propose a novel scheme to apply masked modeling strategies to unordered sets of inputs.

This work aims to generalize the language-inspired MLM-type training scheme to HEP scientific data. The paradigm involves extracting semantic meaning and understanding of the whole by predicting the missing (masked) pieces, referred to as tokens, thereby considering the collective impact of individual input elements.
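The masking step for set-valued inputs can be sketched as follows. Function and feature names are illustrative, the mask token here is a zero vector standing in for a learned embedding, and the backbone that predicts the targets is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_particle_set(particles, mask_frac=0.3, rng=rng):
    """Masked-particle-modeling input prep: replace a random subset of
    particles in an unordered set with a mask token; the training
    targets are the original (masked) particles."""
    n, d = particles.shape
    mask_token = np.zeros(d)  # stand-in for a learned mask embedding
    n_mask = max(1, int(round(mask_frac * n)))
    idx = rng.choice(n, size=n_mask, replace=False)
    corrupted = particles.copy()
    corrupted[idx] = mask_token
    return corrupted, idx, particles[idx]  # inputs, positions, targets

# A jet represented as an unordered set of 10 particles, 3 features each.
jet = rng.normal(size=(10, 3))
corrupted, masked_idx, targets = mask_particle_set(jet)
```

Because the set is unordered, the returned indices serve only to pair each masked slot with its training target; they carry no positional meaning for the model itself.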
Improving new physics searches with diffusion models for event observables and jet constituents
Sengupta, Debajyoti, Leigh, Matthew, Raine, John Andrew, Klein, Samuel, Golling, Tobias
We introduce a new technique called Drapes to enhance the sensitivity in searches for new physics at the LHC. By training diffusion models on side-band data, we show how background templates for the signal region can be generated either directly from noise, or by partially applying the diffusion process to existing data. In the partial diffusion case, data can be drawn from side-band regions, with the inverse diffusion performed for new target conditional values, or from the signal region, preserving the distribution over the conditional property that defines the signal region. We apply this technique to the hunt for resonances using the LHCO di-jet dataset, and achieve state-of-the-art performance for background template generation using high level input features. We also show how Drapes can be applied to low level inputs with jet constituents, reducing the model dependence on the choice of input observables. Using jet constituents we can further improve sensitivity to the signal process, but observe a loss in performance where the signal significance before applying any selection is below 4$\sigma$.
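The partial-diffusion idea can be sketched with a standard DDPM-style forward process. The noise schedule and step counts below are illustrative, and the trained conditional denoiser that would reverse the steps is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def partial_diffuse(x, t, betas, rng=rng):
    """Run the forward (noising) diffusion only up to step t, so samples
    retain part of their original structure; a trained conditional
    denoiser would then reverse these t steps for a new value of the
    conditioning variable (e.g. the dijet mass)."""
    alpha_bar = np.prod(1.0 - betas[:t])  # cumulative signal retention
    noise = rng.normal(size=x.shape)
    return np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise

betas = np.linspace(1e-4, 0.02, 100)   # assumed linear noise schedule
sideband = rng.normal(size=(5, 4))     # toy event features
half_noised = partial_diffuse(sideband, t=50, betas=betas)
full_noised = partial_diffuse(sideband, t=100, betas=betas)
```

Running to `t=100` recovers the generate-from-noise mode, while stopping at an intermediate `t` keeps correlations from the source data, which is what makes the partial-diffusion variant less model dependent.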
Multimodal Neurons in Pretrained Text-Only Transformers
Schwettmann, Sarah, Chowdhury, Neil, Klein, Samuel, Bau, David, Torralba, Antonio
Language models demonstrate remarkable capacity to generalize representations learned in one modality to downstream tasks in other modalities. Can we trace this ability to individual neurons? We study the case where a frozen text transformer is augmented with vision using a self-supervised visual encoder and a single linear projection learned on an image-to-text task. Outputs of the projection layer are not immediately decodable into language describing image content; instead, we find that translation between modalities occurs deeper within the transformer. We introduce a procedure for identifying "multimodal neurons" that convert visual representations into corresponding text, and decoding the concepts they inject into the model's residual stream. In a series of experiments, we show that multimodal neurons operate on specific visual concepts across inputs, and have a systematic causal effect on image captioning.
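The decoding step, reading off which tokens a neuron writes toward in the residual stream, can be sketched with toy matrices. The unembedding matrix and vocabulary here are random stand-ins, not the models analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(4)
d_model, vocab = 16, 6
vocab_words = ["cat", "dog", "tree", "car", "sky", "road"]
W_U = rng.normal(size=(d_model, vocab))  # toy unembedding matrix

# A neuron whose output direction aligns with the "cat" token direction.
w_out = W_U[:, 0] + 0.1 * rng.normal(size=d_model)

logits = w_out @ W_U  # the neuron's contribution to token logits
top_token = vocab_words[int(np.argmax(logits))]
```

Projecting a neuron's output weights through the unembedding gives its direct contribution to the vocabulary logits; the top-scoring tokens describe the concept the neuron injects into the residual stream.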
Flows for Flows: Morphing one Dataset into another with Maximum Likelihood Estimation
Golling, Tobias, Klein, Samuel, Mastandrea, Radha, Nachman, Benjamin, Raine, John Andrew
Many components of data analysis in high energy physics and beyond require morphing one dataset into another. This is commonly solved via reweighting, but there are many advantages of preserving weights and shifting the data points instead. Normalizing flows are machine learning models with impressive precision on a variety of particle physics tasks. Naively, normalizing flows cannot be used for morphing because they require knowledge of the probability density of the starting dataset. In most cases in particle physics, we can generate more examples, but we do not know densities explicitly. We propose a protocol called flows for flows for training normalizing flows to morph one dataset into another even if the underlying probability density of neither dataset is known explicitly. This enables a morphing strategy trained with maximum likelihood estimation, a setup that has been shown to be highly effective in related tasks. We study variations on this protocol to explore how far the data points are moved to statistically match the two datasets. Furthermore, we show how to condition the learned flows on particular features in order to create a morphing function for every value of the conditioning feature. For illustration, we demonstrate flows for flows for toy examples as well as a collider physics example involving dijet events.
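The training objective can be sketched in one dimension, with affine maps and Gaussians standing in for both flows. All densities and the grid search below are illustrative stand-ins for the learned flows and gradient training:

```python
import numpy as np

# Flows-for-flows objective (sketch): to morph dataset A into dataset B,
# fit an invertible map f by maximizing log p_B(f(x)) + log|det J_f(x)|,
# where p_B is itself a density model fit to B (in the paper, a flow).

def log_gauss(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def morph_loglik(a, b, x):
    """Per-sample log-likelihood of the affine morph f(x) = a*x + b
    under the stand-in model of dataset B, N(2, 0.5)."""
    fx = a * x + b
    return log_gauss(fx, 2.0, 0.5) + np.log(abs(a))  # log|det J| = log|a|

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=5000)  # dataset A ~ N(0, 1)

# Crude grid search standing in for gradient-based training.
best = max(((a, b) for a in np.linspace(0.1, 2.0, 20)
            for b in np.linspace(-3.0, 3.0, 25)),
           key=lambda ab: morph_loglik(*ab, x).mean())
a, b = best
```

Maximizing this objective should recover roughly `a ≈ 0.5, b ≈ 2`, the map that carries N(0, 1) onto N(2, 0.5); the key point is that only samples from A and a density model of B are needed, never the density of A.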
CURTAINs Flows For Flows: Constructing Unobserved Regions with Maximum Likelihood Estimation
Sengupta, Debajyoti, Klein, Samuel, Raine, John Andrew, Golling, Tobias
Model independent techniques for constructing background data templates using generative models have shown great promise for use in searches for new physics processes at the LHC. We introduce a major improvement to the CURTAINs method by training the conditional normalizing flow between two side-band regions using maximum likelihood estimation instead of an optimal transport loss. The new training objective improves the robustness and fidelity of the transformed data and is much faster and easier to train. We compare the performance against the previous approach and the current state of the art using the LHC Olympics anomaly detection dataset, where we see a significant improvement in sensitivity over the original CURTAINs method. Furthermore, CURTAINsF4F requires substantially less computational resources to cover a large number of signal regions than other fully data driven approaches. When using an efficient configuration, an order of magnitude more models can be trained in the same time required for ten signal regions, without a significant drop in performance.
Flowification: Everything is a Normalizing Flow
Máté, Bálint, Klein, Samuel, Golling, Tobias, Fleuret, François
The two key characteristics of a normalizing flow are that it is invertible (in particular, dimension preserving) and that it monitors the amount by which it changes the likelihood of data points as samples are propagated along the network. Recently, multiple generalizations of normalizing flows have been introduced that relax these two conditions. Standard neural networks, on the other hand, only perform a forward pass on the input; there is neither a notion of an inverse of a neural network nor one of its likelihood contribution. In this paper we argue that certain neural network architectures can be enriched with a stochastic inverse pass and that their likelihood contribution can be monitored in such a way that they fall under the generalized notion of a normalizing flow mentioned above. We term this enrichment flowification. We prove that neural networks containing only linear layers, convolutional layers and invertible activations such as LeakyReLU can be flowified, and we evaluate them in the generative setting on image datasets.
Decorrelation with conditional normalizing flows
Klein, Samuel, Golling, Tobias
The sensitivity of many physics analyses can be enhanced by constructing discriminants that preferentially select signal events. Such discriminants become much more useful if they are uncorrelated with a set of protected attributes. In this paper we show that a normalizing flow conditioned on the protected attributes can be used to find a decorrelated representation for any discriminant. As a normalizing flow is invertible the separation power of the resulting discriminant will be unchanged at any fixed value of the protected attributes. We demonstrate the efficacy of our approach by building supervised jet taggers that produce almost no sculpting in the mass distribution of the background.
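The mechanism can be sketched with a toy background whose discriminant is linearly correlated with a protected mass attribute. Here the "conditional flow" is the analytic affine map that decorrelates this toy exactly; in practice it would be a learned conditional normalizing flow:

```python
import numpy as np

rng = np.random.default_rng(3)
m = rng.uniform(0.0, 4.0, size=10000)  # protected attribute (e.g. mass)
d = 0.5 * m + rng.normal(size=m.size)  # mass-correlated discriminant

def decorrelate(d, m):
    """Conditional map g(d | m); invertible in d at every fixed m, so
    the separation power at each mass value is unchanged."""
    return d - 0.5 * m

d_dec = decorrelate(d, m)
corr_before = np.corrcoef(d, m)[0, 1]
corr_after = np.corrcoef(d_dec, m)[0, 1]
```

Because the map is invertible in `d` at every fixed value of `m`, any cut on `d_dec` selects the same events at each mass as some mass-dependent cut on `d`, which is why the background mass distribution is not sculpted.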
Funnels: Exact maximum likelihood with dimensionality reduction
Klein, Samuel, Raine, John A., Pina-Otey, Sebastian, Voloshynovskiy, Slava, Golling, Tobias
Normalizing flows are diffeomorphic, typically dimension-preserving, models trained using the likelihood of the model. We use the SurVAE framework to construct dimension-reducing surjective flows via a new layer, known as the funnel. We demonstrate its efficacy on a variety of datasets, and show that it improves upon or matches the performance of existing flows while having a reduced latent space size. The funnel layer can be constructed from a wide range of transformations, including restricted convolution and feed-forward layers.
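A minimal example in the spirit of a SurVAE-style dimension-reducing layer: the forward pass drops coordinates, and the likelihood of the dropped coordinates under a conditional model is added back so the log-likelihood over the full input stays exact. The conditional model here is an assumed standard normal, so this is a sketch of the bookkeeping, not the funnel layer itself:

```python
import numpy as np

def log_gauss(x, mu=0.0, sigma=1.0):
    return (-0.5 * ((x - mu) / sigma) ** 2
            - np.log(sigma * np.sqrt(2.0 * np.pi)))

def funnel_forward(x, k):
    """Reduce dimension x -> z = x[:k]; return z together with the
    likelihood contribution log q(x[k:] | z), here an assumed standard
    normal independent of z."""
    z, dropped = x[:k], x[k:]
    return z, log_gauss(dropped).sum()

x = np.array([0.2, -1.0, 0.3, 0.7])
z, contrib = funnel_forward(x, k=2)
# Exact log p(x) when the latent density p(z) is also standard normal.
total_logpx = log_gauss(z).sum() + contrib
```

With both `p(z)` and `q` standard normal, the total recovers the full-dimensional log-likelihood exactly; a trained funnel replaces both with flexible models while keeping the same decomposition.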
Toward a Visual Concept Vocabulary for GAN Latent Space
Schwettmann, Sarah, Hernandez, Evan, Bau, David, Klein, Samuel, Andreas, Jacob, Torralba, Antonio
A large body of recent work has identified transformations in the latent spaces of generative adversarial networks (GANs) that consistently and interpretably transform generated images. But existing techniques for identifying these transformations rely on either a fixed vocabulary of pre-specified visual concepts, or on unsupervised disentanglement techniques whose alignment with human judgments about perceptual salience is unknown. This paper introduces a new method for building open-ended vocabularies of primitive visual concepts represented in a GAN's latent space. Our approach is built from three components: (1) automatic identification of perceptually salient directions based on their layer selectivity; (2) human annotation of these directions with free-form, compositional natural language descriptions; and (3) decomposition of these annotations into a visual concept vocabulary, consisting of distilled directions labeled with single words. Experiments show that concepts learned with our approach are reliable and composable -- generalizing across classes, contexts, and observers, and enabling fine-grained manipulation of image style and content.