SURFing to the Fundamental Limit of Jet Tagging

Pang, Ian; Faroughy, Darius A.; Shih, David; Das, Ranit; Kasieczka, Gregor

Nov-21-2025–arXiv.org Artificial Intelligence

Jet tagging is a central task in collider physics. Over the past decade, machine learning has driven major advances in jet tagging, with increasingly sophisticated architectures achieving very high classification performance on simulated datasets [1-11]. This success naturally raises a key question: have current jet taggers already reached the fundamental limit of jet tagging, or does a gap remain between practical performance and the true statistical optimum? The Neyman-Pearson (NP) limit, defined by the likelihood ratio, is the best possible discrimination between two underlying physics processes, such as top and QCD jets, that any classifier could achieve if it had access to the exact data likelihoods [12]. In practice, however, this limit cannot be evaluated directly because the true likelihood of the data-generating process is unknown. It therefore remains unclear how close existing classifiers are to this ultimate bound. Recently, Ref. [13] proposed using autoregressive GPT-style generative models to probe this limit for top vs. QCD jets from the JetClass dataset [14]. These models operate on discretized, tokenized representations of jet constituents and yield explicit log-likelihoods, enabling the computation of likelihood ratios between jet classes.
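To make the last step concrete, here is a minimal sketch of how a log-likelihood ratio could be formed from two autoregressive models over tokenized jets. It is an illustration only, not the method of Ref. [13]: the names `sequence_log_likelihood`, `log_likelihood_ratio`, and `make_toy_model` are hypothetical, and the "models" are toy context-independent token distributions standing in for trained networks.

```python
import math


def sequence_log_likelihood(tokens, cond_log_prob):
    """Chain rule: sum of log p(token_i | tokens_<i) over the sequence."""
    total = 0.0
    for i, tok in enumerate(tokens):
        total += cond_log_prob(tuple(tokens[:i]), tok)
    return total


def log_likelihood_ratio(tokens, cond_log_prob_a, cond_log_prob_b):
    """log p_a(tokens) - log p_b(tokens); larger values favor class a."""
    return (sequence_log_likelihood(tokens, cond_log_prob_a)
            - sequence_log_likelihood(tokens, cond_log_prob_b))


def make_toy_model(probs):
    """Hypothetical stand-in for a trained model: ignores the prefix
    and returns a fixed log-probability per token id (assumption)."""
    def cond_log_prob(prefix, token):
        return math.log(probs[token])
    return cond_log_prob


# Toy vocabulary of three token ids; the probabilities are made up.
toy_top = make_toy_model([0.2, 0.3, 0.5])
toy_qcd = make_toy_model([0.5, 0.3, 0.2])

jet_tokens = [2, 2, 1, 0]  # an invented tokenized jet
llr = log_likelihood_ratio(jet_tokens, toy_top, toy_qcd)
print(f"log-likelihood ratio (top vs. QCD): {llr:.4f}")
```

Thresholding `llr` gives a classifier. If the two models matched the true class likelihoods exactly, that classifier would be optimal in the Neyman-Pearson sense; with learned models it is only an estimate of that limit.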



