Learning to Validate Generative Models: a Goodness-of-Fit Approach

Cappelli, Pietro, Grosso, Gaia, Letizia, Marco, Reyes-González, Humberto, Zanetti, Marco

Nov-13-2025–arXiv.org Machine Learning

Generative models are increasingly central to scientific workflows, yet their systematic use and interpretation require a proper understanding of their limitations through rigorous validation. Classic approaches struggle with scalability, statistical power, or interpretability when applied to high-dimensional data, making it difficult to certify the reliability of these models in realistic, high-dimensional scientific settings. Here, we propose the use of the New Physics Learning Machine (NPLM), a learning based approach to goodness-of-fit testing inspired by the Neyman-Pearson construction, to test generative networks trained on high-dimensional scientific data. We demonstrate the performance of NPLM for validation in two benchmark cases: generative models trained on mixtures of Gaussian models with increasing dimensionality, and a public end-to-end generator for the Large Hadron Collider called FlashSim, trained on jet data, typical in the field of high-energy physics. We demonstrate that the NPLM can serve as a powerful validation method while also providing a means to diagnose sub-optimally modeled regions of the data.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

Nov-13-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language > Generation (1.00)
  - Machine Learning > Performance Analysis
    - Accuracy (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found