Adversarial domain adaptation to reduce sample bias of a high energy physics classifier
Clavijo, Jose M., Glaysher, Paul, Katzy, Judith M.
Many measurements and searches for new phenomena performed by the experiments at the Large Hadron Collider (LHC) use a classification algorithm, such as Boosted Decision Trees or Neural Networks, to discriminate the physics process of interest (signal) from other physics processes with a similar signature (background). The algorithms are optimized using supervised training on detailed simulated Monte Carlo (MC) data sets, labeled as signal or background. The resulting classifier is applied to unlabeled data to separate signal from background and to measure the statistical significance of the signal or its strength, assuming that the simulated and the real data sets are identically distributed. However, differences between real and simulated data sets always exist, and the learner may pick up a discriminating feature that differs between the data sets, biasing the classifier toward the sample used for training. This problem is similar to that of visual recognition, where training is performed on simulated pictures (the source domain) and the classifier is applied to real photographs (the target domain). To avoid training that is specific to the source domain, domain adaptation algorithms have been developed. In this paper, we apply domain adaptation to high energy physics data. We present a Domain Adversarial Neural Network (DANN) to classify events in the search for the $t\bar{t}H(H \to b\bar{b})$ process at the LHC, which is very rare and hard to separate from the $t\bar{t}$ + jets background [1].
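The adversarial idea behind a DANN can be illustrated with a gradient reversal step: a domain classifier tries to tell source from target events, and its gradient is sign-flipped before it reaches the shared feature layers, so those layers are pushed toward features that do not separate the two domains. The following is a minimal pure-Python sketch of that step, not the network used in the paper. All function names are hypothetical, and the sigmoid-shaped schedule for the adaptation weight `lam` is an assumption taken from common DANN practice, not from this abstract.

```python
import math


def grl_forward(features):
    """Gradient reversal layer, forward pass: acts as the identity."""
    return list(features)


def grl_backward(grad_from_domain_head, lam):
    """Backward pass: flip the sign of the domain gradient and scale by lam."""
    return [-lam * g for g in grad_from_domain_head]


def feature_gradient(grad_label, grad_domain, lam):
    """Gradient reaching the shared feature layer.

    grad_label  : gradient of the signal/background loss w.r.t. the features
    grad_domain : gradient of the source/target loss w.r.t. the features
    The domain term is reversed, so a descent step on the result lowers the
    label loss while raising the domain loss.
    """
    reversed_grad = grl_backward(grad_domain, lam)
    return [gl + gd for gl, gd in zip(grad_label, reversed_grad)]


def lambda_schedule(progress, gamma=10.0):
    """Assumed schedule: ramp lam from 0 towards 1 as progress goes 0 -> 1."""
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


if __name__ == "__main__":
    lam = lambda_schedule(0.5)
    print(feature_gradient([1.0, 1.0], [0.5, -1.0], lam))
```

Starting with `lam` near zero lets the label classifier settle before the adversarial term gains weight. Whether such a ramp suits a given analysis is a tuning choice.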
May-1-2020