Predicting the Higgs-Boson Signal
The Higgs boson is a landmark discovery that will help us understand the basic nature of the universe. It was first observed in 2012 by the ATLAS experiment at the Large Hadron Collider at CERN. The Higgs boson can decay into two tau particles, giving rise to a small signal buried in background noise. The goal of the Higgs Boson Machine Learning Challenge was to classify events detected by ATLAS as either "tau tau decay of a Higgs boson" or "background."

The first step was to analyze the data and look for missing values. We found that the missing values follow an interesting pattern: they depend on the column "PRI_jet_num", the number of jets, which takes integer values of 0, 1, 2, or 3, with larger values capped at 3. Jets are the experimental signatures of quarks and gluons produced in high-energy processes such as head-on proton-proton collisions.

For "PRI_jet_num" 0, 10 columns contain only the missing-value marker (-999). These are the columns that describe jets, and they are undefined when there are no jets. For example, "DER_mass_jet_jet" is the invariant mass of the two jets (undefined if PRI_jet_num ≤ 1). It does not make sense to take the attributes of the jets into account when the jets don't exist. For "PRI_jet_num" 1, 7 columns contain only missing values; they describe quantities that are defined only when there are at least two jets, so we deleted these 7 columns. For "PRI_jet_num" 2 or 3, we did not delete any columns.
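The per-jet-count column handling described above can be sketched in pandas. This is a minimal illustration, not the authors' actual code: it assumes the data is split into subsets by the jet-count column (named `PRI_jet_num` in the challenge data), and the helper `split_by_jet_count` and the toy frame are hypothetical stand-ins for the real training set.

```python
import pandas as pd

MISSING = -999.0  # sentinel value the dataset uses for undefined entries


def split_by_jet_count(df, jet_col="PRI_jet_num", missing=MISSING):
    """Hypothetical helper: map each jet count to its subset of rows,
    with the columns that are entirely missing in that subset dropped."""
    subsets = {}
    for jet_count, group in df.groupby(jet_col):
        # Columns where every row in this subset holds the sentinel
        all_missing = group.columns[(group == missing).all(axis=0)]
        subsets[jet_count] = group.drop(columns=all_missing)
    return subsets


# Toy stand-in for the training data (values are made up for illustration)
toy = pd.DataFrame({
    "PRI_jet_num": [0, 0, 1, 2, 3],
    "DER_mass_vis": [60.1, 75.3, 88.0, 91.2, 102.4],
    "PRI_jet_leading_pt": [MISSING, MISSING, 45.0, 67.2, 80.1],
    "DER_mass_jet_jet": [MISSING, MISSING, MISSING, 124.5, 310.7],
})

subsets = split_by_jet_count(toy)
for jet_count, subset in subsets.items():
    print(jet_count, list(subset.columns))
```

Detecting the all-missing columns from the data, instead of hard-coding their names, keeps the sketch independent of the exact column list; on the real data one would expect it to flag the 10 and 7 columns mentioned above for jet counts 0 and 1.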
Nov-28-2016, 00:50:04 GMT