Jia, Hao
Conditioned quantum-assisted deep generative surrogate for particle-calorimeter interactions
Toledo-Marin, J. Quetzalcoatl, Gonzalez, Sebastian, Jia, Hao, Lu, Ian, Sogutlu, Deniz, Abhishek, Abhishek, Gay, Colin, Paquet, Eric, Melko, Roger, Fox, Geoffrey C., Swiatlowski, Maximilian, Fedorko, Wojciech
Particle collisions at accelerators such as the Large Hadron Collider (LHC), recorded and analyzed by experiments such as ATLAS and CMS, enable exquisite measurements of the Standard Model and searches for new phenomena. Simulations of collision events at these detectors have played a pivotal role in shaping the design of future experiments and analyzing ongoing ones. However, the quest for accuracy in simulating LHC collisions comes at an imposing computational cost, with projections estimating the need for millions of CPU-years annually during the High Luminosity LHC (HL-LHC) run \cite{collaboration2022atlas}. Simulating a single LHC event with Geant4 currently takes around 1000 CPU seconds, with simulations of the calorimeter subdetectors in particular imposing substantial computational demands \cite{rousseau2023experimental}. To address this challenge, we propose a conditioned quantum-assisted deep generative model. Our model integrates a conditioned variational autoencoder (VAE) on the exterior with a conditioned Restricted Boltzmann Machine (RBM) in the latent space, providing enhanced expressiveness compared to conventional VAEs. The RBM nodes and connections are engineered to enable the use of qubits and couplers on D-Wave's Pegasus-structured Advantage quantum annealer (QA) for sampling. We introduce a novel method for conditioning the quantum-assisted RBM using flux biases. We further propose a novel adaptive mapping to estimate the effective inverse temperature in quantum annealers. The effectiveness of our framework is illustrated using Dataset 2 of the CaloChallenge \cite{calochallenge}.
Zephyr quantum-assisted hierarchical Calo4pQVAE for particle-calorimeter interactions
Lu, Ian, Jia, Hao, Gonzalez, Sebastian, Sogutlu, Deniz, Toledo-Marin, J. Quetzalcoatl, Hoque, Sehmimul, Abhishek, Abhishek, Gay, Colin, Melko, Roger, Paquet, Eric, Fox, Geoffrey, Swiatlowski, Maximilian, Fedorko, Wojciech
With the High Luminosity Large Hadron Collider (HL-LHC) set to begin particle collisions by the end of this decade, it is evident that the computational demands of traditional collision simulation methods are becoming increasingly unsustainable. Existing approaches, which rely heavily on first-principles Monte Carlo simulations for modeling event showers in calorimeters, are projected to require millions of CPU-years annually -- far exceeding current computational capacities. This bottleneck presents an exciting opportunity for advancements in computational physics by integrating deep generative models with quantum simulations. We propose a quantum-assisted hierarchical deep generative surrogate founded on a variational autoencoder (VAE) in combination with an energy-conditioned restricted Boltzmann machine (RBM) embedded in the model's latent space as a prior. By mapping the topology of D-Wave's Zephyr quantum annealer (QA) into the nodes and couplings of a 4-partite RBM, we leverage quantum simulation to accelerate our shower generation times significantly. To evaluate our framework, we use Dataset 2 of the CaloChallenge 2022. Through the integration of classical computation and quantum simulation, this hybrid framework paves the way for utilizing large-scale quantum simulations as priors in deep generative models.
CaloQVAE: Simulating high-energy particle-calorimeter interactions using hybrid quantum-classical generative models
Hoque, Sehmimul, Jia, Hao, Abhishek, Abhishek, Fadaie, Mojde, Toledo-Marín, J. Quetzalcoatl, Vale, Tiago, Melko, Roger G., Swiatlowski, Maximilian, Fedorko, Wojciech T.
Department of Physics and Astronomy, University of Waterloo, Ontario N2L 3G1, Canada. The Large Hadron Collider's high luminosity era presents major computational challenges in the analysis of collision events. Large amounts of Monte Carlo (MC) simulation will be required to constrain the statistical uncertainties of the simulated datasets below those of the experimental data. Modelling of high-energy particles propagating through the calorimeter section of the detector is the most computationally intensive MC simulation task. We introduce a technique combining recent advancements in generative models and quantum annealing for fast and efficient simulation of high-energy particle-calorimeter interactions. The Large Hadron Collider (LHC) is the highest energy particle accelerator in the world, and currently collides protons at √s = 13.6 TeV at a rate of 2 × 10 [...] "High-Luminosity LHC" (HL-LHC) dataset will enable significantly more precise measurements of the Higgs boson and other Standard Model particles. [...] particle showers is critical to enable the highest quality measurements, but simulating each shower from first [...] We deploy a restricted Boltzmann machine (RBM) to encode a rich description of particle showers in detectors, and use quantum [...]
Bilingual Terminology Extraction from Comparable E-Commerce Corpora
Jia, Hao, Gu, Shuqin, Zhang, Yuqi, Duan, Xiangyu
Bilingual terminologies are important machine translation resources in the field of e-commerce, which are usually either manually translated or automatically extracted from parallel data. Human translation is costly and e-commerce parallel corpora are very scarce. However, comparable data in different languages in the same commodity field is abundant. In this paper, we propose a novel framework for extracting e-commerce bilingual terminologies from comparable data. Benefiting from cross-lingual pre-training in e-commerce, our framework can make full use of the deep semantic relationship between source-side terminology and target-side sentence to extract the corresponding target terminology. Experimental results on various language pairs show that our approaches achieve significantly better performance than various strong baselines.