Peng, Cheng
Machine learning enabled experimental design and parameter estimation for ultrafast spin dynamics
Chen, Zhantao, Peng, Cheng, Petsch, Alexander N., Chitturi, Sathya R., Okullo, Alana, Chowdhury, Sugata, Yoon, Chun Hong, Turner, Joshua J.
Ever since the discovery of x-rays, considerable breakthroughs have been made using them as a probe of matter, from testing models of the atom to solving the structure of deoxyribonucleic acid (DNA). Over the last few decades, with the proliferation of synchrotron x-ray sources around the world, their application to many scientific fields has progressed tremendously and allowed studies of complicated structures and phenomena such as protein dynamics and crystallography [1, 2], electronic structures of strongly correlated materials [3, 4], and a wide variety of elementary excitations [5, 6]. With the development of the next generation of light sources, especially x-ray free electron lasers (X-FELs) [7, 8], not only have discoveries accelerated, but completely novel techniques have been developed and new fields of science have emerged, such as laboratory astrophysics [9, 10, 11, 12] and single particle diffractive imaging [13, 14, 15]. Among these emerging techniques enabled by X-FELs, the development of x-ray photon fluctuation spectroscopy (XPFS) holds particular relevance for condensed matter and materials physics [16]. XPFS is a unique and powerful approach that opens up numerous opportunities to probe ultrafast dynamics on timescales corresponding to µeV- to meV-scale energies. Because the high degree of coherence of the x-ray beam encodes subtle changes in the system at these timescales, XPFS can investigate fluctuations of elementary excitations, such as those of the spin [17]. The fluctuation spectra collected with this method can be related directly back to correlation functions derived from Hamiltonians [18, 19], yielding invaluable experimental insights for theoretical developments and a deeper understanding of the underlying physics.
A Study of Generative Large Language Model for Medical Research and Healthcare
Peng, Cheng, Yang, Xi, Chen, Aokun, Smith, Kaleb E, PourNejatian, Nima, Costa, Anthony B, Martin, Cheryl, Flores, Mona G, Zhang, Ying, Magoc, Tanja, Lipori, Gloria, Mitchell, Duane A, Ospina, Naykky S, Ahmed, Mustafa M, Hogan, William R, Shenkman, Elizabeth A, Guo, Yi, Bian, Jiang, Wu, Yonghui
There is enormous enthusiasm, as well as concern, about using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. NLP models trained on synthetic text generated by GatorTronGPT outperform NLP models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT versus 6.93 for human text) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT versus 6.97 for human text), and physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
Capturing dynamical correlations using implicit neural representations
Chitturi, Sathya, Ji, Zhurun, Petsch, Alexander, Peng, Cheng, Chen, Zhantao, Plumley, Rajan, Dunne, Mike, Mardanya, Sougata, Chowdhury, Sugata, Chen, Hongwei, Bansil, Arun, Feiguin, Adrian, Kolesnikov, Alexander, Prabhakaran, Dharmalingam, Hayden, Stephen, Ratner, Daniel, Jia, Chunjing, Nashed, Youssef, Turner, Joshua
The observation and description of collective excitations in solids is a fundamental issue when seeking to understand the physics of a many-body system. Analysis of these excitations is usually carried out by measuring the dynamical structure factor, S(Q, $\omega$), with inelastic neutron or x-ray scattering techniques and comparing this against a calculated dynamical model. Here, we develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data. We benchmark this approach on a Linear Spin Wave Theory (LSWT) simulator and advanced inelastic neutron scattering data from the square-lattice spin-1 antiferromagnet La$_2$NiO$_4$. We find that the model predicts the unknown parameters with excellent agreement relative to analytical fitting. In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data, without the need for human-guided peak finding and fitting algorithms. This prototypical approach promises a new technology for this field to automatically detect and refine more advanced models for ordered quantum systems.
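To make the fitting loop concrete, here is a minimal PyTorch sketch of the general idea, not the authors' code: a small network (`SurrogateSQW`, an assumed name) stands in for the trained emulator of the LSWT simulator, and a hypothetical exchange coupling is recovered by gradient descent through the frozen surrogate.

```python
# Hypothetical sketch: recover Hamiltonian parameters through a differentiable surrogate.
import torch
import torch.nn as nn

class SurrogateSQW(nn.Module):
    """Stand-in for a network trained to mimic an LSWT simulator: (Q, omega, params) -> intensity."""
    def __init__(self, n_params=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4 + n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, q_omega, params):
        # q_omega: (N, 4) momentum transfer + energy; params broadcast to every point
        p = params.expand(q_omega.shape[0], -1)
        return self.net(torch.cat([q_omega, p], dim=-1)).squeeze(-1)

surrogate = SurrogateSQW()           # assume its weights were fit to simulated S(Q, omega)
q_omega = torch.rand(256, 4)         # measured (Q, omega) bins (placeholder data)
intensity = torch.rand(256)          # measured intensities (placeholder data)

J_exchange = torch.tensor([0.5], requires_grad=True)   # unknown exchange coupling (assumed)
opt = torch.optim.Adam([J_exchange], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = torch.mean((surrogate(q_omega, J_exchange) - intensity) ** 2)
    loss.backward()                  # gradients flow through the frozen surrogate
    opt.step()
```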
Clinical Concept and Relation Extraction Using Prompt-based Machine Reading Comprehension
Peng, Cheng, Yang, Xi, Yu, Zehao, Bian, Jiang, Hogan, William R., Wu, Yonghui
Objective: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications. Methods: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using two benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models. Results and Conclusion: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the two benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the two datasets by 1%-3% and 0.7%-1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%-2.4% and 10%-11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% on the two datasets, respectively. The proposed method is better at handling nested/overlapping concepts and extracting relations, and has good portability for cross-institution applications.
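As an illustration of the prompt-based MRC formulation (not the GatorTron-MRC pipeline itself), the sketch below casts concept extraction as extractive question answering with Hugging Face transformers; the checkpoint, the example note, and the prompts are placeholders.

```python
# Illustrative sketch only: casting clinical concept extraction as extractive
# machine reading comprehension with a natural-language prompt. The checkpoint
# below is a generic public QA model, not GatorTron-MRC.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

note = "The patient was started on metformin 500 mg twice daily for type 2 diabetes."

# Each concept type becomes a question; the predicted answer span is the extracted concept.
prompts = {
    "Drug": "What drug was the patient given?",
    "Dosage": "What is the dosage of the drug?",
    "Reason": "What is the reason for the medication?",
}

for concept_type, question in prompts.items():
    pred = qa(question=question, context=note)
    print(concept_type, "->", pred["answer"], f"(score={pred['score']:.2f})")
```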
RSBNet: One-Shot Neural Architecture Search for A Backbone Network in Remote Sensing Image Recognition
Peng, Cheng, Li, Yangyang, Shang, Ronghua, Jiao, Licheng
Recently, a massive number of deep learning based approaches have been successfully applied to various remote sensing image (RSI) recognition tasks. However, most existing advances of deep learning methods in the RSI field rely heavily on features extracted by manually designed backbone networks, which severely hinders the potential of deep learning models due to the complexity of RSIs and the limitations of prior knowledge. In this paper, we investigate a new design paradigm for the backbone architecture in RSI recognition tasks, including scene classification, land-cover classification, and object detection. A novel one-shot architecture search framework based on a weight-sharing strategy and an evolutionary algorithm, called RSBNet, is proposed; it consists of three stages. First, a supernet constructed in a layer-wise search space is pretrained on a self-assembled large-scale RSI dataset using an ensemble single-path training strategy. Next, the pretrained supernet is equipped with different recognition heads through a switchable recognition module and fine-tuned on each target dataset to obtain task-specific supernets. Finally, we search for the optimal backbone architecture for each recognition task with the evolutionary algorithm, without any further network training. Extensive experiments on five benchmark datasets covering different recognition tasks show the effectiveness of the proposed search paradigm and demonstrate that the searched backbones flexibly adapt to different RSI recognition tasks and achieve impressive performance.
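A schematic Python sketch of the final, training-free search stage, under assumed names: candidate backbones are encoded as per-layer operation choices, scored by a placeholder evaluate() that in the real framework would reuse the fine-tuned supernet weights on the target RSI dataset, and evolved by mutation.

```python
# Schematic sketch of the evolutionary search stage: candidate backbones are
# per-layer operation choices, evaluated with inherited supernet weights (no
# training) and evolved by mutation. evaluate() is a placeholder here.
import random

OPS = ["conv3x3", "conv5x5", "sep_conv", "skip"]   # assumed layer-wise search space
N_LAYERS = 12

def random_arch():
    return [random.choice(OPS) for _ in range(N_LAYERS)]

def mutate(arch, p=0.1):
    return [random.choice(OPS) if random.random() < p else op for op in arch]

def evaluate(arch):
    # Placeholder: in the real framework this would measure validation accuracy
    # on the target dataset using weights inherited from the fine-tuned supernet.
    return random.random()

population = [random_arch() for _ in range(20)]
for generation in range(30):
    scored = sorted(population, key=evaluate, reverse=True)
    parents = scored[:5]                            # keep the top architectures
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = max(population, key=evaluate)
print("best backbone:", best)
```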
Context-Aware Interaction Network for Question Matching
Hu, Zhe, Fu, Zuohui, Yin, Yu, de Melo, Gerard, Peng, Cheng
Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentences. However, these cross-attention mechanisms focus on word-level links between the two inputs, neglecting the importance of contextual information. We propose a context-aware interaction network (COIN) to properly align two sequences and infer their semantic relationship. Specifically, each interaction block includes (1) a context-aware cross-attention mechanism to effectively integrate contextual information, and (2) a gate fusion layer to flexibly interpolate aligned representations. We apply multiple stacked interaction blocks to produce alignments at different levels and gradually refine the attention results. Experiments on two question matching datasets and detailed analyses confirm the effectiveness of our model.
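A minimal PyTorch sketch of one interaction block as described, with assumed dimensions and layer names: cross-attention aligns one question against the other, and a gate fusion layer interpolates between the original and aligned representations; in the full model, several such blocks are stacked.

```python
# Minimal sketch (assumed dimensions/names) of one interaction block:
# cross-attention aligns sequence A against sequence B, and a gate fusion
# layer interpolates between the original and aligned representations.
import torch
import torch.nn as nn

class InteractionBlock(nn.Module):
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, a, b):
        # a, b: (batch, seq_len, d_model) token representations of the two questions
        aligned, _ = self.cross_attn(query=a, key=b, value=b)
        g = torch.sigmoid(self.gate(torch.cat([a, aligned], dim=-1)))
        return g * a + (1 - g) * aligned            # gated interpolation

block = InteractionBlock()
a = torch.randn(2, 12, 128)   # question 1
b = torch.randn(2, 15, 128)   # question 2
out = block(a, b)             # (2, 12, 128); stacking blocks refines the alignment
```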
A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis
Dai, Zehui, Peng, Cheng, Chen, Huajie, Ding, Yadong
(T)ACSA tasks, including aspect-category sentiment analysis (ACSA) and targeted aspect-category sentiment analysis (TACSA), aim at identifying sentiment polarity toward predefined categories. Incremental learning on new categories is necessary for real-world (T)ACSA applications. Though current multi-task learning models achieve good performance on (T)ACSA tasks, they suffer from catastrophic forgetting in (T)ACSA incremental learning. In this paper, to make multi-task learning feasible for incremental learning, we propose the Category Name Embedding network (CNE-net). We share both the encoder and the decoder among all categories to weaken the catastrophic forgetting problem. Besides the original input sentence, we apply another input feature, i.e., the category name, for task discrimination. Our model achieves state-of-the-art performance on two (T)ACSA benchmark datasets. Furthermore, we propose a dataset for (T)ACSA incremental learning and achieve the best performance compared with other strong baselines.
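The sketch below illustrates the category-name-as-input idea with a generic Hugging Face sentence-pair classifier; the checkpoint, the three-way label set, and the example categories are placeholders rather than the paper's exact CNE-net setup.

```python
# Illustrative sketch: feeding the category name as a second input so a single
# shared encoder can serve all categories. Checkpoint and labels are placeholders.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)   # negative / neutral / positive (assumed)

sentence = "The battery lasts all day but the screen is too dim."
for category in ["battery", "display"]:
    # Sentence pair: [CLS] sentence [SEP] category name [SEP]
    inputs = tokenizer(sentence, category, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(category, "->", logits.softmax(-1).tolist())
```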
Generative Tensor Network Classification Model for Supervised Machine Learning
Sun, Zheng-Zhi, Peng, Cheng, Liu, Ding, Ran, Shi-Ju, Su, Gang
Tensor networks (TNs) have recently triggered extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train a generative TN for each class of samples to construct the classifiers. Classification is implemented by comparing distances in the many-body Hilbert space. Numerical experiments show that GTNC achieves impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network and higher than the naive Bayes classifier (a generative classifier) and the support vector machine. Moreover, GTNC is more efficient than existing TN models, which are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples naturally cluster in such a space, and (b) bounding the bond dimensions of the TNs to finite values corresponds to removing redundant information in the image recognition task. These two characteristics make GTNC an adaptive and universal model with excellent performance.
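A toy numpy sketch of the classification rule, assuming overlap (fidelity) as the Hilbert-space distance measure and random tensors standing in for trained generative matrix product states (MPSs); it is meant only to show how a feature-mapped product state is compared against one tensor network per class.

```python
# Toy sketch: each class is modeled by a generative MPS (random here, standing
# in for a trained one), each image is feature-mapped to a product state, and
# the predicted class is the one whose MPS has the largest overlap with it.
import numpy as np

N_PIXELS, BOND_DIM = 64, 8

def feature_map(pixels):
    # Map each pixel value in [0, 1] to a 2-dimensional local state.
    theta = np.pi * pixels / 2
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)   # (N, 2)

def random_mps(n_sites, d=2, chi=BOND_DIM, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    dims = [1] + [chi] * (n_sites - 1) + [1]
    return [rng.normal(size=(dims[i], d, dims[i + 1])) for i in range(n_sites)]

def overlap(mps, product_state):
    # <MPS | product state>, contracted site by site.
    env = np.ones((1,))
    for tensor, local in zip(mps, product_state):
        site = np.tensordot(tensor, local, axes=([1], [0]))     # (D_left, D_right)
        env = env @ site
    return float(env.squeeze())

class_mps = {label: random_mps(N_PIXELS, rng=np.random.default_rng(label))
             for label in range(10)}                            # one generative MPS per class

image = np.random.default_rng(1).random(N_PIXELS)               # placeholder "image"
state = feature_map(image)
prediction = max(class_mps, key=lambda c: abs(overlap(class_mps[c], state)))
print("predicted class:", prediction)
```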
Machine Learning by Two-Dimensional Hierarchical Tensor Networks: A Quantum Information Theoretic Perspective on Deep Architectures
Liu, Ding, Ran, Shi-Ju, Wittek, Peter, Peng, Cheng, García, Raul Blázquez, Su, Gang, Lewenstein, Maciej
The resemblance between the methods used in studying quantum many-body physics and in machine learning has drawn considerable attention. In particular, tensor networks (TNs) and deep learning architectures bear striking similarities, to the extent that TNs can be used for machine learning. Previous results used one-dimensional TNs for image recognition, showing limited scalability and a high bond dimension. In this work, we train two-dimensional hierarchical TNs to solve image recognition problems, using a training algorithm derived from the multipartite entanglement renormalization ansatz (MERA). This approach overcomes scalability issues and implies novel mathematical connections among quantum many-body physics, quantum information theory, and machine learning. By keeping the TN unitary in the training phase, TN states can be defined that optimally encode each class of images into a quantum many-body state. We study the quantum features of these TN states, including quantum entanglement and fidelity, and suggest that these quantities could be novel properties characterizing the image classes, as well as the machine learning tasks. Our work could be further applied to identifying possible quantum properties of certain artificial intelligence methods.
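For intuition, the following numpy sketch contracts a small two-dimensional hierarchical (tree-like) TN over an 8x8 toy image: each layer merges 2x2 blocks of local vectors until a single class-score vector remains. The tensors here are random placeholders; in the trained model they would be the isometric tensors optimized by the MERA-derived algorithm.

```python
# Schematic sketch of the hierarchical contraction in a 2D tree-like TN
# classifier: each layer merges 2x2 blocks of local vectors with a 5-leg
# tensor until one vector of class scores remains. Tensors are random
# placeholders for trained, isometric ones.
import numpy as np

rng = np.random.default_rng(0)

def feature_map(image):
    theta = np.pi * image / 2
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)       # (H, W, 2)

def coarse_grain(grid, tensor):
    # grid: (H, W, d); tensor: (d, d, d, d, d_out) merging each 2x2 block.
    H, W, d = grid.shape
    out = np.empty((H // 2, W // 2, tensor.shape[-1]))
    for i in range(0, H, 2):
        for j in range(0, W, 2):
            block = [grid[i, j], grid[i, j + 1], grid[i + 1, j], grid[i + 1, j + 1]]
            out[i // 2, j // 2] = np.einsum("a,b,c,d,abcde->e", *block, tensor)
    return out

image = rng.random((8, 8))                 # placeholder 8x8 "image"
grid = feature_map(image)
layers = [rng.normal(size=(2, 2, 2, 2, 4)),
          rng.normal(size=(4, 4, 4, 4, 4)),
          rng.normal(size=(4, 4, 4, 4, 10))]   # last layer outputs 10 class scores
for tensor in layers:
    grid = coarse_grain(grid, tensor)      # 8x8 -> 4x4 -> 2x2 -> 1x1
scores = grid[0, 0]
print("predicted class:", int(scores.argmax()))
```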