Large Language Model
Explaining Patterns in Data with Language Models via Interpretable Autoprompting
Singh, Chandan, Morris, John X., Aneja, Jyoti, Rush, Alexander M., Gao, Jianfeng
Large language models (LLMs) have displayed an impressive ability to harness natural language to perform complex tasks. In this work, we explore whether we can leverage this learned ability to find and explain patterns in data. Specifically, given a pre-trained LLM and data examples, we introduce interpretable autoprompting (iPrompt), an algorithm that generates a natural-language string explaining the data. iPrompt iteratively alternates between generating explanations with an LLM and reranking them based on their performance when used as a prompt. Experiments on a wide range of datasets, from synthetic mathematics to natural-language understanding, show that iPrompt can yield meaningful insights by accurately finding ground-truth dataset descriptions. Moreover, the prompts produced by iPrompt are simultaneously human-interpretable and highly effective for generalization: on real-world sentiment classification datasets, iPrompt produces prompts that match or even improve upon human-written prompts for GPT-3. Finally, experiments with an fMRI dataset show the potential for iPrompt to aid in scientific discovery. All code for using the methods and data here is made available on GitHub.
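The generate-then-rerank loop described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the dataset, the `RULES` table (a stand-in for "an LLM following a prompt"), and the `propose`, `score`, and `iprompt_sketch` functions are all hypothetical names invented for this example, and no real LLM is called.

```python
from typing import Callable, Dict, List, Tuple

# Toy dataset: each input is a pair of numbers; the hidden rule is addition.
DATA: List[Tuple[Tuple[int, int], int]] = [((2, 3), 5), ((4, 1), 5), ((7, 2), 9)]

# Hypothetical stand-in for an LLM that follows a prompt:
# each candidate explanation maps to the rule it describes.
RULES: Dict[str, Callable[[int, int], int]] = {
    "add the two numbers": lambda a, b: a + b,
    "multiply the two numbers": lambda a, b: a * b,
    "subtract the second number from the first": lambda a, b: a - b,
    "return the larger number": lambda a, b: max(a, b),
}


def propose(pool: List[str], n_new: int = 2) -> List[str]:
    """Stand-in for LLM generation: return up to n_new explanations not in the pool."""
    return [c for c in RULES if c not in pool][:n_new]


def score(explanation: str) -> float:
    """Accuracy on DATA when the explanation is used as the prompt."""
    rule = RULES[explanation]
    hits = sum(rule(a, b) == y for (a, b), y in DATA)
    return hits / len(DATA)


def iprompt_sketch(seeds: List[str], n_iters: int = 3, keep_top: int = 2) -> List[Tuple[str, float]]:
    """Alternate between proposing explanations and reranking them by score."""
    pool = list(seeds)
    for _ in range(n_iters):
        candidates = set(pool) | set(propose(pool))
        # Rerank: highest score first, ties broken alphabetically for determinism.
        ranked = sorted(candidates, key=lambda c: (-score(c), c))
        pool = ranked[:keep_top]
    return [(c, score(c)) for c in pool]


result = iprompt_sketch(seeds=["return the larger number"])
best_explanation, best_score = result[0]
print(best_explanation, best_score)
```

In this toy setting the top-ranked explanation is the one that reproduces every example. In a real system both the proposal step and the scoring step would query a language model.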
Variational Latent-State GPT for Semi-Supervised Task-Oriented Dialog Systems
Liu, Hong, Cai, Yucheng, Lin, Zhenru, Ou, Zhijian, Huang, Yi, Feng, Junlan
Recently, two approaches, fine-tuning large pre-trained language models and variational training, have separately attracted significant interest for semi-supervised end-to-end task-oriented dialog (TOD) systems. In this paper, we propose the Variational Latent-State GPT model (VLS-GPT), which is the first to combine the strengths of the two approaches. Among many options of models, we propose the generative model and the inference model for variational learning of the end-to-end TOD system, both as auto-regressive language models based on GPT-2, which can be further trained over a mix of labeled and unlabeled dialog data in a semi-supervised manner. Variational training of VLS-GPT is both statistically and computationally more challenging than previous variational learning works for sequential latent variable models, which use turn-level first-order Markovian models. The inference model in VLS-GPT is non-Markovian due to the use of the Transformer architecture. In this work, we establish Recursive Monte Carlo Approximation (RMCA) to the variational objective with a non-Markovian inference model and prove its unbiasedness. Further, we develop the computational strategy of sampling-then-forward-computation to realize RMCA, which successfully overcomes the memory explosion issue of using GPT in variational learning and speeds up training. Semi-supervised TOD experiments are conducted on two benchmark multi-domain datasets of different languages, MultiWOZ2.1 and CrossWOZ. VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised self-training baselines.
Review of Natural Language Processing in Pharmacology
Trajanov, Dimitar, Trajkovski, Vangel, Dimitrieva, Makedonka, Dobreva, Jovana, Jovanovik, Milos, Klemen, Matej, Žagar, Aleš, Robnik-Šikonja, Marko
Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process human language, understand it to a certain degree, and use it in various applications. This area has rapidly developed in the last few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adverse drug interactions in social media. We split our coverage into five categories to survey modern NLP methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We split each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in tabular form. The resulting survey presents a comprehensive overview of the area, useful to practitioners and interested observers.
Task formulation for Extracting Social Determinants of Health from Clinical Narratives
Torii, Manabu, Finn, Ian M., Doan, Son, Wang, Paul, Yang, Elly W., Zisook, Daniel S.
Objective: The 2022 n2c2 NLP Challenge posed identification of social determinants of health (SDOH) in clinical narratives. We present three systems that we developed for the Challenge and discuss the distinctive task formulation used in each of the three systems. Materials and Methods: The first system identifies target pieces of information independently using machine learning classifiers. The second system uses a large language model (LLM) to extract complete structured outputs per document. The third system extracts candidate phrases using machine learning and identifies target relations with hand-crafted rules. Results: The three systems achieved F1 scores of 0.884, 0.831, and 0.663 in Subtask A of the Challenge, ranking third, seventh, and eighth among the 15 participating teams. A review of the extraction results from our systems reveals characteristics of each approach and of the SDOH extraction task. Discussion: The phrases and relations annotated in the task are unique and diverse, and do not conform to the conventional event extraction task. These annotations are difficult to model with limited training data. The system that extracts information independently, ignoring the annotated relations, achieves the highest F1 score. Meanwhile, the LLM, with its versatile capability, achieves a high F1 score while respecting the annotated relations. The rule-based system tackling relation extraction obtains the lowest F1 score, but it is the most explainable approach. Conclusion: The F1 scores of the three systems vary in this challenge setting, but each approach has advantages and disadvantages in a practical application. The selection of the approach depends not only on the F1 score but also on the requirements of the application.
Medical AIs are advancing - when will they be in a clinic near you?
HOW would you feel if your doctor, rather than consult their own clinical knowledge, turned instead to an AI trained on your medical history to help diagnose your next ailment or write your next prescription? These sorts of scenarios have been hypothetical for decades: the technology has been subpar and the stakes too high to risk offloading medical advice to a machine. However, the success of large language models like ChatGPT, a popular, artificially intelligent chatbot from the OpenAI research lab, has led to a rethink of what might be possible.
ChatGPT can find and fix bugs in computer code
ChatGPT, the AI chatbot developed by tech company OpenAI, can find and fix bugs in computer code as well as standard machine learning approaches, and does even better when engaged in conversation. Dominik Sobania at Johannes Gutenberg University in Mainz, Germany, and his colleagues sought to see how well ChatGPT compared with other AI-powered coding support tools. A number of tools exist that use artificial intelligence to check programming code to ensure there are no mistakes. "ChatGPT came out and we thought it seems …
ChatGPT could be a game-changer for marketers, but it won't replace humans any time soon
The release of the ChatGPT chatbot in November 2022 has generated significant public interest. In essence, ChatGPT is an AI-powered chatbot allowing users to simulate human-like conversations with an AI. GPT stands for Generative Pre-trained Transformer, a language processing model developed by the American artificial intelligence company OpenAI. The GPT language model uses deep learning to produce human-like responses. Deep learning is a branch of machine learning that involves training artificial neural networks to mimic the complexity of the human brain. ChatGPT has a user-friendly interface that utilizes this technology, allowing users to interact with it in a conversational manner.
Microsoft Surface sales are tanking, Microsoft says
With soaring cloud revenues, plunging Windows and device revenues, and a few days into a substantial layoff, Microsoft's first-quarter results feel a bit like a quote from Dickens. The best of times: "The next major wave of computing is being born," as Microsoft reported 31 percent revenue growth in its Intelligent Cloud business, a day after Microsoft invested again in OpenAI and its chat service, ChatGPT. The worst of times: Windows OEM revenue sank 39 percent, thanks to a tanking PC market; Microsoft's Devices (Surface) revenue fell the same amount, thanks to issues launching products, reduced demand, and success a year ago. In the end, it all sort of came out in the wash, however, with net income down 12 percent to $16.4 billion and revenue sinking 2 percent to $52.7 billion. Microsoft reported $14.2 billion in revenue in More Personal Computing, its consumer business, down 19 percent, but 18 percent growth to $21.5 billion in Intelligent Cloud and 7 percent growth in Productivity and Business Processes, Microsoft's Office business.
Explaining Large Language Model-Based Neural Semantic Parsers (Student Abstract)
Rai, Daking, Zhou, Yilun, Wang, Bailin, Yao, Ziyu
While large language models (LLMs) have demonstrated strong capability in structured prediction tasks such as semantic parsing, little research has explored the underlying mechanisms of their success. Our work studies different methods for explaining an LLM-based semantic parser and qualitatively discusses the explained model behaviors, hoping to inspire future research toward better understanding them.