Large Language Model
No, That's Wrong: Google's Bard AI Demo Spouts Incorrect Info
As Google's AI-powered Bard prepares to compete against ChatGPT, don't count on the chatbot programs always being right: A recent demo of Bard shows it spouting inaccurate information. Bard, which Google announced on Monday, is slated to arrive in the coming weeks. To promote the AI program, the company posted a GIF on social media that shows Bard answering a question about what new discoveries NASA's James Webb Space Telescope has made. The program lists three discoveries the space telescope made in an easy-to-read, bulleted format. Hence, through Bard, a user can quickly learn information, without having to scroll through a long list of search results to find the applicable site.
Which Sectors Are Working With OpenAI?
While OpenAI rose to fame with the release of ChatGPT in November 2022, the U.S.-based artificial intelligence research and deployment company is about much more than its popular AI-powered chatbot. In fact, as Statista's Felix Richter reports below, OpenAI's technology is already being used by hundreds of companies around the world. According to data published by the enterprise software platform Enterprise Apps Today, companies in the technology and education sectors are most likely to take advantage of OpenAI's solutions, while business services, manufacturing and finance are also high on the list of industries utilizing artificial intelligence in their business processes. Broadly defined as "the theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages," artificial intelligence (AI) can now be found in various applications, including, for example, web search, natural language translation, recommendation systems, voice recognition and autonomous driving. In healthcare, AI can help synthesize large volumes of clinical data to gain a holistic view of the patient, but it's also used in robotics for surgery, nursing, rehabilitation and orthopedics.
Guide to ChatGPT - by Alex McFarland - AI Disruption
So this week, I wanted to give you guys something a little different. The weekly AI Disruption will continue next week. I hope this guide proves useful! One of the hottest topics in the field of AI right now is ChatGPT, short for Chat Generative Pre-trained Transformer. This powerful tool is being used across industries and for many use cases.
10 Easy Ways to Fix Chat GPT Not Working - TechPP
ChatGPT has taken the Internet by storm. People all over the world use ChatGPT to generate ideas for content, essays, emails, and code, and to solve math problems. The AI chatbot surpassed 100 million users in less than two months, making it the fastest-growing consumer internet app ever. It's normal for any app or website to struggle with such a large user base. We, too, have often run into problems when using ChatGPT.
Flexible, Model-Agnostic Method for Materials Data Extraction from Text Using General Purpose Language Models
Polak, Maciej P., Modi, Shrey, Latosinska, Anna, Zhang, Jinming, Wang, Ching-Wen, Wang, Shanonan, Hazra, Ayan Deep, Morgan, Dane
Accurate and comprehensive material databases extracted from research papers are critical for materials science and engineering but require significant human effort to develop. In this paper we present a simple method of extracting materials data from full texts of research papers suitable for quickly developing modest-sized databases. The method requires minimal to no coding, prior knowledge about the extracted property, or model training, and provides high recall and almost perfect precision in the resultant database. The method is fully automated except for one human-assisted step, which typically requires just a few hours of human labor. The method builds on top of natural language processing and large general language models but can work with almost any such model. The language models GPT-3/3.5, BART and DeBERTaV3 are evaluated here for comparison. We provide a detailed analysis of the method's performance in extracting bulk modulus data, obtaining up to 90% precision at 96% recall, depending on the amount of human effort involved. We then demonstrate the method's broader effectiveness by developing a database of critical cooling rates for metallic glasses.
In-Context Learning with Many Demonstration Examples
Li, Mukai, Gong, Shansan, Feng, Jiangtao, Xu, Yiheng, Zhang, Jun, Wu, Zhiyong, Kong, Lingpeng
Large pre-trained language models (PLMs) have shown promising in-context learning abilities. However, due to the backbone transformer architecture, existing PLMs are bottlenecked by memory and computational cost when scaling up to a large context size, leaving instruction tuning and in-context learning with many demonstration examples, as well as long-range language modeling, under-explored. In this study, we propose a long-range language model, EVALM, based on an efficient transformer mechanism. EVALM is trained with 8k tokens per batch line and can be tested on contexts of up to 256k tokens with extrapolation, 128 times the limit of existing PLMs (e.g., GPT-3). Based on EVALM, we scale up the number of examples efficiently in both instruction tuning and in-context learning to explore the boundary of the benefits from more annotated data. Experimental results on a diverse set of tasks show that EVALM achieves 4.1% higher accuracy on average, and that the average context length at which the best accuracy is achieved across tasks is around 12k. We find that in-context learning can achieve higher performance with more demonstrations under many-shot instruction tuning (8k), and that further extending the length of instructions (16k) can further improve the upper bound of scaling in-context learning.
Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search?
Wang, Shuai, Scells, Harrisen, Koopman, Bevan, Zuccon, Guido
Systematic reviews are comprehensive reviews of the literature for a highly focused research question. These reviews are often treated as the highest form of evidence in evidence-based medicine, and are the key strategy to answer research questions in the medical field. To create a high-quality systematic review, complex Boolean queries are often constructed to retrieve studies for the review topic. However, it often takes a long time for systematic review researchers to construct a high-quality systematic review Boolean query, and often the resulting queries are far from effective. Poor queries may lead to biased or invalid reviews, because they fail to retrieve key evidence, or to an extensive increase in review costs, because they retrieve too many irrelevant studies. Recent advances in Transformer-based generative models have shown great potential to effectively follow instructions from users and generate answers based on those instructions. In this paper, we investigate the effectiveness of the latest of such models, ChatGPT, in generating effective Boolean queries for systematic review literature search. Through a number of extensive experiments on standard test collections for the task, we find that ChatGPT is capable of generating queries that lead to high search precision, although at the cost of recall. Overall, our study demonstrates the potential of ChatGPT in generating effective Boolean queries for systematic review literature search. The ability of ChatGPT to follow complex instructions and generate queries with high precision makes it a valuable tool for researchers conducting systematic reviews, particularly for rapid reviews where time is a constraint and trading lower recall for higher precision is often acceptable.
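The precision/recall trade-off described in this abstract can be illustrated with a minimal Python sketch. The study identifiers and the two result sets below are invented for illustration only; they are not data from the paper, and the helper name `precision_recall` is an assumption.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query's result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved studies that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant studies that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review topic with four relevant studies.
relevant = {"s1", "s2", "s3", "s4"}

# A narrow Boolean query: few results, all of them relevant.
narrow_query = ["s1", "s2"]

# A broad Boolean query: finds everything, plus irrelevant studies.
broad_query = ["s1", "s2", "s3", "s4", "x1", "x2", "x3", "x4"]

print(precision_recall(narrow_query, relevant))
print(precision_recall(broad_query, relevant))
```

In this toy case the narrow query screens no irrelevant studies but misses half of the evidence, while the broad query finds all of it at double the screening cost. That is the kind of trade-off the abstract says may be acceptable for rapid reviews.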
Offsite-Tuning: Transfer Learning without Full Model
Xiao, Guangxuan, Lin, Ji, Han, Song
Transfer learning is important for foundation models to adapt to downstream tasks. However, many foundation models are proprietary, so users must share their data with model owners to fine-tune the models, which is costly and raises privacy concerns. Moreover, fine-tuning large foundation models is computation-intensive and impractical for most downstream users. In this paper, we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning framework that can adapt billion-parameter foundation models to downstream data without access to the full model. In offsite-tuning, the model owner sends a light-weight adapter and a lossy compressed emulator to the data owner, who then fine-tunes the adapter on the downstream data with the emulator's assistance. The fine-tuned adapter is then returned to the model owner, who plugs it into the full model to create an adapted foundation model. Offsite-tuning preserves both parties' privacy and is computationally more efficient than the existing fine-tuning methods that require access to the full model weights. We demonstrate the effectiveness of offsite-tuning on various large language and vision foundation models. Offsite-tuning can achieve accuracy comparable to full model fine-tuning while being privacy-preserving and efficient, achieving 6.5x speedup and 5.6x memory reduction. Code is available at https://github.com/mit-han-lab/offsite-tuning.
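The exchange between model owner and data owner described in this abstract can be traced with a toy Python sketch. Everything here is an assumption made for illustration: the model is just a list of layer names, the adapter is taken to be the outermost layers, the "lossy compression" is simulated by dropping middle layers, and `fine_tune` is a placeholder that only marks layers as tuned. It is not the implementation in the linked repository.

```python
def split_model(layers, n_adapter):
    """Split into (bottom adapter, frozen middle, top adapter).

    n_adapter must be at least 1.
    """
    return layers[:n_adapter], layers[n_adapter:-n_adapter], layers[-n_adapter:]


def make_emulator(middle, keep_every):
    """Stand-in for lossy compression: keep every k-th middle layer."""
    return middle[::keep_every]


def fine_tune(adapter):
    """Placeholder for training on private data: tag layers as tuned."""
    return [name + "*" for name in adapter]


# Model owner: holds the full 12-layer model.
full_model = [f"layer{i}" for i in range(12)]
bottom, middle, top = split_model(full_model, n_adapter=2)
emulator = make_emulator(middle, keep_every=2)

# Data owner: receives only the adapter and the emulator (8 layers here),
# and tunes the adapter layers on its own data.
data_owner_model = bottom + emulator + top
tuned_bottom, tuned_top = fine_tune(bottom), fine_tune(top)

# Model owner: plugs the returned adapter into the full model.
adapted_model = tuned_bottom + middle + tuned_top
print(adapted_model)
```

Note what crosses the boundary in this sketch: the data owner never sees the full middle section, and the model owner never sees the downstream data, only the tuned adapter layers.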
Global Constraints with Prompting for Zero-Shot Event Argument Classification
Lin, Zizheng, Zhang, Hongming, Song, Yangqiu
Determining the role of event arguments is a crucial subtask of event extraction. Most previous supervised models leverage costly annotations, which is not practical for open-domain applications. In this work, we propose to use global constraints with prompting to effectively tackle event argument classification without any annotation or task-specific training. Specifically, given an event and its associated passage, the model first creates several new passages by prefix prompts and cloze prompts, where prefix prompts indicate event type and trigger span, and cloze prompts connect each candidate role with the target argument span. Then, a pre-trained language model scores the new passages, making the initial prediction. Our novel prompt templates can easily adapt to all events and argument types without manual effort. Next, the model regularizes the prediction by global constraints exploiting cross-task, cross-argument, and cross-event relations. Extensive experiments demonstrate our model's effectiveness: it outperforms the best zero-shot baselines by 12.5% and 10.9% F1 on ACE and ERE with given argument spans and by 4.3% and 3.3% F1, respectively, without given argument spans. We have made our code publicly available.
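The first stage described in this abstract (build one new passage per candidate role, score each, take the best as the initial prediction) might look roughly like the Python sketch below. The prompt wording, the function names, and the scorer are all hypothetical: the abstract does not give the actual templates, and `toy_scorer` is a hard-coded stand-in for a pre-trained language model. The global-constraint stage is not covered.

```python
def build_prompts(passage, event_type, trigger, argument, roles):
    """Create one new passage per candidate role.

    The prefix names the event type and trigger; the cloze sentence
    links the argument span to a candidate role.
    """
    prefix = f"This is a {event_type} event triggered by '{trigger}'. "
    return {
        role: f"{prefix}{passage} {argument} is the {role}."
        for role in roles
    }


def predict_role(prompts, scorer):
    """Initial prediction: the role whose passage scores highest."""
    return max(prompts, key=lambda role: scorer(prompts[role]))


def toy_scorer(text):
    """Stand-in for a language model score (assumption, not a real LM)."""
    return 1.0 if text.endswith("is the attacker.") else 0.0


prompts = build_prompts(
    passage="Rebels shelled the town on Monday.",
    event_type="Attack",
    trigger="shelled",
    argument="Rebels",
    roles=["attacker", "target", "place"],
)
print(predict_role(prompts, toy_scorer))
```

In a real setting the scorer would be replaced by a language model's likelihood for each candidate passage, which is what lets this step run without annotated training data.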
NLP-based Decision Support System for Examination of Eligibility Criteria from Securities Prospectuses at the German Central Bank
Hänig, Christian, Schlösser, Markus, Hamotskyi, Serhii, Zambaku, Gent, Blankenburg, Janek
As part of its digitization initiative, the German Central Bank (Deutsche Bundesbank) wants to examine the extent to which Natural Language Processing (NLP) can be used to make independent decisions upon the eligibility criteria of securities prospectuses. Every month, the Directorate General Markets at the German Central Bank receives hundreds of scanned prospectuses in PDF format, which must be manually processed to decide upon their eligibility. We found that this tedious and time-consuming process can be (semi-)automated by employing modern NLP model architectures, which learn the linguistic feature representation in text to identify the eligible and ineligible criteria present. The proposed Decision Support System provides decisions on document-level eligibility criteria accompanied by human-understandable explanations of the decisions. The aim of this project is to model the described use case and to evaluate the extent to which current research results from the field of NLP can be applied to this problem. After creating a heterogeneous domain-specific dataset containing annotations of eligible and non-eligible mentions of relevant criteria, we were able to successfully build, train and deploy a semi-automatic decider model. This model is based on transformer-based language models and decision trees, which integrate the established rule-based parts of the decision processes. Results suggest that it is possible to efficiently model the problem and automate decision making to more than 90% for many of the considered eligibility criteria.