 Large Language Model


Is GPT-3 a Good Data Annotator?

arXiv.org Artificial Intelligence

The democratization of artificial intelligence (AI) (Garvey, 2018; Rubeis et al., 2022) aims to provide access to AI technologies to all members of society, including individuals, small- and medium-sized enterprises (SMEs), academic research labs, and nonprofit organizations. Achieving this goal is crucial for the promotion of innovation, economic growth, and fairness and equality. As typical AI models are usually data-hungry, one significant obstacle of AI democratization is the preparation of ... Evaluations show that GPT-3 has gained through pretraining a surprisingly wide range of knowledge, which can be transferred to downstream tasks through knowledge distillation (Kim et al., 2022). We present some examples in Appendix A.12. Due to the model architecture and pretraining tasks designed for auto-regressive generation, GPT-3 is capable of generating human-like text and performing a broad array of NLP tasks, such as machine translation, summarization, and question-answering.


DOC: Improving Long Story Coherence With Detailed Outline Control

arXiv.org Artificial Intelligence

We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. DOC consists of two complementary components: a detailed outliner and a detailed controller. The detailed outliner creates a more detailed, hierarchically structured outline, shifting creative burden from the main drafting procedure to the planning stage. The detailed controller ensures the more detailed outline is still respected during generation by controlling story passages to align with outline details. In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%). Humans also judged DOC to be much more controllable in an interactive generation setting.


OpenAI reportedly warned Microsoft about rushing GPT-4 integration into Bing

Engadget

OpenAI warned Microsoft early this year about rushing the integration of GPT-4 into Bing without further training, according to The Wall Street Journal. Although Microsoft forged ahead anyway, the alert proved prescient as early users noticed "unhinged" behavior in the Bing AI tool. Rather than buying OpenAI outright, Microsoft invested in a 49-percent stake in the artificial intelligence startup, a strategy designed to help it avoid antitrust scrutiny. The arrangement gave Microsoft early access to OpenAI's ChatGPT and DALL-E 2 to boost its Bing search engine. In addition, Microsoft is adding OpenAI-powered Copilot to Office and other software products as rival Google scrambles to catch up.



Homework will 'never be the same' says ChatGPT founder

The Japan Times

Artificial intelligence tools will revolutionize education like calculators did, but will not supplant learning, ChatGPT's founder Sam Altman told students in Tokyo on Monday, defending the new technology. "Probably take-home essays are never going to be quite the same again," the OpenAI chief said in remarks at Keio University. "We have a new tool in education. Sort of like a calculator for words," he said. "And the way we teach people is going to have to change and the way we evaluate students is going to have to change."


Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models

arXiv.org Artificial Intelligence

Recent advances in zero-shot learning have enabled the use of paired image-text data in place of structured labels, removing the need for expert-annotated datasets. Models such as the CLIP-based CheXzero apply these advances to chest X-ray interpretation. We hypothesize that domain pre-trained models such as CXR-BERT, BlueBERT, and ClinicalBERT can improve the performance of CLIP-like models by supplying specific domain knowledge: their weights replace the BERT weights, at the cost of breaking the original model's alignment. We evaluate the performance of zero-shot classification models with domain-specific pre-training for detecting low-prevalence pathologies. Even though replacing the weights of the original CLIP-BERT degrades model performance on commonly found pathologies, we show that pre-trained text towers perform substantially better on low-prevalence diseases. This motivates future ensemble models that combine differently trained language models for maximal performance.
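The zero-shot classification step in CLIP-like models can be illustrated with a small sketch. It is not the CheXzero implementation: the function names, the toy embeddings, the prompt labels, and the temperature value are all assumptions made for illustration. In a real system the vectors would come from an image encoder and a text tower such as the ones named above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_scores(image_emb, prompt_embs, temperature=0.07):
    """Softmax over temperature-scaled similarities to each text prompt.

    `temperature` is an assumed value, not one taken from the paper.
    """
    logits = {label: cosine(image_emb, emb) / temperature
              for label, emb in prompt_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Hypothetical embeddings: stand-ins for encoder outputs.
image_emb = [0.9, 0.1, 0.2]
prompt_embs = {
    "pneumothorax": [1.0, 0.0, 0.1],
    "no pneumothorax": [0.0, 1.0, 0.3],
}

scores = zero_shot_scores(image_emb, prompt_embs)
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

No labeled training examples are involved: swapping the text tower changes only how the prompt embeddings are produced, which is why alignment with the image encoder matters.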


Assessing the Effectiveness of GPT-3 in Detecting False Political Statements: A Case Study on the LIAR Dataset

arXiv.org Artificial Intelligence

The detection of false political statements is crucial for maintaining information integrity and preventing the spread of misinformation in society. Historically, state-of-the-art machine learning models employed various methods for detecting deceptive statements. These methods include the use of metadata (W. Wang et al., 2018), n-gram analysis (Singh et al., 2021), and linguistic (Wu et al., 2022) and stylometric (Islam et al., 2020) features. Recent large language models, such as GPT-3 (Brown et al., 2020), have achieved state-of-the-art performance on a wide range of tasks. In this study, we conducted experiments with GPT-3 on the LIAR dataset (W. Wang et al., 2018) and achieved higher accuracy than state-of-the-art models without using any additional meta or linguistic features. Additionally, we experimented with zero-shot learning using a carefully designed prompt and achieved near state-of-the-art performance. An advantage of this approach is that the model provided evidence for its decision, which adds transparency to the model's decision-making and offers a chance for users to verify the validity of the evidence provided.


Language models are not naysayers: An analysis of language models on negation benchmarks

arXiv.org Artificial Intelligence

Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger-sized auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundamental linguistic phenomenon that is central to language understanding. We evaluate different LLMs, including the open-source GPT-Neo, GPT-3, and InstructGPT, against a wide range of negation benchmarks. Through systematic experimentation with varying model sizes and prompts, we show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.


Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training

arXiv.org Artificial Intelligence

The surge in multimodal AI's success has sparked concerns over data privacy in vision-and-language tasks. While CLIP has revolutionized multimodal learning through joint training on images and text, its potential to unintentionally disclose sensitive information necessitates the integration of privacy-preserving mechanisms. We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model that effectively addresses privacy concerns while retaining accuracy. Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets encompassing diverse vision-and-language tasks such as image classification and visual question answering. We demonstrate that our approach retains performance on par with the standard non-private CLIP model. Furthermore, we analyze our proposed algorithm under linear representation settings. We derive the convergence rate of our algorithm and show a trade-off between utility and privacy when gradients are clipped per-batch and the loss function does not satisfy smoothness conditions assumed in the literature for the analysis of DP-SGD.
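The clip-then-add-noise step that DP-SGD-style training relies on, applied here to a whole batch gradient, can be sketched as follows. This is a minimal illustration rather than the Dp-CLIP algorithm itself: the function names, the Gaussian noise scale (noise multiplier times the clipping norm), and the toy gradient are assumptions, and privacy accounting is omitted.

```python
import math
import random


def clip_by_norm(grad, max_norm):
    """Rescale `grad` so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(grad)
    scale = min(1.0, max_norm / norm)
    return [g * scale for g in grad]


def private_batch_gradient(batch_grad, max_norm, noise_multiplier, rng):
    """Clip the batch gradient, then add Gaussian noise to each coordinate.

    The noise standard deviation (noise_multiplier * max_norm) is an
    assumed convention for this sketch.
    """
    clipped = clip_by_norm(batch_grad, max_norm)
    sigma = noise_multiplier * max_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
toy_grad = [3.0, 4.0]   # hypothetical batch gradient with L2 norm 5

clipped = clip_by_norm(toy_grad, max_norm=1.0)
noisy = private_batch_gradient(toy_grad, max_norm=1.0,
                               noise_multiplier=1.1, rng=rng)
print(clipped, noisy)
```

A smaller clipping norm or a larger noise multiplier strengthens the privacy guarantee but distorts the gradient more, which is the utility and privacy trade-off the abstract refers to.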


INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation

arXiv.org Artificial Intelligence

We introduce a method that dramatically reduces fine-tuning VRAM requirements and rectifies quantization errors in quantized Large Language Models. First, we develop an extremely memory-efficient fine-tuning (EMEF) method for quantized models using Low-Rank Adaptation (LoRA), and drawing upon it, we construct an error-correcting algorithm designed to minimize errors induced by the quantization process. Our method reduces the memory requirements by up to 5.6 times, which enables fine-tuning a 7-billion-parameter Large Language Model (LLM) on consumer laptops. At the same time, we propose a Low-Rank Error Correction (LREC) method that exploits the added LoRA layers to narrow the gap between the quantized model and its floating-point counterpart. Our error correction framework leads to a fully functional INT2 quantized LLM with the capacity to generate coherent English text. To the best of our knowledge, this is the first INT2 Large Language Model that has been able to reach such performance. The overhead of our method is merely a 1.05 times increase in model size, which translates to an effective precision of INT2.1. Also, our method readily generalizes to other quantization standards, such as INT3, INT4, and INT8, restoring their lost performance, which marks a significant milestone in the field of model quantization. The strategies delineated in this paper hold promising implications for the future development and optimization of quantized models, marking a pivotal shift in the landscape of low-resource machine learning computations.
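The "INT2.1" figure follows from simple arithmetic: 2 bits per weight times the 1.05 size overhead gives 2.1 effective bits per weight. The sketch below works this through and, as an added assumption, estimates the storage for the weights alone of a 7-billion-parameter model. The function names are hypothetical, and activations, optimizer state, and runtime buffers are not counted.

```python
def effective_bits(base_bits, size_overhead):
    """Average bits stored per weight once the size overhead is included."""
    return base_bits * size_overhead


def weight_memory_gb(num_params, bits_per_weight):
    """Storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


bits = effective_bits(2, 1.05)        # INT2 plus the 1.05x overhead
memory = weight_memory_gb(7e9, bits)  # assumed 7e9 parameters
print(f"{bits:.2f} bits per weight, about {memory:.1f} GB of weights")
```

Under these assumptions the weights occupy roughly 1.8 GB, compared with 14 GB for the same parameter count stored at 16 bits per weight.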