 Large Language Model


What to (not) expect from OpenAI's ChatGPT – TechTalks

#artificialintelligence

This article is part of our coverage of the latest in AI research. This week, OpenAI released ChatGPT, another fascinating large language model (LLM) based on its flagship GPT series. ChatGPT, which is available as a free demo at the time of this writing, is a model that has been specialized for conversational interactions. As with most things regarding LLMs, the release of ChatGPT was followed by controversy. Within hours, the new language model became a Twitter sensation, with users posting screenshots of ChatGPT's impressive achievements and disastrous failures. However, when looked at from the broad perspective of large language models, ChatGPT is a reflection of the short but rich history of the field, representing how far we have come in just a few years and what fundamental problems remain to be solved.


New powerful AI bot creates angst among users: Are robots ready to take our jobs?

FOX News

'The Five' co-hosts discuss new AI bot ChatGPT and the impact artificial intelligence will have on future jobs. Fox News' Jesse Watters offered reassurance Wednesday on "The Five" that a war against machines is not imminent and killer robots haven't taken over quite yet. A new artificial intelligence (AI) bot, ChatGPT, caused a stir on social media, writing essays, books, poems and even computer code upon request. "The Five" got in on the trend asking it to write a poem about the show. "They entertain and inform with their banter and charm and have viewers tune in day and night," the message read in part.


ChatGPT, artificial intelligence, and the future of education - Vox

#artificialintelligence

A few weeks ago, Wharton professor Ethan Mollick told his MBA students to play around with GPT, an artificial intelligence model, and see if the technology could write an essay based on one of the topics discussed in his course. The assignment was, admittedly, mostly a gimmick meant to illustrate the power of the technology. Still, the algorithmically generated essays -- although not perfect and a tad over-reliant on the passive voice -- were at least reasonable, Mollick recalled. They also passed another critical test: a screening by Turnitin, a popular anti-plagiarism software. AI, it seems, had suddenly gotten pretty good.


I asked ChatGPT to write a Harry Potter fan fiction, the result will blow your mind.

#artificialintelligence

As a Harry Potter fan and a lover of writing, I was curious to see what would happen if I asked ChatGPT (Generative Pretrained Transformer) to write a Harry Potter fan fiction. So, I fed ChatGPT a few prompts and let it do its magic. The result was a piece of fan fiction titled "The Lost Diadem of Ravenclaw", which follows the story of Harry, Ron, and Hermione as they embark on a quest to find the lost diadem of Ravenclaw. The diadem, which is said to enhance the intelligence of its wearer, has been missing for centuries and is believed to be hidden in the Forbidden Forest. As they journey through the forest, the trio encounters a number of obstacles and challenges, including an encounter with a pack of werewolves and a showdown with the infamous Death Eater Bellatrix Lestrange. Despite the challenges they face, Harry, Ron, and Hermione persevere and eventually find the lost diadem.


OpenAI's Amazing ChatGPT: Is It Promising for Niche Topics?

#artificialintelligence

OpenAI has recently released its latest artificial intelligence (AI) chatbot prototype, powered by a model from the GPT-3.5 series. You can ask it questions and it responds with detailed answers in a conversational way, almost as if you were talking to a human! ChatGPT is based on a model trained using Reinforcement Learning from Human Feedback, which allows it to simulate conversation, answer follow-up questions and even admit to mistakes. Even though OpenAI's ChatGPT has recently taken the internet by storm, is it as good as it seems when it comes to dealing with a niche topic?


Mixed-effects transformers for hierarchical adaptation

arXiv.org Artificial Intelligence

Language differs dramatically from context to context. To some degree, large language models like GPT-3 account for such variation by conditioning on strings of initial input text, or prompts. However, prompting can be ineffective when contexts are sparse, out-of-sample, or extra-textual. In this paper, we introduce the mixed-effects transformer (MET), a novel approach for learning hierarchically-structured prefixes (lightweight modules prepended to an input sequence) to account for structured variation in language use. Specifically, we show how the popular class of mixed-effects regression models may be extended to transformer-based architectures using a regularized prefix-tuning procedure with dropout. We evaluate this approach on several domain-adaptation benchmarks, finding that it learns contextual variation from minimal data.

Figure 1: In the mixed-effects transformer (MET), parameters of a pretrained transformer are frozen (solid border) while prefixes are adapted to different contextual features (dashed border).
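The abstract does not spell out how the prefixes are parameterized, so the following is only a minimal sketch of one way a mixed-effects prefix could be composed, assuming an additive split into a shared ("fixed-effect") prefix and a per-context ("random-effect") offset. The function names, the additive form, and the L2 penalty are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, List, Optional

Vector = List[float]


def compose_prefix(
    shared: List[Vector],
    offsets: Dict[str, List[Vector]],
    context: Optional[str],
) -> List[Vector]:
    """Return the shared prefix plus the offset for `context`, if one exists.

    Assumption: effects combine additively, as in a mixed-effects regression.
    Unseen or missing contexts fall back to the shared prefix alone.
    """
    offset = offsets.get(context) if context is not None else None
    if offset is None:
        return [list(vec) for vec in shared]
    return [
        [s + o for s, o in zip(shared_vec, offset_vec)]
        for shared_vec, offset_vec in zip(shared, offset)
    ]


def prepend_prefix(prefix: List[Vector], token_embeddings: List[Vector]) -> List[Vector]:
    """Prepend prefix vectors to the input sequence; the base model stays frozen."""
    return prefix + token_embeddings


def offset_penalty(offsets: Dict[str, List[Vector]], weight: float) -> float:
    """Hypothetical L2 regularizer that shrinks per-context offsets toward zero."""
    return weight * sum(
        x * x for offset in offsets.values() for vec in offset for x in vec
    )


# Toy values: one prefix position, embedding size 2, one known context.
shared_prefix = [[1.0, 2.0]]
context_offsets = {"reviews": [[0.5, -1.0]]}

known = compose_prefix(shared_prefix, context_offsets, "reviews")
unseen = compose_prefix(shared_prefix, context_offsets, "recipes")
sequence = prepend_prefix(known, [[0.0, 0.0], [9.0, 9.0]])
```

In this sketch an unseen context simply reuses the shared prefix, which is one plausible reading of why a hierarchical structure helps when contexts are sparse or out-of-sample.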


OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

arXiv.org Artificial Intelligence

Generalist models, which are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model, have been explored recently. Although they are a promising route toward general-purpose AI, existing generalist models are still at an early stage, where modality and task coverage is limited. To empower multi-modal task-scaling and speed up this line of research, we release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction. At the core of OFASys is the idea of decoupling multi-modal task representations from the underlying model implementations. In OFASys, a task involving multiple modalities can be defined declaratively even with just a single line of code. The system automatically generates task plans from such instructions for training and inference. It also facilitates multi-task training for diverse multi-modal workloads. As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data. The single OFA+ model achieves 95% performance on average with only 16% of the parameters of 15 task-finetuned models, showcasing the performance reliability of multi-modal task-scaling provided by OFASys. Available at https://github.com/OFA-Sys/OFASys


Structured Like a Language Model: Analysing AI as an Automated Subject

arXiv.org Artificial Intelligence

Drawing from the resources of psychoanalysis and critical media studies, in this paper we develop an analysis of Large Language Models (LLMs) as automated subjects. We argue the intentional fictional projection of subjectivity onto LLMs can yield an alternate frame through which AI behaviour, including its productions of bias and harm, can be analysed. First, we introduce language models, discuss their significance and risks, and outline our case for interpreting model design and outputs with support from psychoanalytic concepts. We trace a brief history of language models, culminating with the releases, in 2022, of systems that realise state-of-the-art natural language processing performance. We engage with one such system, OpenAI's InstructGPT, as a case study, detailing the layers of its construction and conducting exploratory and semi-structured interviews with chatbots. These interviews probe the model's moral imperatives to be helpful, truthful and harmless by design. The model acts, we argue, as the condensation of often competing social desires, articulated through the internet and harvested into training data, which must then be regulated and repressed. This foundational structure can however be redirected via prompting, so that the model comes to identify with, and transfer, its commitments to the immediate human subject before it. In turn, these automated productions of language can lead to the human subject projecting agency upon the model, effecting occasionally further forms of countertransference. We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.


Explain to me like I am five -- Sentence Simplification Using Transformers

arXiv.org Artificial Intelligence

Sentence simplification aims at making the structure of text easier to read and understand while maintaining its original meaning. This can be helpful for people with disabilities, new language learners, or those with low literacy. Simplification often involves removing difficult words and rephrasing the sentence. Previous research has focused on tackling this task by either using external linguistic databases for simplification or by using control tokens for desired fine-tuning of sentences. However, in this paper we purely use pre-trained transformer models. We experiment with a combination of GPT-2 and BERT models, achieving the best SARI score of 46.80 on the Mechanical Turk dataset, which is significantly better than previous state-of-the-art results. The code can be found at https://github.com/amanbasu/sentence-simplification.


Implicit causality in GPT-2: a case study

arXiv.org Artificial Intelligence

This case study investigates the extent to which a language model (GPT-2) is able to capture native speakers' intuitions about implicit causality in a sentence completion task. We first reproduce earlier results (showing lower surprisal values for pronouns that are congruent with either the subject or object, depending on which one corresponds to the implicit causality bias of the verb), and then examine the effects of gender and verb frequency on model performance. Our second study examines the reasoning ability of GPT-2: is the model able to produce more sensible motivations for why the subject VERBed the object if the verbs have stronger causality biases? We also developed a methodology to avoid human raters being biased by obscenities and disfluencies generated by the model.
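The surprisal comparison mentioned above can be illustrated with a short sketch. Surprisal is the negative log probability a model assigns to a token; base 2 is assumed here, and the probabilities are made-up values for illustration, not GPT-2 output.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal, in bits, of a token the model assigns `probability` to."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# Hypothetical next-token probabilities after a prompt such as
# "Mary frightened John because ..." (illustrative numbers only).
hypothetical_probs = {"she": 0.5, "he": 0.125}

surprisals = {
    token: surprisal_bits(p) for token, p in hypothetical_probs.items()
}

# The pronoun with lower surprisal is the continuation the model prefers.
preferred = min(surprisals, key=surprisals.get)
```

With these toy numbers the subject-congruent pronoun gets the lower surprisal, which is the pattern the study tests for verbs with a subject bias.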