Large Language Model
Writing With Artificial Intelligence With Andrew Mayne
What is GPT-3 and how can writers use it responsibly as part of their creative process? How can we approach AI tools with curiosity, rather than fear? In the intro, I mention the discussion about whether Google's language model, LaMDA, could be sentient [The Verge]; and the Alliance of Independent Authors Ethical Usage of AI tools. If you'd like to know more about using AI for writing, images, marketing, voice, translation, and more, check out my course, The AI-Assisted Author. Andrew Mayne is the multi-award-nominated and internationally best-selling author of thrillers. He invented an underwater stealth suit for shark diving, and he works with OpenAI as a science communicator. He also has books for authors, including 'How to Write a Novella in 24 hours,' and co-hosts the podcast 'Weird Things.' You can find Andrew at www.AndrewMayne.com. You can find GPT-3 on OpenAI.com. There are many tools built on top of GPT-3. I use and recommend Sudowrite, for fiction in particular. Joanna: Andrew Mayne is the multi-award-nominated and internationally best-selling author of thrillers. He invented an underwater stealth suit for shark diving, and he works with OpenAI as a science communicator. He also has books for authors, including 'How to Write a Novella in 24 hours,' and co-hosts the podcast 'Weird Things.' Andrew: Hey, thank you for having me. Joanna: Oh, you do so many things. But we are actually going to talk about AI today. Andrew: Well, ever since I was a little boy, I was really interested in science, and entertainment, and everything in between. And I loved robots when I was a kid. And I'd build robots from science fairs and stuff, and I would use coffee cans, and little motors and things I pulled from toys to do that.
RealTime QA: What's the Answer Right Now?
Kasai, Jungo, Sakaguchi, Keisuke, Takahashi, Yoichi, Bras, Ronan Le, Asai, Akari, Yu, Xinyan, Radev, Dragomir, Smith, Noah A., Choi, Yejin, Inui, Kentaro
We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). RealTime QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open domain QA datasets and pursues instantaneous applications. We build strong baseline models upon large pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing effort, and this preliminary report presents real-time evaluation results over the past month. Our experimental results show that GPT-3 can often properly update its generation results based on newly retrieved documents, highlighting the importance of up-to-date information retrieval. Nonetheless, we find that GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer. This suggests an important avenue for future research: can an open domain QA system identify such unanswerable cases and communicate with the user or even the retrieval module to modify the retrieval results? We hope that RealTime QA will spur progress in instantaneous applications of question answering and beyond.
Genshin Impact! Fine-tuning CLIP for anime search
Today let's build an anime search system. We will use text as our query and get images as results. Searching images by text usually requires manually annotating each image with tags, an approach often referred to as TBIR (Text/Tag-based Image Retrieval). For this example, we will use OpenAI CLIP instead. CLIP is a powerful embedding model that outputs the similarity between text and images.
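As a minimal sketch of the retrieval step, assume the text query and each image have already been embedded by CLIP into the same vector space. The three-dimensional vectors and file names below are made-up placeholders, not real CLIP outputs, and the helper names are hypothetical. Ranking images for a query then reduces to sorting by cosine similarity:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_embedding, image_embeddings, top_k=2):
    """Return the names of the top_k images most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Placeholder embeddings standing in for CLIP image features (assumption).
image_embeddings = {
    "girl_with_sword.png": [0.9, 0.1, 0.0],
    "castle_at_night.png": [0.1, 0.9, 0.2],
    "dragon_in_sky.png": [0.0, 0.2, 0.9],
}

# Placeholder embedding standing in for the CLIP text feature of a query.
query_embedding = [0.8, 0.2, 0.1]

results = search(query_embedding, image_embeddings, top_k=2)
print(results)
```

In a real system the placeholder vectors would come from CLIP's text and image encoders, and no manual tags would be needed because the query and the images are compared directly in embedding space.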
Sourceless presents the first Cognitive Web
Formwelt, OpenAI Codex, GitHub Copilot and other Artificial Intelligence projects will make the SourceLess Platform usable by absolutely anyone, being able to create anything just by using words (written or spoken). For example, by using the Formwelt language, anyone, regardless of nationality, can communicate in a direct and semantically correct way with OpenAI Codex and create anything in the digital world; you can create a complete and complex website in less than an hour. All these AI systems will be implemented inside the SourceLess Platform, so everyone can have access to all the facilities of the new Web through a single domain (e.g., str.domain). Education, Technology & Innovation -- these three pillars of the future are the foundations of the SourceLess Platform. The purpose of education in the SourceLess project is to transmit knowledge or foster skills and character traits. These aims may include the development of understanding, rationality, kindness, and honesty.
Working together with YouTube
Helping enrich people's lives with our research, we've partnered with businesses across Alphabet to apply our technology towards improving the products and services used by billions of people every day. One of our key partners is YouTube, who are on a mission to give everyone a voice and show them the world. Working together with YouTube's product and engineering teams, we've helped optimise the decision-making processes that increase safety, decrease latency, and enhance the viewer, creator, and advertiser experience for all. With video surging during the COVID-19 pandemic, and the total amount of internet traffic expected to grow in the future, video compression is an increasingly important problem. Working together with YouTube, we explored the potential for our AI model, MuZero, to improve the VP9 codec, a coding format that helps compress and transmit video over the internet.
Natural Language Processing: The Technology That's Biased
Natural Language Processing (NLP) refers to building machines that can understand and respond to voice data with their own text and speech. Natural Language Processing falls under the umbrella of Artificial Intelligence (AI), and recent models like Bidirectional Encoder Representations from Transformers (BERT), Generative Pre-trained Transformer 3 (GPT-3) and the Pathways Language Model (PaLM) have made accurate human-machine communication possible. These Large Language Models (LLMs) have billions of parameters, are trained on massive volumes of text, and are able to understand and answer reading comprehension questions as well as generate new text such as a summary. Put simply, LLMs are trained to predict the next words in a sentence, much like an extended version of the autocomplete feature in messaging applications. But they can do much more, for example question answering, translation, image captioning, human-level dialogue, entity linking, or even data cleaning (for mixes of structured and unstructured data). NLP is already being used to automate some human tasks (robotic process automation, RPA); however, with the breathtaking advances of the last 3 years, NLP opens new potential for businesses to digitize company knowledge and disrupt incumbent business models.
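To make the "predict the next word" idea concrete, here is a toy sketch that counts which word follows which in a tiny made-up corpus and then suggests the most frequent follower. This is only an illustration of the training objective: real LLMs use neural networks over subword tokens rather than word counts, and the corpus and function names here are assumptions for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Tiny made-up corpus (assumption); punctuation is treated as a word.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often here
print(predict_next(model, "sat"))  # "on" is the only follower of "sat"
```

Autocomplete in a messaging application works on a similar principle; an LLM replaces the lookup table with a learned model that conditions on far more context than a single preceding word.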
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Shah, Dhruv, Osinski, Blazej, Ichter, Brian, Levine, Sergey
Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings. However, particularly in vision-based settings where specifying goals requires an image, this makes for an unnatural interface. Language provides a more convenient modality for communication with robots, but contemporary methods typically require expensive supervision, in the form of trajectories annotated with language descriptions. We present a system, LM-Nav, for robotic navigation that enjoys the benefits of training on unannotated large datasets of trajectories, while still providing a high-level interface to the user. Instead of utilizing a labeled instruction following dataset, we show that such a system can be constructed entirely out of pre-trained models for navigation (ViNG), image-language association (CLIP), and language modeling (GPT-3), without requiring any fine-tuning or language-annotated robot data. We instantiate LM-Nav on a real-world mobile robot and demonstrate long-horizon navigation through complex, outdoor environments from natural language instructions. For videos of our experiments, code release, and an interactive Colab notebook that runs in your browser, please check out our project page https://sites.google.com/view/lmnav
On Artificial General Intelligence, AI Sentience, And Large Language Models
Many forms of intelligence exist. Octopuses are highly intelligent--and completely unlike humans. In case you haven't noticed, artificial intelligence systems have been behaving in increasingly astonishing ways lately. OpenAI's new model DALL-E 2, for instance, can produce captivating original images based on simple text prompts. Models like DALL-E are making it harder to dismiss the notion that AI is capable of creativity. Consider, for instance, DALL-E's imaginative rendition of "a hip-hop cow in a denim jacket recording a hit single in the studio."
Dive into Big Model Training
The increasing scale of model size and continuous improvement of performance herald the arrival of the Big Model era. In this report, we explore what big model training is and how it works by diving into training objectives and training methodologies. Specifically, training objectives describe how to leverage web-scale data to develop extremely capable and incredibly large models based on self-supervised learning, and training methodologies, which are based on distributed training, describe how to make big model training a reality. We summarize the existing training methodologies into three main categories: training parallelism, memory-saving technologies, and model sparsity design. Training parallelism can be categorized into data, pipeline, and tensor parallelism according to the dimension along which parallelism takes place. Memory-saving technologies are orthogonal and complementary to training parallelism. And model sparsity design further scales up the model size with a constant computational cost. A continuously updated paper list of big model training is provided at https://github.com/qhliu26/BM-Training.
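As an illustration of data parallelism, the first of the three parallelism dimensions named above, the following sketch simulates workers in a single process. Each simulated worker computes a gradient on its own equal-sized shard of the batch, and averaging the per-worker gradients reproduces the full-batch gradient. The one-parameter linear model, the data, and the function names are assumptions chosen for the example, not taken from the report.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the model y ~ w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, num_workers):
    """Split the batch into equal shards, compute one gradient per
    simulated worker, then average the results."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard_size = len(batch) // num_workers
    shards = [
        batch[i * shard_size:(i + 1) * shard_size]
        for i in range(num_workers)
    ]
    # Each worker holds a full copy of the parameter w and sees one shard.
    local_gradients = [gradient(w, shard) for shard in shards]
    # The averaging step plays the role of an all-reduce across workers.
    return sum(local_gradients) / num_workers


# Made-up data generated by y = 2 * x (assumption).
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 1.0

full_gradient = gradient(w, batch)
parallel_gradient = data_parallel_gradient(w, batch, num_workers=2)
print(full_gradient, parallel_gradient)
```

The two values agree because the shards are the same size, so the mean of the shard means equals the mean over the whole batch. Pipeline and tensor parallelism instead split the model itself, across layers and within layers respectively.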
A Hazard Analysis Framework for Code Synthesis Large Language Models
Khlaaf, Heidy, Mishkin, Pamela, Achiam, Joshua, Krueger, Gretchen, Brundage, Miles
Codex, a large language model (LLM) trained on a variety of codebases, exceeds the previous state of the art in its capacity to synthesize and generate code. Although Codex provides a plethora of benefits, models that may generate code at such scale have significant limitations, alignment problems, the potential to be misused, and the possibility of increasing the rate of progress in technical fields that may themselves have destabilizing impacts or misuse potential. Yet such safety impacts are not yet known or remain to be explored. In this paper, we outline a hazard analysis framework constructed at OpenAI to uncover hazards or safety risks that the deployment of models like Codex may impose technically, socially, politically, and economically. The analysis is informed by a novel evaluation framework that determines the capacity of advanced code generation techniques against the complexity and expressivity of specification prompts, and their capability to understand and execute them relative to human ability.