 Large Language Model


Cluelessly Clueless AI

#artificialintelligence

Douglas Hofstadter, a cognitive scientist, recently wrote in the Economist that he believes that GPT-3 is "cluelessly clueless." By this he means that GPT-3 has no idea about what it is saying. To illustrate, he and a colleague asked it a few questions. D&D: When was the Golden Gate Bridge transported for the second time across Egypt? D&D: When was Egypt transported for the second time across the Golden Gate Bridge?


OpenAI's New AI Learned to Play Minecraft by Watching 70,000 Hours of YouTube

#artificialintelligence

In 2020, OpenAI's machine learning algorithm GPT-3 blew people away when, after ingesting billions of words scraped from the internet, it began spitting out well-crafted sentences. This year, DALL-E 2, a cousin of GPT-3 trained on text and images, caused a similar stir online when it began whipping up surreal images of astronauts riding horses and, more recently, crafting weird, photorealistic faces of people that don't exist. Now, the company says its latest AI has learned to play Minecraft after watching some 70,000 hours of video showing people playing the game on YouTube. Compared to numerous prior Minecraft algorithms which operate in much simpler "sandbox" versions of the game, the new AI plays in the same environment as humans, using standard keyboard-and-mouse commands. In a blog post and preprint detailing the work, the OpenAI team say that, out of the box, the algorithm learned basic skills, like chopping down trees, making planks, and building crafting tables.


Cerebras sets record for largest AI model on a single chip

#artificialintelligence

In brief: US hardware startup Cerebras claims to have trained the largest AI model ever handled by a single device, a system powered by its Wafer Scale Engine 2, the world's largest chip, which is the size of a plate. "Using the Cerebras Software Platform (CSoft), our customers can easily train state-of-the-art GPT language models (such as GPT-3 and GPT-J) with up to 20 billion parameters on a single CS-2 system," the company claimed this week. "Running on a single CS-2, these models take minutes to set up and users can quickly move between models with just a few keystrokes." The CS-2 packs a whopping 850,000 cores and has 40GB of on-chip memory capable of reaching 20 PB/sec memory bandwidth. The specs of other AI accelerators and GPUs pale in comparison, meaning machine learning engineers using them have to spread the training of huge AI models with billions of parameters across more servers.


Does this AI know it's alive?

#artificialintelligence

We don't have much reason to think that they have an internal monologue, the kind of sense perception humans have, or an awareness that they're a being in the world. Over the weekend, the Washington Post's Nitasha Tiku published a profile of Blake Lemoine, a software engineer assigned to work on the Language Model for Dialogue Applications (LaMDA) project at Google. LaMDA is a chatbot AI, and an example of what machine learning researchers call a "large language model," or even a "foundation model." It's similar to OpenAI's famous GPT-3 system, and has been trained on literally trillions of words compiled from online posts to recognize and reproduce patterns in human language. LaMDA is a really good large language model.



Word Embeddings: CBOW and Skip Gram

#artificialintelligence

Since the advent of transformers, NLP has gained a lot of traction, and a wide variety of tasks are already solved by GPT-3 and other big transformer-based models. But today we are going to take a step back and learn about word embeddings. In this blog, we are primarily going to look at CBOW (Continuous Bag of Words) and Skip-gram. These embeddings are essential for converting text into numbers. So, without further ado, let's dive into the basics of NLP.
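To make the difference between the two approaches concrete, here is a minimal sketch of how training examples are usually framed for each one. It only builds the (input, target) pairs from a sliding window and does not train any embeddings. The function names, the `window` parameter, and the sample sentence are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def context_windows(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Return (context_words, center_word) for every position in the sentence."""
    examples = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` words before
        right = tokens[i + 1:i + 1 + window]     # up to `window` words after
        examples.append((left + right, center))
    return examples


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW framing: the whole context is the input, the center word is the target."""
    return context_windows(tokens, window)


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram framing: the center word is the input, each context word is a target."""
    return [
        (center, ctx)
        for context, center in context_windows(tokens, window)
        for ctx in context
    ]


# Hypothetical sample sentence, split on whitespace for simplicity.
tokens = "the quick brown fox jumps".split()

if __name__ == "__main__":
    for context, center in cbow_examples(tokens, window=1):
        print("CBOW      ", context, "->", center)
    for center, ctx in skipgram_examples(tokens, window=1):
        print("Skip-gram ", center, "->", ctx)
```

In this framing CBOW produces one example per word, while Skip-gram produces one example per (center, context) pair, so the same sentence yields more Skip-gram examples.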


RadBERT: Adapting Transformer-based Language Models to Radiology

#artificialintelligence

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. To investigate if tailoring a transformer-based language model to radiology is beneficial for radiology natural language processing (NLP) applications. This retrospective study presents RadBERT, a family of bidirectional encoder representations from transformers-based language models adapted for radiology.


Machine Learning With Google Cloud - AI Summary

#artificialintelligence

From its DeepMind project AlphaGo beating Go champions at their own game, to the recent Magenta and Springboard announcements, not to mention driverless cars, it's clear that AI and machine learning are central to Google's strategy across its vast portfolio. In a recent interview with Hollywood Reporter, Alphabet chairman Eric Schmidt played down the fears that surround advancements in AI: "To be clear, we're not talking about consciousness, we're not talking about souls, we're not talking about independent creativity." However, being acutely aware of the concerns around intelligent technology, the company's AI research division Google Brain recently published an AI Precision Safety whitepaper.

Powerful Infrastructure

Underpinning all of these projects, as well as the company's flagship Search, Translate and YouTube products, is Google Cloud Platform, which provides developers with the tools to build a range of programs, from simple websites to complex, intelligent applications. As part of our AI in Business Festival, we spoke to Miles Ward, Global Head of Solutions at Google Cloud Platform, to find out more about the machine learning tools they offer to developers.


Google proposes new method to derive analytical expressions for terms in quantum mechanics…

#artificialintelligence

It's no news that Alphabet invests heavily in ML applications to science, through channels such as Google Research and DeepMind. While AlphaFold is by far its most famous project in chemistry and biology, DeepMind has also ventured into quantum mechanical (QM) calculations, and so has Google Research. QM calculations are very important in chemistry because they provide the highest level of detail about electron densities, distributions, and spin states in molecules and materials: the key elements required to model, understand, and predict chemical reactivity and physicochemical properties, none of which can be approached with classical methods. The new work I comment on here comes from Google Research and also addresses ways to improve QM calculations. Specifically, Ma et al. developed a new method to derive symbolic, analytical forms of DFT functionals.


Text Generation using GPT-J with Hugging Face 🤗 and Segmind

#artificialintelligence

Text generation is the task of automatically generating text using a machine learning system. A good text generation system can make it really hard to distinguish between human and machine-written text pieces.
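As a toy illustration of the generate-one-token-at-a-time loop behind text generation, the sketch below swaps the neural network for a simple bigram count table. It is not GPT-J and does not use Hugging Face or Segmind; the corpus, function names, and greedy tie-breaking rule are all assumptions made for this example, and the prompt is assumed to be non-empty.

```python
from collections import Counter, defaultdict
from typing import Dict


def train_bigrams(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: Dict[str, Counter], prompt: str, max_new_tokens: int = 5) -> str:
    """Greedy decoding: repeatedly append the most frequent follower of the last word."""
    out = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word never had a follower in the training text
        # Most frequent follower; ties are broken alphabetically so output is deterministic.
        next_word = min(followers, key=lambda w: (-followers[w], w))
        out.append(next_word)
    return " ".join(out)


# Hypothetical toy corpus standing in for real training data.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

if __name__ == "__main__":
    print(generate(model, "the", max_new_tokens=4))
```

A large language model follows the same outer loop, but replaces the count table with a learned distribution over the next token and usually samples from it instead of always taking the top choice.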