Large Language Model
OpenAI Finds Machine Learning Efficiency Is Outpacing Moore's Law
Eight years ago a machine learning algorithm learned to identify a cat--and it stunned the world. A few years later AI could accurately translate languages and take down world champion Go players. Now, machine learning has begun to excel at complex multiplayer video games like Starcraft and Dota 2 and subtle games like poker. AI, it would appear, is improving fast. But how fast is fast, and what's driving the pace?
Pre-Training A Neural Language Model Improves the Sample Efficiency of an Emergency Room Classification Model
Xu, Binbin (University of Bordeaux) | Gil-Jardiné, Cédric (University Hospital of Bordeaux) | Thiessard, Frantz (Université de Bordeaux) | Tellier, Eric (University Hospital of Bordeaux) | Avalos-Fernandez, Marta (Université de Bordeaux) | Lagarde, Emmanuel (Université de Bordeaux)
To build a French national electronic injury surveillance system based on emergency room visits, we aim to develop a coding system that classifies the causes of those visits from free-text clinical notes. Supervised learning techniques have shown good results in this area but require a large expert-annotated dataset, which is time-consuming and costly to obtain. We hypothesize that the Natural Language Processing Transformer model incorporating a generative self-supervised pre-training step can significantly reduce the required number of annotated samples for supervised fine-tuning. In this preliminary study, we test our hypothesis on the simplified problem of predicting, from free-text clinical notes, whether or not a visit is the consequence of a traumatic event. Using fully re-trained GPT-2 models (without OpenAI pre-trained weights), we assess the gain of applying a self-supervised pre-training phase with unlabeled notes prior to the supervised learning task. Results show that the amount of data required to achieve a given level of performance (AUC>0.95) was reduced by a factor of 10 when applying pre-training. Namely, for 16 times more data, the fully-supervised model achieved an improvement <1% in AUC. To conclude, it is possible to adapt a multi-purpose neural language model such as GPT-2 to create a powerful tool for classification of free-text notes with only a small number of labeled samples.
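The performance threshold in this abstract is stated as AUC, the area under the ROC curve. As a minimal sketch of what that metric measures, the snippet below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = trauma-related visit, 0 = not trauma-related.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.5, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; a rank-based computation would be the usual choice for a large evaluation set.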
EmpTransfo: A Multi-Head Transformer Architecture for Creating Empathetic Dialog Systems
Zandie, Rohola (University of Denver) | Mahoor, Mohammad H. (University of Denver)
Understanding emotions and responding accordingly is one of the biggest challenges of dialog systems. In this paper, we present EmpTransfo, a multi-head Transformer architecture for creating an empathetic dialog system. We show that utilizing the history of emotions and other metadata can improve the quality of generated conversations by the dialog system. EmpTransfo utilizes state-of-the-art pre-trained models (e.g., OpenAI-GPT) for language generation, though models with different sizes can be used. Our experimental results using a challenging language corpus show that the proposed approach outperforms other models in terms of Hit@1 and PPL.
'Dangerous' AI generates words that don't exist
A new AI has been created to generate words that do not exist. The one-shot website develops new, artificially-generated definitions for the non-existent words. ThisWordDoesNotExist.com generates new words such as "wacamole" (a single serving of waffle batter made with a sweet cornmeal mixture), "pileset" (form a mass of, or make a shape about, something), or "prayman" (the principal or leading men in a society or enterprise). Users click a button on the site, and a new word is made. The website was developed by San Francisco-based developer Thomas Dimson, an engineer who used to work for the Facebook-owned Instagram developing its recommendations algorithm.
IBM claims its Neural Computer achieves record AI model training time
In a technical paper quietly released earlier this year, IBM detailed what it calls the IBM Neural Computer, a custom-designed, reconfigurable parallel processing system designed to research and develop emerging AI algorithms and computational neuroscience. This week, the company published a preprint describing the first application demonstrated on the Neural Computer: a deep "neuroevolution" system that combines the hardware implementation of an Atari 2600, image preprocessing, and AI algorithms in an optimized pipeline. The coauthors report results competitive with state-of-the-art techniques, but perhaps more significantly, they claim that the system achieves a record training time of 1.2 million image frames per second. The Neural Computer represents something of a shot across the bow in the AI computational arms race. According to an analysis recently released by OpenAI, from 2012 to 2018, the amount of compute used in the largest AI training runs grew more than 300,000 times with a 3.5-month doubling time, far exceeding the pace of Moore's law. Video games are a well-established platform for AI and machine learning research.
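The growth figures quoted above can be checked with a little arithmetic. The sketch below takes the 300,000-fold increase and the 3.5-month doubling time as given in the text, works out how many doublings and how much elapsed time they imply, and compares that with growth under a 24-month doubling period. Treating Moore's law as a 24-month doubling is an assumption made here for illustration, and the variable names are hypothetical.

```python
import math

growth_factor = 300_000        # compute growth quoted in the text
doubling_months = 3.5          # doubling time quoted in the text
moore_doubling_months = 24.0   # assumption: Moore's law as a 2-year doubling

# Number of doublings needed to reach the quoted growth factor.
doublings = math.log2(growth_factor)

# Time those doublings take at the quoted doubling rate.
elapsed_months = doublings * doubling_months

# Growth over the same span if compute doubled only every 24 months.
moore_growth = 2 ** (elapsed_months / moore_doubling_months)

print(f"doublings needed: {doublings:.1f}")
print(f"elapsed time: {elapsed_months:.0f} months ({elapsed_months / 12:.1f} years)")
print(f"growth at a 24-month doubling over the same span: {moore_growth:.1f}x")
```

Under these assumptions the quoted figures are mutually consistent with a span of a little over five years, and a 24-month doubling over that span yields only single-digit growth.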
Specification gaming: the flip side of AI ingenuity
At first sight, these kinds of examples may seem amusing but less interesting, and irrelevant to deploying agents in the real world, where there are no simulator bugs. However, the underlying problem isn't the bug itself but a failure of abstraction that can be exploited by the agent. In the example above, the robot's task was misspecified because of incorrect assumptions about simulator physics. Analogously, a real-world traffic optimisation task might be misspecified by incorrectly assuming that the traffic routing infrastructure does not have software bugs or security vulnerabilities that a sufficiently clever agent could discover. Such assumptions need not be made explicitly; more likely, they are details that simply never occurred to the designer.
Artificial Intelligence and music creation: What is OpenAI's Jukebox? - Purple Sneakers
The future is now, people. Not only do we have pandemic-proof rave suits being designed, we might also be on the precipice of having music released that was made with Artificial Intelligence, thanks to the latest development from OpenAI. Aptly titled 'Jukebox', the new model is able to generate genre-specific music. According to OpenAI's website, Jukebox is "a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles." Trained on a dataset of over 1.6 million songs, Jukebox is able to take a song provided as input and generate a sample produced from scratch in specific genres as output.
Fake News' Foe: Machine Learning and Twilio - DZone AI
Fake news has become a huge issue in our digitally-connected world and it is no longer limited to little squabbles -- fake news spreads like wildfire and is impacting millions of people every day. How do you deal with such a sensitive issue? Countless articles are being churned out every day on the internet -- how do you tell real from fake? It's not as easy as turning to a simple fact-checker which is typically built on a story-by-story basis. In this series, we will see two approaches to predict if a given article is fake or not.
DeepMind compares the way children and AI explore
In a preprint paper, researchers at Alphabet's DeepMind and the University of California, Berkeley propose a framework for comparing the ways children and AI learn about the world. The work, which was motivated by research suggesting children's learning supports behaviors later in life, could help close the gap between AI and humans when it comes to acquiring new abilities. For instance, it might lead to robots that can pick and pack millions of different kinds of products while avoiding various obstacles. Exploration is a key feature of human behavior, and recent evidence suggests children explore their surroundings more often than adults. This is thought to translate to more learning that enables powerful, abstract task generalization -- a type of generalization AI agents could tangibly benefit from.
How Microsoft, OpenAI, and OECD are putting AI ethics principles into practice
Microsoft's AI ethics committee helped craft internal Department of Defense contract policy, and G20 member nations wouldn't have passed AI ethics principles if it weren't for Japanese leadership. Published Tuesday, the UC Berkeley Center for Long-Term Cybersecurity (CLTC) case study examines how organizations are putting AI ethics principles into practice. Ethics principles are often vaguely phrased rules that can be challenging to translate into the daily practices of an engineer or other frontline worker. CLTC research fellow Jessica Cussins Newman told VentureBeat that many AI ethics and governance debates have focused more on what is needed, but less on the practices and policies necessary to implement goals enshrined in principles. The study focuses on OpenAI's rollout of GPT-2; the adoption of AI principles by OECD and G20; and the creation of the AI, Ethics, and Effects in Engineering and Research (AETHER) committee at Microsoft.