Generative AI
What Jobs Will OpenAI's New GPT-3 Disrupt First - TectoGizmo
How Does a Neural Network Work? A neural net is little more than a collection of logical units, each with an input weight and a weight per connector. If the input weight multiplied by the connector weight exceeds a defined threshold value, the unit fires, which triggers the unit to its right with a new input value. In the graph on the right, because not all threshold values are exceeded, the tree shrinks until only a couple of output units, or even just one, have fired. This is the basis of how it learns.
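The firing rule described above can be sketched in a few lines of Python. This is only an illustration of the simplified threshold rule in the snippet, not of how modern networks are trained; the function names `unit_fires` and `propagate`, and every weight and threshold value, are assumptions made for the example.

```python
def unit_fires(input_value, connector_weight, threshold):
    """Return True when the weighted input exceeds the threshold."""
    return input_value * connector_weight > threshold


def propagate(input_value, connector_weights, threshold):
    """Pass a value along a chain of units, left to right.

    Each unit that fires hands its weighted value to the next unit;
    propagation stops at the first unit that does not fire.
    Returns the number of units that fired.
    """
    fired = 0
    value = input_value
    for weight in connector_weights:
        if not unit_fires(value, weight, threshold):
            break
        value = value * weight  # new input value for the unit to the right
        fired += 1
    return fired
```

Calling `propagate(1.0, [2.0, 0.5, 0.1], 0.5)` walks the chain until the third unit, whose weighted input falls below the threshold, so only the first two units fire.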
OpenAI proposes using reciprocity to encourage AI agents to work together
Many real-world problems require complex coordination between multiple agents -- e.g., people or algorithms. A machine learning technique called multi-agent reinforcement learning (MARL) has shown success here, mainly in two-team games like Go, Dota 2, StarCraft, hide-and-seek, and capture the flag. But the human world is far messier than games. That's because humans face social dilemmas at multiple scales, from the interpersonal to the international, and they must decide not only how to cooperate but when to cooperate. To address this challenge, researchers at OpenAI propose training AI agents with what they call randomized uncertain social preferences (RUSP), an augmentation that expands the distribution of environments in which reinforcement learning agents train.
Microsoft and OpenAI propose automating U.S. tech export controls
Microsoft and OpenAI, the AI research lab in which Microsoft has invested over $1 billion, today submitted a document to the U.S. government describing how a "digitally transformed" export controls system might work and the benefits it could provide. The organizations suggest that their proposed solutions could bring commercial benefits to users, as well as a more powerful, dynamic, and targeted method for controlling U.S. exports of fundamental technologies. Following a mandate in the Export Control Reform Act of 2018, the U.S. Department of Commerce's Bureau of Industry and Security (BIS) undertook efforts to identify and control exports of "emerging" or "foundational" technologies ostensibly vital to national security. In comment periods ending in January 2019 and earlier this week, BIS solicited comment from the industry on how to identify and approach control of these technologies. Microsoft and OpenAI take issue with the restrictions promulgated via traditional export control approaches.
AI is wrestling with a replication crisis
In practice, few studies are fully replicated because most researchers are more interested in producing new results than reproducing old ones. But in fields like biology and physics--and computer science overall--researchers are typically expected to provide the information needed to rerun experiments, even if those reruns are rare. AI is feeling the heat for several reasons. For a start, it is a newcomer. It has only really become an experimental science in the past decade, says Joelle Pineau, a computer scientist at Facebook AI Research and McGill University, who coauthored the complaint.
AI Jukebox creates 'deepfake' songs, imitating dead pop stars
Artificial intelligence (AI) is being used to create new 'deepfake' pop songs that sound like they're being performed by dead musicians, including Elvis Presley, Frank Sinatra, David Bowie and Michael Jackson. Jukebox, created by California-based company OpenAI, is a neural network that generates eerie approximations of pop songs in the style of multiple artists. The neural network generates music, including rudimentary singing complete with lyrics in English and a variety of instruments like guitar and piano. OpenAI has created an expansive library of new tracks, imitating a diverse selection of artists, including the Beatles, Nirvana, Katy Perry, Simon and Garfunkel, Stevie Wonder, Elton John and Ed Sheeran, as well as deceased heroes that almost appear to be brought back to life. Most of the samples have a bizarre, faraway quality to them, as if they're poorly produced demos from the 1950s that haven't seen the light of day until now.
JukeBox by OpenAI.
Not quite the imitation of existing performers or the interpretation of famous pieces -- but the discovery of hidden gems. The Uncanny Valley is passé. Indeed, the works are unique: every time, a new, never-before-heard piece of music is generated -- and you can be sure (as in the case of GPT-3) that this sequence will never be repeated. My first experiment gave me goosebumps. Already the 2nd level was something special, not really in a way of music pieces.
Reinforcement Learning with Augmented Data
Laskin, Michael, Lee, Kimin, Stooke, Adam, Pinto, Lerrel, Abbeel, Pieter, Srinivas, Aravind
Learning from visual observations is a fundamental yet challenging problem in Reinforcement Learning (RL). Although algorithmic advances combined with convolutional neural networks have proved to be a recipe for success, current methods are still lacking on two fronts: (a) data-efficiency of learning and (b) generalization to new environments. To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms. We perform the first extensive study of general data augmentations for RL on both pixel-based and state-based inputs, and introduce two new data augmentations - random translate and random amplitude scale. We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods across common benchmarks. RAD sets a new state-of-the-art in terms of data-efficiency and final performance on the DeepMind Control Suite benchmark for pixel-based control as well as OpenAI Gym benchmark for state-based control. We further demonstrate that RAD significantly improves test-time generalization over existing methods on several OpenAI ProcGen benchmarks.
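The two augmentations the abstract introduces can be illustrated with a short, dependency-free sketch. This is not the authors' implementation: the function names, the list-of-rows image format, and the default scaling range are all assumptions chosen for the example. It assumes that random translate places the observation at a random offset inside a larger zero-filled frame, and that random amplitude scale multiplies a state vector by a single uniformly sampled scalar.

```python
import random


def random_translate(image, out_size, rng=random):
    """Place an image at a random offset inside a zero-filled square frame.

    `image` is a list of rows (lists of pixel values); `out_size` is the
    side length of the larger output frame.
    """
    h, w = len(image), len(image[0])
    if out_size < h or out_size < w:
        raise ValueError("out_size must be at least as large as the image")
    top = rng.randint(0, out_size - h)    # inclusive on both ends
    left = rng.randint(0, out_size - w)
    frame = [[0] * out_size for _ in range(out_size)]
    for r in range(h):
        for c in range(w):
            frame[top + r][left + c] = image[r][c]
    return frame


def random_amplitude_scale(state, low=0.6, high=1.2, rng=random):
    """Multiply every entry of a state vector by one random scalar.

    The default range is an illustrative assumption, not a tuned value.
    """
    z = rng.uniform(low, high)
    return [z * x for x in state]
```

In a training loop, such functions would typically be applied to each sampled observation before it is passed to the RL algorithm, which is what makes the approach plug-and-play.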
Microsoft is granted exclusive rights to use OpenAI's GPT-3
Microsoft and OpenAI's close relationship has taken another leap forward with the former gaining exclusive GPT-3 access. GPT-3 has been the talk of the AI town in recent months. OpenAI's innovation can help to create convincing articles and the company once deemed it too dangerous to release in a world where misinformation and fake news are already problematic. OpenAI never made GPT-3 publicly available but instead provided access to a limited number of trusted researchers. Microsoft announced today that it now has the exclusive rights to leverage GPT-3's "technical innovations to develop and deliver advanced AI solutions for our customers, as well as create new solutions that harness the amazing power of advanced natural language generation."
Google, OpenAI & DeepMind: Shared Task Behaviour Priors Can Boost RL and Generalization
In recent years, researchers have deployed reinforcement learning (RL) agents to solve increasingly challenging problems. As the trend continues, so does the development of new methods that enable the injection of "priors" (prior knowledge) into agents to help them better understand the structure of the world and come up with more effective solution strategies. In a new paper, researchers from Google, OpenAI, and DeepMind introduce "behaviour priors," a framework designed to capture common movement and interaction patterns that are shared across a set of related tasks or contexts. The researchers discuss how such behaviour patterns can be captured using probabilistic trajectory models and how they can be integrated effectively into RL schemes, such as for facilitating multi-task and transfer learning. Their method for learning behaviour priors can lead to significant speedups on complex tasks, the researchers say.
PettingZoo: Gym for Multi-Agent Reinforcement Learning
Terry, Justin K., Black, Benjamin, Jayakumar, Mario, Hari, Ananth, Santos, Luis, Dieffendahl, Clemens, Williams, Niall L., Lokesh, Yashas, Sullivan, Ryan, Horsch, Caroline, Ravi, Praveen
This paper introduces PettingZoo, a library of diverse sets of multi-agent environments under a single elegant Python API. PettingZoo was developed with the goal of accelerating research in multi-agent reinforcement learning, by creating a set of benchmark environments easily accessible to all researchers and a standardized API for the field. This goal is inspired by what OpenAI's Gym library did for accelerating research in single-agent reinforcement learning, and PettingZoo draws heavily from Gym in terms of API and user experience. PettingZoo differs from other multi-agent environment libraries in that its API is based on the model of Agent Environment Cycle ("AEC") games, which allows for the sensible representation of all varieties of games under one API for the first time. While retaining a very simple and Gym-like API, PettingZoo still allows access to low-level environment properties required by nontraditional learning methods. Reinforcement Learning ("RL") considers learning a policy -- a function that takes in an observation from an environment and emits an action -- that achieves the maximum expected discounted reward when acting in an environment, and its capabilities have been one of the great successes of modern machine learning. Multi-Agent Reinforcement Learning (MARL) in particular has been behind many of the most publicized achievements of modern machine learning -- AlphaGo Zero (Silver et al., 2017), OpenAI Five (OpenAI, 2018), AlphaStar (Vinyals et al., 2019) -- and has seen a boom in recent years.
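The Agent Environment Cycle idea, in which agents act one at a time and the environment updates after each individual action, can be illustrated with a toy sketch. The class `ToyAECEnv`, its methods, and the parity-matching reward are hypothetical and written only for this example; they are not PettingZoo's actual API, which should be taken from the library's own documentation.

```python
class ToyAECEnv:
    """A toy turn-based environment in the spirit of the AEC model.

    Agents act one at a time in a fixed cycle; each step applies the
    action of the currently selected agent only.
    """

    def __init__(self, agents, max_turns):
        self.agents = list(agents)
        self.max_turns = max_turns
        self.reset()

    def reset(self):
        self.turn = 0
        self.rewards = {agent: 0.0 for agent in self.agents}

    @property
    def current_agent(self):
        return self.agents[self.turn % len(self.agents)]

    @property
    def done(self):
        return self.turn >= self.max_turns

    def observe(self, agent):
        # Every agent observes only the global turn counter here.
        return self.turn

    def step(self, action):
        # Reward the acting agent when its action matches the turn parity.
        if action == self.turn % 2:
            self.rewards[self.current_agent] += 1.0
        self.turn += 1


def run_episode(env, policies):
    """Cycle through agents, letting each one act on its own turn."""
    env.reset()
    while not env.done:
        agent = env.current_agent
        observation = env.observe(agent)
        env.step(policies[agent](observation))
    return dict(env.rewards)
```

A policy here is just a function from an observation to an action, matching the definition given in the abstract; `run_episode` takes a dictionary mapping each agent name to such a function.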