Generative AI
Better Language Models and Their Implications
We've trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization--all without task-specific training. Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper. GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data. GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. On language tasks like question answering, reading comprehension, summarization, and translation, GPT-2 begins to learn these tasks from the raw text, using no task-specific training data.
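The "predict the next word, given all of the previous words" objective can be illustrated with a minimal sketch. This is not GPT-2's implementation: the function names, the toy vocabulary, and the uniform stand-in model are all hypothetical, chosen only to show how the average negative log-likelihood over next words is computed.

```python
import math

def next_word_loss(tokens, predict):
    """Average negative log-likelihood of each word given all previous words.

    `predict` is a hypothetical callable: it takes the context (a list of
    words) and returns a dict mapping each candidate word to a probability.
    """
    total = 0.0
    for t in range(1, len(tokens)):
        context = tokens[:t]                 # all previous words
        probs = predict(context)
        # Small floor avoids log(0) if the model assigns no mass to the target.
        total += -math.log(probs.get(tokens[t], 1e-12))
    return total / (len(tokens) - 1)

def uniform_model(vocab):
    """Stand-in model (an assumption): ignores context, spreads mass evenly."""
    p = 1.0 / len(vocab)
    def predict(context):
        return {word: p for word in vocab}
    return predict

vocab = ["the", "cat", "sat", "down"]
text = ["the", "cat", "sat", "down"]
loss = next_word_loss(text, uniform_model(vocab))
print(loss)
```

With a uniform model over four words, the loss equals log(4); a trained model lowers this number by assigning more probability to the word that actually comes next.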
OpenAI Tried to Train AI Agents to Play Hide-And-Seek but Instead They Were Shocked by What They Learned
Competition is one of the socio-economic dynamics that has influenced our evolution as a species. The vast amount of complexity and diversity on Earth evolved due to co-evolution and competition between organisms, directed by natural selection. By competing against a different party, we are constantly forced to improve our knowledge and skills on a specific subject. Recent developments in artificial intelligence (AI) have started to leverage some of the principles of competition to influence learning behaviors in AI agents. Specifically, the field of multi-agent reinforcement learning (MARL) has been heavily influenced by competitive and game-theoretic dynamics.
AI Learns to Cheat at Hide and Seek
OpenAI recently posted on Twitter about teaching computer agents 'hide and seek'. We've observed AIs discovering complex tool use while competing in a simple game of hide-and-seek. They develop a series of six distinct strategies and counter strategies, ultimately using tools in the environment to break our simulated physics. In the simulations, seekers are incentivized to maintain line of sight of hiders and hiders are incentivized to avoid line of sight from seekers. The agents' environments contain various shelters including cubicles, movable partitions, blocks and ramps. That said, there is no built-in incentive for agents to interact with objects around them.
Emergent Tool Use from Multi-Agent Interaction
In our environment, agents play a team-based hide-and-seek game. Hiders (blue) are tasked with avoiding line-of-sight from the seekers (red), and seekers are tasked with keeping vision of the hiders. There are objects scattered throughout the environment that hiders and seekers can grab and lock in place, as well as randomly generated immovable rooms and walls that agents must learn to navigate. Before the game begins, hiders are given a preparation phase where seekers are immobilized to give hiders a chance to run away or change their environment. There are no explicit incentives for agents to interact with objects in the environment; the only supervision given is through the hide-and-seek objective.
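The incentive structure described above can be sketched as a small team reward function. The excerpt does not state reward values, so the +1/-1 magnitudes, the zero reward during the preparation phase, and all function and parameter names below are assumptions made for illustration. Note that nothing in the reward refers to objects: only line of sight matters.

```python
def team_rewards(any_hider_seen, in_preparation_phase):
    """Return (hider_reward, seeker_reward) for one timestep.

    Assumed values: no reward while seekers are immobilized; afterwards
    hiders are rewarded when no hider is in a seeker's line of sight,
    and seekers receive the opposite reward.
    """
    if in_preparation_phase:
        return 0.0, 0.0
    if any_hider_seen:
        return -1.0, 1.0
    return 1.0, -1.0

def episode_returns(seen_per_step, prep_steps):
    """Sum per-step team rewards over one episode.

    `seen_per_step` is a list of booleans (was any hider visible at that step);
    `prep_steps` is the hypothetical length of the preparation phase.
    """
    hider_total = 0.0
    seeker_total = 0.0
    for step, seen in enumerate(seen_per_step):
        hider_r, seeker_r = team_rewards(seen, step < prep_steps)
        hider_total += hider_r
        seeker_total += seeker_r
    return hider_total, seeker_total

# Two preparation steps, then the hiders are seen, hidden, and seen again.
print(episode_returns([False, False, True, False, True], prep_steps=2))
```

Because the two teams' rewards are exact opposites in this sketch, any improvement by one side creates pressure on the other to adapt, which is the kind of dynamic the text credits for the emergent strategies.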
AI Hide and Seek: Agents Punched Holes in their Creators' Universe
Bots removed opponents' tools from the game space, and launched themselves into the air… Two teams of AI agents tasked with playing a game (or a million) of hide and seek in a virtual environment developed complex strategies and counterstrategies, and exploited holes in their environment that even its creators didn't know it had. The game was part of an experiment by OpenAI designed to test the AI skills that emerge from multi-agent competition and standard reinforcement learning algorithms at scale. OpenAI described the outcome in a striking paper published this week. The organisation, now heavily backed by Microsoft, described the outcome as further proof that "skills, far more complex than the seed game dynamics and environment, can emerge" (from such experiments/training exercises). Some of its findings are neatly captured in the video below.
TWIMLcon: AI Platforms - Machine and deep learning in the enterprise
TWIMLcon: AI Platforms is brought to you by the team behind the TWIML AI Podcast. The conference has its roots in a series of interviews on the topic of AI Platforms published back in the fall of 2018. The series--which featured interviews with ML Platforms and Infrastructure engineers and leaders from Facebook, Airbnb, LinkedIn, OpenAI, Shell and Comcast--resonated very strongly with listeners and remains one of our most popular series to this day. We're excited to convene TWIMLcon: AI Platforms and provide the broader community of folks that care about productionalizing, operationalizing and scaling ML & AI an opportunity to share, learn, and connect with one another.
Why Playing Hide-and-Seek Could Lead AI to Humanlike Intelligence
Humans are a species that can adapt to environmental challenges, and over eons this has enabled us to biologically evolve -- an essential characteristic found in animals but absent in AI. Although machine learning has made remarkable progress in complex games such as Go and Dota 2, the skills mastered in these arenas do not necessarily generalize to practical applications in real-world scenarios. The goal for a growing number of researchers is to build a machine intelligence that behaves, learns and evolves more like humans. A new paper from San Francisco-based OpenAI proposes that training models in the children's game of hide-and-seek and pitting them against each other in tens of millions of contests results in the models automatically developing humanlike behaviors that increase their intelligence and improve subsequent performance. Hide-and-seek was selected as a fun starting point mostly due to its simple rules, says the paper's first author, OpenAI Researcher Bowen Baker.
AI Learns to Defy Laws of Physics to Win at Hide-and-Seek
Scientists at the OpenAI artificial intelligence (AI) laboratory have developed AI bots that trained themselves to cooperate by playing hide-and-seek. The team had the bots play the game in a simulated environment containing fixed walls and movable boxes; each bot had its own perspective of its surroundings, and could not directly communicate with other bots. The bots that hid quickly deduced the fastest way to fool seekers was to find objects in the environment with which to conceal themselves; the seekers learned they could manipulate objects like ramps to overcome obstacles like walls. The bots learned that cooperation--like passing objects to each other or co-building a hideout--was the quickest way to win.
Microsoft dumps $1 billion into 'artificial general intelligence' project
Microsoft announced a $1 billion investment in OpenAI, a lab co-founded by Elon Musk to develop "artificial general intelligence." The investment is the start of a long-term partnership between the two organizations. OpenAI will ensure its services work on Microsoft's Azure cloud platform, and the companies will collaborate on new supercomputers. OpenAI's stated mission is to develop "artificial general intelligence," or AGI. In layman's terms, AGI is AI that can think like a human (possibly even better) while carrying out complex tasks autonomously. Whether or not an AGI would immediately decide to incinerate humanity a la Skynet remains to be seen, but OpenAI at least claims its artificial intelligence would be safe and beneficial for the human race.
AIs use hide-and-seek to learn to tackle real-world problems
Pitting two artificial intelligences against each other in games such as DeepMind's Go has led to some of the biggest breakthroughs in AI in recent years, as the machines learn skills through trial and error that eventually lead to them beating humans. But can the same technique produce a more useful AI capable of operating in the real world? OpenAI, a San Francisco-based AI research group, published research on Tuesday showing what it claimed was a method for training increasingly powerful smart systems that could prepare them for tackling more ordinary human problems. Set in increasingly realistic environments, the technique points to a way for the AI to "evolve" in a simulated world until it is ready to be used, it said. The researchers used several intelligent "agents" in a game of hide-and-seek played in a simulated physical environment.