Generative AI
Sepsis World Model: A MIMIC-based OpenAI Gym "World Model" Simulator for Sepsis Treatment
Kiani, Amirhossein, Wang, Chris, Xu, Angela
Sepsis is a life-threatening condition caused by the body's response to an infection. In order to treat patients with sepsis, physicians must control the dosages of various antibiotics, fluids, and vasopressors based on a large number of variables in an emergency setting. In this project we employ a "world model" methodology to create a simulator that aims to predict the next state of a patient given a current state and treatment action. In doing so, we hope our simulator learns from a latent and less noisy representation of the EHR data. Using historical sepsis patient records from the MIMIC dataset, our method creates an OpenAI Gym simulator that leverages a Variational Auto-Encoder and a Mixture Density Network combined with an RNN (MDN-RNN) to model the trajectory of any sepsis patient in the hospital. To reduce the effects of noise, we sample from a generated distribution of next steps during simulation and have the option of introducing uncertainty into our simulator by controlling the "temperature" variable. It is worth noting that we do not have access to the ground truth for the best policy because we can only evaluate learned policies by real-world experimentation or expert feedback. Instead, we aim to study our simulator model's performance by evaluating the similarity between our environment's rollouts and the real EHR data and assessing its viability for learning a realistic policy for sepsis treatment using Deep Q-Learning.
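As an illustration of how a "temperature" variable can control the uncertainty of samples drawn from a mixture density output, here is a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional Gaussian mixture and the convention common in world-model code, where log-weights are divided by the temperature and standard deviations are scaled by its square root. The function name `sample_mdn` and all parameter names are hypothetical.

```python
import math
import random


def sample_mdn(log_weights, means, stds, temperature=1.0, rng=random):
    """Draw one sample from a 1-D Gaussian mixture with temperature scaling.

    Assumed convention (not taken from the abstract): temperature > 0,
    log-weights are divided by it, and each std is multiplied by sqrt(temperature).
    """
    # Dividing log-weights by the temperature sharpens (<1) or flattens (>1) the mixture.
    scaled = [lw / temperature for lw in log_weights]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Pick one mixture component according to the adjusted weights.
    k = rng.choices(range(len(probs)), weights=probs, k=1)[0]

    # Sample from the chosen Gaussian, widened or narrowed by the temperature.
    return rng.gauss(means[k], stds[k] * math.sqrt(temperature))
```

With a temperature near zero the sampler almost always returns a value close to the mean of the most likely component, while larger temperatures spread samples across components and widen each one.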
Learning Deep Generative Models with Short Run Inference Dynamics
Nijkamp, Erik, Pang, Bo, Han, Tian, Zhou, Linqi, Zhu, Song-Chun, Wu, Ying Nian
This paper studies the fundamental problem of learning deep generative models that consist of one or more layers of latent variables organized in top-down architectures. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. The inference typically requires Markov chain Monte Carlo (MCMC), which can be time consuming. In this paper, we propose to use short run inference dynamics guided by the log-posterior, such as a finite-step gradient descent algorithm initialized from the prior distribution of the latent variables, as an approximate sampler of the posterior distribution, where the step size of the gradient descent dynamics is optimized by minimizing the Kullback-Leibler divergence between the distribution produced by the short run inference dynamics and the posterior distribution. Our experiments show that the proposed method outperforms the variational auto-encoder (VAE) in terms of reconstruction error and synthesis quality. The advantage of the proposed method is that it is natural and automatic, even for models with multiple layers of latent variables.
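The idea of short run inference can be sketched on a toy model. The example below is an illustration under stated assumptions, not the paper's method: it uses a scalar linear generator x = w·z + noise with a standard normal prior on z, and a hand-picked fixed step size, whereas the abstract describes optimizing the step size through a KL divergence. The function name `short_run_inference` and its parameters are hypothetical.

```python
import random


def short_run_inference(x, w=2.0, sigma=0.5, steps=20, step_size=0.05, rng=random):
    """Run a fixed number of gradient steps on the log-posterior of z given x.

    Toy model (an assumption for illustration): z ~ N(0, 1), x = w * z + N(0, sigma^2).
    """
    z = rng.gauss(0.0, 1.0)  # initialize from the prior distribution
    for _ in range(steps):
        # Gradient of log p(x | z) + log p(z) with respect to z.
        grad = w * (x - w * z) / sigma**2 - z
        z += step_size * grad  # ascend the log-posterior
    return z
```

For this linear-Gaussian toy model the posterior mean has the closed form w·x / (w² + σ²), so the finite-step result can be checked directly against it. With a neural network generator no such closed form exists, which is where a short, fixed-length inference run becomes useful.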
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI, Berner, Christopher, Brockman, Greg, Chan, Brooke, Cheung, Vicki, Dębiak, Przemysław, Dennison, Christy, Farhi, David, Fischer, Quirin, Hashme, Shariq, Hesse, Chris, Józefowicz, Rafal, Gray, Scott, Olsson, Catherine, Pachocki, Jakub, Petrov, Michael, Pinto, Henrique Pondé de Oliveira, Raiman, Jonathan, Salimans, Tim, Schlatter, Jeremy, Schneider, Jonas, Sidor, Szymon, Sutskever, Ilya, Tang, Jie, Wolski, Filip, Zhang, Susan
On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of approximately 2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
The PlayStation Reinforcement Learning Environment (PSXLE)
Purves, Carlos, Cangea, Cătălina, Veličković, Petar
We propose a new benchmark environment for evaluating Reinforcement Learning (RL) algorithms: the PlayStation Learning Environment (PSXLE), a PlayStation emulator modified to expose a simple control API that enables rich game-state representations. We argue that the PlayStation serves as a suitable progression for agent evaluation and propose a framework for such an evaluation. We build an action-driven abstraction for a PlayStation game with support for the OpenAI Gym interface and demonstrate its use by running OpenAI Baselines.
Memento Learning: How OpenAI Created AI Agents that can Learn by Going Backwards
Memento broke many of the traditional paradigms in the film industry by structuring two parallel narratives, one chronologically going backwards and one going forward. The novel narrative form used in Memento forces the audience to constantly reevaluate their knowledge of the plot, and they keep learning small details every few minutes of the film. It turns out that replaying a knowledge sequence backwards for small time intervals is an incredibly captivating method of learning. Intuitively, the Memento form of learning seems perfect for AI agents. Last year, researchers from OpenAI leveraged that learning methodology to create AI agents that learned to play Montezuma's Revenge using a single demonstration.
Multimodal Generative Models for Compositional Representation Learning
As deep neural networks become more adept at traditional tasks, many of the most exciting new challenges concern multimodality--observations that combine diverse types, such as image and text. In this paper, we introduce a family of multimodal deep generative models derived from variational bounds on the evidence (data marginal likelihood). As part of our derivation we find that many previous multimodal variational autoencoders used objectives that do not correctly bound the joint marginal likelihood across modalities. We further generalize our objective to work with several types of deep generative model (VAE, GAN, and flow-based), and allow use of different model types for different modalities. We benchmark our models across many image, label, and text datasets, and find that our multimodal VAEs excel with and without weak supervision. Additional improvements come from use of GAN image models with VAE language models. Finally, we investigate the effect of language on learned image representations through a variety of downstream tasks, such as compositionality, bounding box prediction, and visual relation prediction. We find evidence that these image representations are more abstract and compositional than equivalent representations learned from only visual data.
Deep Double Descent
We show that the double descent phenomenon occurs in CNNs, ResNets, and transformers: performance first improves, then gets worse, and then improves again with increasing model size, data size, or training time. This effect is often avoided through careful regularization. While this behavior appears to be fairly universal, we don't yet fully understand why it happens, and view further study of this phenomenon as an important research direction. The peak occurs predictably at a "critical regime," where the models are barely able to fit the training set. As we increase the number of parameters in a neural network, the test error initially decreases, increases, and, just as the model is able to fit the train set, undergoes a second descent.
Axios Future
A recently released AI program that generates hyper-realistic writing has become a powerful tool for storytelling, hinting at a new genre of computer-aided creativity. What's happening: Inventive programmers are using it to generate poetry, interactive text adventures, and even irreverent new prompts for the popular game Cards Against Humanity. The big picture: AI-written text is reaching new levels of realism -- so much so that when scientists at OpenAI released a groundbreaking text generator earlier this year, they warned of potential dangers from mass-produced fake news. The risks are still present, but recent projects demonstrate the creative upsides. How it works: The OpenAI language model is a bit like autocomplete: Based on an enormous amount of human writing, it predicts the best words to generate next.
Why People Are So Overwhelmed by AWS Latest Musical Keyboard Powered By Generative AI
As much as a programmer likes machine learning, there must come a time when they are overwhelmed by the study process. All the coding, maths and infrastructure of it might make one reach out for that extra cup of coffee. Now, e-commerce giant Amazon has made the world of generative artificial intelligence a little easier to understand by introducing its machine learning-powered MIDI-compatible keyboard, DeepComposer. AWS DeepComposer is a 32-key, 2-octave keyboard. This ML keyboard lets developers experience generative AI in a better way.
You can do nearly anything you want in this incredible AI-powered game
Yes, you read that right. Where all games are limited by what developers program, this game uses OpenAI to create infinitely generated worlds that are only limited by your imagination. If that sounds lofty, well, give it a try. I promise that you'll be amused. AI Dungeon 2 gives you a few different settings and roles that you can adopt, but from there, it pretty much lets you run wild.