Generative AI
openai/gpt-3
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions – something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting.
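To make the few-shot setting concrete: the model receives a task description and a handful of solved examples as plain text, then continues the text for a new query, with no fine-tuning involved. The sketch below only assembles such a prompt. The `Input:`/`Output:` layout, the function name `build_few_shot_prompt`, and the translation examples are illustrative assumptions, not the format used in the paper.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a prompt: task description, K solved examples, one open query.

    The "Input:/Output:" layout is a hypothetical convention for illustration.
    """
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final query is left open so the language model completes it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# K = 2 demonstrations, followed by the query the model should answer.
examples = [("cheese", "fromage"), ("sea otter", "loutre de mer")]
prompt = build_few_shot_prompt("Translate English to French.", examples, "peppermint")
print(prompt)
```

Changing the number of entries in `examples` moves between the zero-shot, one-shot, and few-shot regimes without touching the model's weights.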
OpenAI's GPT-3 may be the biggest thing since bitcoin
OpenAI, a non-profit artificial intelligence research company backed by Peter Thiel, Elon Musk, Reid Hoffman, Marc Benioff, Sam Altman and others, released its third generation of language prediction model (GPT-3) into the open-source wild. Language models allow computers to produce random-ish sentences of approximately the same length and grammatical structure as those in a given body of text. In my early experiments with GPT-3 I found that GPT-3's predicted sentences, when published on the bitcointalk.org I imagine that similar results can be obtained by republishing GPT-3's outputs to other message boards, blogs, and social media. I predict that, unlike its two predecessors (PTB and OpenAI GPT-2), OpenAI GPT-3 will eventually be widely used to pretend the author of a text is a person of interest, with unpredictable and amusing effects on various communities.
A Systematic Survey on Deep Generative Models for Graph Generation
Graphs are important data representations for describing objects and their relationships, and they appear in a wide variety of real-world scenarios. As one of the critical problems in this area, graph generation considers learning the distributions of given graphs and generating novel graphs. Owing to their wide range of applications, generative models for graphs have a rich history; traditionally, however, they have been hand-crafted and capable of modeling only a few statistical properties of graphs. Recent advances in deep generative models for graph generation are an important step towards improving the fidelity of generated graphs and pave the way for new kinds of applications. This article provides an extensive overview of the literature in the field of deep generative models for graph generation. First, the formal definition of deep generative models for graph generation is provided, along with preliminary knowledge. Second, two taxonomies of deep generative models, for unconditional and conditional graph generation respectively, are proposed; the existing works in each are compared and analyzed. After that, an overview of the evaluation metrics in this specific domain is provided. Finally, the applications that deep graph generation enables are summarized and five promising future research directions are highlighted.
OpenAI's fiction-spewing AI is learning to generate images
At its core, GPT-2 is a powerful prediction engine. It learned to grasp the structure of the English language by looking at billions of examples of words, sentences, and paragraphs, scraped from the corners of the internet. With that structure, it could then manipulate words into new sentences by statistically predicting the order in which they should appear. So researchers at OpenAI decided to swap the words for pixels and train the same algorithm on images in ImageNet, the most popular image bank for deep learning. Because the algorithm was designed to work with one-dimensional data (i.e., strings of text), they unfurled the images into a single sequence of pixels.
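The unfurling step can be sketched in a few lines. The example below flattens a tiny grayscale image row by row and builds the (context, next pixel) pairs that a next-token predictor would train on. The row-by-row order, the grayscale values, and the helper names `unfurl` and `next_pixel_pairs` are assumptions made for illustration; the excerpt does not specify them.

```python
def unfurl(image):
    """Flatten a 2-D grid of pixel values into a 1-D sequence, row by row."""
    return [pixel for row in image for pixel in row]


def next_pixel_pairs(sequence):
    """Build (context, target) pairs: each pixel is predicted from all pixels before it."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# A hypothetical 2x3 grayscale image (0 = black, 255 = white).
image = [
    [0, 255, 0],
    [255, 0, 255],
]

sequence = unfurl(image)
pairs = next_pixel_pairs(sequence)

print(sequence)
for context, target in pairs:
    print(context, "->", target)
```

Once the image is a flat sequence, it can be fed to the same kind of sequence model used for text, with pixel values playing the role of words.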
Detecting Out-of-distribution Samples via Variational Auto-encoder with Reliable Uncertainty Estimation
Ran, Xuming, Xu, Mingkun, Mei, Lingrui, Xu, Qi, Liu, Quanying
In unsupervised learning, variational auto-encoders (VAEs) are an influential class of deep generative models that combine the rich representational power of neural networks with Bayesian methods. However, VAEs suffer from assigning higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs. Recent studies suggest that deep generative models with reliable uncertainty estimation are critical to a deep understanding of OOD inputs. Meanwhile, the noise contrastive prior (NCP) is an emerging and promising method for obtaining uncertainty estimates, with the advantages of being easy to scale, trainable, and compatible with a wide range of models. Inspired by these ideas, we propose an improved noise contrastive prior (INCP) to acquire reliable uncertainty estimates for standard VAEs. By combining INCP with the encoder of a VAE, the patterns of OOD and ID inputs can be well captured and distinguished. Our method outperforms standard VAEs on the FashionMNIST and CIFAR10 datasets. We also demonstrate the robustness of our model through extensive experiments on anomaly detection tasks.
VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry
Schockaert, Cedric, Macher, Vadim, Schmitz, Alexander
Data-driven models generated with machine learning lack transparency, which leads the process engineer to lose confidence in relying on the model predictions to optimize an industrial process. Bringing industrial processes to a certain level of autonomy using data-driven models is particularly challenging because the first user of those models is the process expert, who often has decades of experience. It is necessary to expose to the process engineer not only the model predictions but also their interpretability. To that end, several approaches have been proposed in the literature. The Local Interpretable Model-agnostic Explanations (LIME) method has recently gained a lot of interest from the research community. The principle of this method is to train a linear model that locally approximates the black-box model by randomly generating artificial data points in the neighborhood of the input. Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method. We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace. Such ironmaking process data is characterized by multivariate time series with high inter-correlation representing the underlying process in a blast furnace. Our contribution is to use a Variational Autoencoder (VAE) to learn the complex blast furnace process characteristics from the data. The VAE aims to generate optimal artificial samples for training a local interpretable model that better represents the black-box model in the neighborhood of the input sample processed by the black-box model to make a prediction. In comparison with LIME, VAE-LIME shows significantly improved local fidelity of the local interpretable linear model to the black-box model, resulting in robust model interpretability.
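The LIME principle described above can be illustrated with a one-dimensional toy: sample artificial points around an input, query the black-box model, weight each point by its proximity to the input, and fit a weighted linear model. This is a minimal sketch under stated assumptions, not the authors' implementation. The function `black_box`, the Gaussian sampling, the kernel shape, and every parameter value are hypothetical. In VAE-LIME, as the abstract describes it, the random sampling step would be replaced by samples generated with a VAE.

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for an opaque model; any callable would do."""
    return x ** 2


def local_linear_surrogate(model, x0, n_samples=500, spread=0.5,
                           kernel_width=0.25, seed=0):
    """Fit y ~ intercept + slope * x around x0 by weighted least squares."""
    rng = random.Random(seed)
    # 1. Randomly generate artificial points in the neighborhood of x0.
    xs = [x0 + rng.gauss(0.0, spread) for _ in range(n_samples)]
    # 2. Query the black-box model on those points.
    ys = [model(x) for x in xs]
    # 3. Weight each point by proximity to x0 (assumed Gaussian-style kernel).
    ws = [math.exp(-((x - x0) ** 2) / (kernel_width ** 2)) for x in xs]
    # 4. Closed-form weighted least squares for a single feature.
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope, intercept = local_linear_surrogate(black_box, x0=3.0)
print(f"local slope near x0=3: {slope:.2f}")
```

The fitted slope serves as the local explanation: it estimates how strongly the prediction responds to the feature near the chosen input. For this toy model it should land close to the derivative of `x ** 2` at 3.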
Image GPT
We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and image classification accuracy, we show that our best generative model also contains features competitive with top convolutional nets in the unsupervised setting. Unsupervised and self-supervised learning, or learning without human-labeled data, is a longstanding challenge of machine learning. Recently, it has seen incredible success in language, as transformer models like BERT, GPT-2, RoBERTa, T5, and other variants have achieved top performance on a wide array of language tasks. However, the same broad class of models has not been successful in producing strong features for image classification.
Generative AI: A Key to Machine Intelligence?
We're living in the age of the next industrial revolution: the first three freed most humans from hard labor. This one is aiming to take over the last domain of human dominance on this planet: our intelligence. In this article, we will put aside the ethical, political and social effects of such a revolution and concentrate a bit more on the technical side of it. What we see in the media today looks a bit different from the real dominance of machines over humans… or not? The most rapidly growing areas of artificial intelligence in the last few years have been computer vision, natural language processing, speech processing and, of course, different customer analytics applications like recommender systems (you may not like it, but targeted advertisements are accurate enough to grow companies' revenues).
Getting Artificial Neural Networks Closer to Animal Brains
Lately, I've been thinking and reading a lot about consciousness and how the human mind works. A question that emerges all the time is whether machines can emulate human thought. An even more interesting one is whether consciousness (a subjective experience) can arise from a machine, but I'll leave that discussion for a future post (I'll need 20 more years to think about that before I can write about it). So, how far are we from _behaviorally_ imitating a human? Truth is, we achieved a lot in the past 5 years (see AlphaGo, OpenGPT-2, OpenAI Jukebox, Tesla Autopilot, Alphastar, OpenAI Dota2 Team, OpenAI API), but we're still not quite there.