 Generative AI


Variational Mixture of Normalizing Flows

arXiv.org Machine Learning

In the past few years, deep generative models, such as generative adversarial networks \autocite{GAN}, variational autoencoders \autocite{vaepaper}, and their variants, have seen wide adoption for the task of modelling complex data distributions. In spite of the outstanding sample quality achieved by those early methods, they model the target distributions implicitly, in the sense that the probability density functions induced by them are not explicitly accessible. This fact renders those methods unfit for tasks that require, for example, scoring new instances of data with the learned distributions. Normalizing flows have overcome this limitation by leveraging the change-of-variables formula for probability density functions, and by using transformations designed to have tractable and cheaply computable Jacobians. Although flexible, this framework lacked (until recently \autocites{semisuplearning_nflows, RAD}) a way to introduce discrete structure (such as that found in mixtures) into the models it can construct, in an unsupervised scenario. The present work overcomes this by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model. This procedure is based on variational inference, and uses a variational posterior parameterized by a neural network. As will become clear, this model naturally lends itself to (multimodal) density estimation, semi-supervised learning, and clustering. The proposed model is illustrated on two synthetic datasets, as well as on a real-world dataset. Keywords: Deep generative models, normalizing flows, variational inference, probabilistic modelling, mixture models.
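As a rough illustration of the two ingredients the abstract names, the change-of-variables formula and a mixture whose components are flows, here is a minimal one-dimensional sketch. It is not the paper's implementation: each "flow" is assumed to be a single affine map of a standard normal, the function names are hypothetical, and the component posterior is computed exactly by Bayes' rule, whereas the paper learns a variational posterior with a neural network.

```python
import math


def affine_flow_log_density(x, shift, scale):
    """log p(x) for x = shift + scale * z with z ~ N(0, 1), via change of variables."""
    z = (x - shift) / scale                                # inverse transformation
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))    # log N(z; 0, 1)
    log_abs_det = -math.log(abs(scale))                    # log |dz/dx|
    return log_base + log_abs_det


def mixture_log_density(x, weights, components):
    """log sum_k w_k p_k(x), computed with the log-sum-exp trick for stability."""
    terms = [math.log(w) + affine_flow_log_density(x, shift, scale)
             for w, (shift, scale) in zip(weights, components)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def responsibilities(x, weights, components):
    """Exact posterior p(k | x) by Bayes' rule (assumption: stands in for the learned posterior)."""
    log_px = mixture_log_density(x, weights, components)
    return [math.exp(math.log(w) + affine_flow_log_density(x, shift, scale) - log_px)
            for w, (shift, scale) in zip(weights, components)]
```

Calling `responsibilities(x, weights, components)` with, say, `weights = [0.3, 0.7]` and `components = [(-2.0, 0.5), (3.0, 1.5)]` gives a soft cluster assignment for `x`, which is the sense in which such a model supports both density estimation and clustering.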


What does GPT-3 mean for the future of the legal profession? – TechCrunch

#artificialintelligence

One doesn't have to dig too deep into legal organizations to find people who are skeptical about artificial intelligence. AI is getting tremendous attention and significant venture capital, but AI tools frequently underwhelm in the trenches. Here are a few reasons why that is and why I believe GPT-3, a beta version of which was recently released by the OpenAI Foundation, might be a game changer in legal and other knowledge-focused organizations. GPT-3 is getting a lot of oxygen lately because of its size, scope and capabilities. However, it should be recognized that a significant amount of that attention is due to its association with Elon Musk.


GPT-3 in tweets

AIHub

Since OpenAI released GPT-3, you have probably come across examples of impressive and/or problematic content that people have used the model to generate. Here we summarise the outputs of GPT-3 as seen through the eyes of the Twitter-sphere. We're releasing an API for accessing new AI models developed by OpenAI. See how companies are using the API today, or join our waitlist: https://t.co/SvTgaFuTzN First, let me summarize the API documentation.


[N] GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about

#artificialintelligence

The only way you can really "debug" humans is conversationally, though. If some employee screwed up the assembly line and messed up production, you might call him into the office and ask him why he made that mistake, and you could get the same "Oh, I'm sorry, I got distracted and wasn't paying attention." There could be some bad synaptic weights that caused some of his neurons to misfire and cause him to make the mistake. This is just explained by "getting distracted". I don't know if you've played around with GPT-3, but if you push it on something it's gotten wrong, it usually gets very defensive and will bullshit its way out of it, just as well as a human would.


An AI Breaks the Writing Barrier

#artificialintelligence

Word has been making its way out from the technology community: The world changed this summer with the rollout of an artificial intelligence system known as GPT-3. Its ability to interact in English and generate coherent writing has been startling hardened experts, who speak of "GPT-3 shock." Where typical AI systems are trained for specific tasks--classifying images, playing Go--GPT-3 can handle tasks it was never specifically trained for. Research released by its maker, San Francisco-based OpenAI, has found that GPT-3 can...


GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about

#artificialintelligence

Since OpenAI first described its new AI language-generating system called GPT-3 in May, hundreds of media outlets (including MIT Technology Review) have written about the system and its capabilities. Twitter has been abuzz about its power and potential. The New York Times published an op-ed about it. Later this year, OpenAI will begin charging companies for access to GPT-3, hoping that its system can soon power a wide variety of AI products and services. Is GPT-3 an important step toward artificial general intelligence--the kind that would allow a machine to reason broadly in a manner similar to humans without having to train for every specific task it encounters?


Graph Representation Learning Book

#artificialintelligence

The field of graph representation learning has grown at an incredible (and sometimes unwieldy) pace over the past seven years, transforming from a small subset of researchers working on a relatively niche topic to one of the fastest growing sub-areas of deep learning. This book is my attempt to provide a brief but comprehensive introduction to graph representation learning, including methods for embedding graph data, graph neural networks, and deep generative models of graphs. This book is a pre-publication draft of a book that will be published by Morgan & Claypool publishers in late 2020, and the publishers have generously agreed to allow the public hosting of the pre-publication draft. Feedback, typo corrections, and comments are welcome and should be sent to wlh@cs.mcgill.ca


Andrej Karpathy releases concise GPT implementation. Why has he bothered to do this: doesn't he work for OpenAI, at least indirectly? [D] [N]

#artificialintelligence

It's nice to see a concise implementation of GPT in PyTorch. It is true that Hugging Face's Transformers is excellent, but it is quite difficult to trace. They are constantly building it out with loads of features, so you get lost. His wiki states he works for OpenAI, and Tesla is at least affiliated with OpenAI. Also, it's very far from the computer vision domain, so why spend the time on an open source implementation and make some guesses on GPT-2/GPT-3? His implementation is easy to follow, which is nice; most reimplementations I see have bugs or are unnecessarily complex.


The untold story of GPT-3 is the transformation of OpenAI

#artificialintelligence

A bot that writes letters on behalf of nature. Those are just some of the recent stories written about GPT-3, the latest contraption of artificial intelligence research lab OpenAI. GPT-3 is the largest language model ever made, and it has triggered many discussions over how AI will soon transform many industries. But what has been less discussed is how GPT-3 has transformed OpenAI itself. In the process of creating the most successful natural language processing system ever created, OpenAI has gradually morphed from a nonprofit AI lab to a company that sells AI services. And hanging in the balance is the very mission for which OpenAI was founded.


Evaluating Lossy Compression Rates of Deep Generative Models

arXiv.org Machine Learning

The field of deep generative modeling has succeeded in producing astonishingly realistic-seeming images and audio, but quantitative evaluation remains a challenge. Log-likelihood is an appealing metric due to its grounding in statistics and information theory, but it can be challenging to estimate for implicit generative models, and scalar-valued metrics give an incomplete picture of a model's quality. In this work, we propose to use rate distortion (RD) curves to evaluate and compare deep generative models. While estimating RD curves is seemingly even more computationally demanding than log-likelihood estimation, we show that we can approximate the entire RD curve using nearly the same computations as were previously used to achieve a single log-likelihood estimate. We evaluate lossy compression rates of VAEs, GANs, and adversarial autoencoders (AAEs) on the MNIST and CIFAR10 datasets. Measuring the entire RD curve gives a more complete picture than scalar-valued metrics, and we arrive at a number of insights not obtainable from log-likelihoods alone.