AITopics | Generative AI

Discovering Types for Entity Disambiguation

#artificialintelligence | Feb-8-2018, 05:15:37 GMT

Using the top solution from our type system optimization, we can now label data from Wikipedia using labels generated by the type system. Using this data (in our experiments, 400M tokens for each of English and French), we can now train a bidirectional LSTM to independently predict all the type memberships for each word. On the Wikipedia source text, we only have supervision on intra-wiki links; however, this is sufficient to train a deep neural network to predict type membership with an F1 of over 0.91. One of our type systems, discovered by beam search, includes types such as Aviation, Clothing, and Games (as well as surprisingly specific ones like 1754 in Canada, indicating that 1754 was an exciting year in the dataset of 1,000 Wikipedia articles it was trained on); you can also view the full type system. Predicting entities in a document usually relies on a "coherence" metric between different entities, e.g.

artificial intelligence, machine learning, type system, (6 more...)

#artificialintelligence

Country: North America > Canada (0.27)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.43)

Zero-Shot Learning via Class-Conditioned Deep Generative Models

AAAI Conferences | Feb-8-2018

We present a deep generative model for Zero-Shot Learning (ZSL). Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), which also facilitates learning highly discriminative feature representations for the inputs. The entire framework is learned end-to-end using only the seen-class training data. At test time, the label for an unseen-class test input is the class that maximizes the VAE lower bound. We further extend the model to (i) a semi-supervised/transductive setting, by leveraging unlabeled unseen-class data via an unsupervised learning module, and (ii) a few-shot learning setting, where we also have a small number of labeled inputs from the unseen classes. We compare our model with several state-of-the-art methods through a comprehensive set of experiments on a variety of benchmark data sets.
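The test-time rule in this abstract (pick the class that maximizes the VAE lower bound) can be illustrated with a small sketch. It rests on two assumptions that the summary does not confirm: that the encoder posterior and the class priors are diagonal Gaussians, and that the reconstruction term of the bound is the same for every candidate class, so maximizing the bound reduces to minimizing the KL term. The function names and all numbers below are hypothetical.

```python
import math

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(q || p) for two diagonal Gaussians given as lists of means and variances."""
    return 0.5 * sum(
        math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )

def predict_class(mu_q, var_q, class_priors):
    """Return the class whose prior is closest (in KL) to the encoder posterior.

    Assumption: the reconstruction term of the lower bound is identical for
    every candidate class, so maximizing the bound means minimizing the KL term.
    """
    return min(
        class_priors,
        key=lambda c: kl_diag_gaussians(mu_q, var_q, *class_priors[c]),
    )

# Hypothetical class priors (mean, variance), standing in for the output of
# a network that maps class attributes to latent-space distributions.
class_priors = {
    "zebra": ([2.0, 0.0], [1.0, 1.0]),
    "whale": ([-2.0, 1.0], [1.0, 1.0]),
}

# Hypothetical encoder output q(z | x) for one test input.
mu_q, var_q = [1.8, 0.1], [0.5, 0.5]
print(predict_class(mu_q, var_q, class_priors))
```

Because unseen classes only need a prior derived from their attributes, the same rule applies to classes that contributed no training inputs, which is the point of representing a class as a distribution rather than a single point.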

artificial intelligence, machine learning, natural language, (18 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Semi-Supervised Learning From Crowds Using Deep Generative Models

Atarashi, Kyohei (Hokkaido University) | Oyama, Satoshi (Hokkaido University) | Kurihara, Masahito (RIKEN AIP)

AAAI Conferences | Feb-8-2018

Although supervised learning requires a labeled dataset, obtaining labels from experts is generally expensive. For this reason, crowdsourcing services are attracting attention in the field of machine learning as a way to collect labels at relatively low cost. However, the labels obtained by crowdsourcing, i.e., from non-expert workers, are often noisy. A number of methods have thus been devised for inferring true labels, and several methods have been proposed for learning classifiers directly from crowdsourced labels, referred to as "learning from crowds." A more practical problem is learning from crowdsourced labeled data and unlabeled data, i.e., "semi-supervised learning from crowds." This paper presents a novel generative model of the labeling process in crowdsourcing. It leverages unlabeled data effectively by introducing latent features and a data distribution. Because the data distribution can be complicated, we use a deep neural network to model it. Our model can therefore be regarded as a kind of deep generative model. The problems caused by the intractability of latent variable posteriors are solved by introducing an inference model. The experiments show that it outperforms four existing models, including a baseline model, on the MNIST dataset with simulated workers and the Rotten Tomatoes movie review dataset with Amazon Mechanical Turk workers.
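To make the setting concrete, the sketch below simulates noisy workers and aggregates their labels by majority vote. This is a simple reference point for inferring true labels, not the generative model the paper proposes. The worker accuracies, class count, and function names are all hypothetical.

```python
import random
from collections import Counter

def simulate_worker(true_label, accuracy, num_classes, rng):
    """Return the true label with probability `accuracy`, else a random wrong label."""
    if rng.random() < accuracy:
        return true_label
    wrong = [c for c in range(num_classes) if c != true_label]
    return rng.choice(wrong)

def majority_vote(labels):
    """Aggregate one item's crowd labels by taking the most common one."""
    return Counter(labels).most_common(1)[0][0]

rng = random.Random(0)
num_classes = 3
worker_accuracies = [0.9, 0.7, 0.6, 0.55, 0.5]  # hypothetical worker skill levels
true_labels = [rng.randrange(num_classes) for _ in range(1000)]

# Every simulated worker labels every item.
crowd_labels = [
    [simulate_worker(y, acc, num_classes, rng) for acc in worker_accuracies]
    for y in true_labels
]
estimates = [majority_vote(votes) for votes in crowd_labels]
agreement = sum(e == y for e, y in zip(estimates, true_labels)) / len(true_labels)
print(f"majority vote matches the true label on {agreement:.1%} of items")
```

Majority vote treats all workers as equally reliable and ignores the input features entirely, which is the gap that models of the labeling process aim to close.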

artificial intelligence, classifier, machine learning, (19 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Metrics for Deep Generative Models

Chen, Nutan, Klushyn, Alexej, Kurle, Richard, Jiang, Xueyan, Bayer, Justin, van der Smagt, Patrick

arXiv.org Machine Learning | Feb-8-2018

Neural samplers such as variational autoencoders (VAEs) or generative adversarial networks (GANs) approximate distributions by transforming samples from a simple random source (the latent space) to samples from a more complex distribution represented by a dataset. While the manifold hypothesis implies that the density induced by a dataset contains large regions of low density, the training criteria of VAEs and GANs will make the latent space densely covered. Consequently, points that are separated by low-density regions in observation space will be pushed together in latent space, making stationary distances poor proxies for similarity. We transfer ideas from Riemannian geometry to this setting, letting the distance between two points be the length of the shortest path on a Riemannian manifold induced by the transformation. The method yields a principled distance measure and provides both a tool for visual inspection of deep generative models and an alternative to linear interpolation in latent space. In addition, it can be applied for robot movement generalization using previously learned skills. The method is evaluated on a synthetic dataset with known ground truth, on a simulated robot arm dataset, on human motion capture data, and on a generative model of handwritten digits.
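The idea of measuring a latent path by the length of its image in observation space can be sketched with a toy example. The generator below is a hypothetical stand-in for a trained decoder (it maps a 1-D latent code onto the unit circle), and the sketch only measures the length of a given path. Finding the shortest path, as the paper describes, would additionally require an optimization over paths.

```python
import math

def generator(z):
    """Toy decoder from a 1-D latent code to a point on the unit circle."""
    return (math.cos(z), math.sin(z))

def curve_length(latent_path, g):
    """Length of the image of a latent path, measured in observation space."""
    points = [g(z) for z in latent_path]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def linspace(start, stop, n):
    """Return n + 1 evenly spaced values from start to stop inclusive."""
    return [start + (stop - start) * i / n for i in range(n + 1)]

z_a, z_b = 0.0, math.pi
path = linspace(z_a, z_b, 1000)

# The straight line between the two decoded points leaves the data manifold.
straight_line = math.dist(generator(z_a), generator(z_b))
# The decoded latent path stays on the manifold, so it is longer.
along_manifold = curve_length(path, generator)
print(straight_line, along_manifold)
```

Here the chord between the two decoded points has length close to 2, while the path along the circle has length close to pi; the second number is the one that respects the shape of the data.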

artificial intelligence, latent space, machine learning, (18 more...)

arXiv.org Machine Learning

1711.01204

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

Requests for Research 2.0

#artificialintelligence | Feb-1-2018, 21:27:04 GMT

If you're not sure where to begin, here are some solved starter problems. Train an LSTM to solve the XOR problem: that is, given a sequence of bits, determine its parity. The LSTM should consume the sequence, one bit at a time, and then output the correct answer at the sequence's end.
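As a starting point for the parity task, the sketch below builds a dataset of bit sequences with parity labels and shows the running XOR that a recurrent model would need to track one bit at a time. It covers only the data and the target recurrence, not the LSTM or its training loop. The sequence length, dataset size, and function names are hypothetical choices.

```python
import random
from functools import reduce
from operator import xor

def parity(bits):
    """Parity of a bit sequence: 1 if it contains an odd number of ones."""
    return reduce(xor, bits, 0)

def running_parity(bits):
    """Parity after each prefix; the one-bit state a recurrent model must track."""
    state, states = 0, []
    for bit in bits:
        state ^= bit
        states.append(state)
    return states

def make_dataset(num_examples, length, seed=0):
    """Random fixed-length bit sequences paired with their parity labels."""
    rng = random.Random(seed)
    sequences = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(num_examples)
    ]
    return [(seq, parity(seq)) for seq in sequences]

dataset = make_dataset(num_examples=5, length=8)
for seq, label in dataset:
    print(seq, "->", label)
```

The last entry of `running_parity(seq)` equals `parity(seq)`, which matches the requirement that the model emit its answer only at the end of the sequence.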

artificial intelligence, machine learning, transformer, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

Latent Space Oddity: on the Curvature of Deep Generative Models

Arvanitidis, Georgios, Hansen, Lars Kai, Hauberg, Søren

arXiv.org Machine Learning | Jan-31-2018

Deep generative models provide a systematic way to learn nonlinear data distributions through a set of latent variables and a nonlinear "generator" function that maps latent points into the input space. The nonlinearity of the generator implies that the latent space gives a distorted view of the input space. Under mild conditions, we show that this distortion can be characterized by a stochastic Riemannian metric, and we demonstrate that distances and interpolants are significantly improved under this metric. This in turn improves probability distributions, sampling algorithms and clustering in the latent space. Our geometric analysis further reveals that current generators provide poor variance estimates and we propose a new generator architecture with vastly improved variance estimates. Results are demonstrated on convolutional and fully connected variational autoencoders, but the formalism easily generalizes to other deep generative models.
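The distortion described here can be illustrated with the pullback metric M(z) = J(z)^T J(z), where J is the Jacobian of the generator. This is the standard deterministic construction and is offered only as a sketch: the abstract refers to a stochastic metric, which is not reproduced here. The toy generator and the finite-difference Jacobian are assumptions made for illustration.

```python
def generator(z):
    """Toy generator from a 2-D latent point onto a paraboloid in 3-D."""
    z1, z2 = z
    return [z1, z2, z1 * z1 + z2 * z2]

def jacobian(g, z, eps=1e-5):
    """Central finite-difference Jacobian of g at z, as a list of rows."""
    columns = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        g_plus, g_minus = g(z_plus), g(z_minus)
        columns.append([(p - m) / (2 * eps) for p, m in zip(g_plus, g_minus)])
    # columns[i][k] holds d g_k / d z_i; transpose so rows are indexed by output k.
    return [list(row) for row in zip(*columns)]

def pullback_metric(g, z):
    """Metric induced in latent space by the generator: M(z) = J(z)^T J(z)."""
    J = jacobian(g, z)
    d = len(z)
    return [[sum(row[i] * row[j] for row in J) for j in range(d)] for i in range(d)]

# At the origin the surface is flat, so the metric is close to the identity.
print(pullback_metric(generator, [0.0, 0.0]))
# Away from the origin the surface is steep along z1, so steps in z1 cost more.
print(pullback_metric(generator, [1.0, 0.0]))
```

For this generator the analytic metric at (1, 0) is [[5, 0], [0, 1]], so a small latent step along z1 covers more than twice the distance in the output space that the same step along z2 does.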

artificial intelligence, latent space, machine learning, (18 more...)

arXiv.org Machine Learning

1710.11379

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)

OpenAI masters scale with Kubernetes on Microsoft Azure

#artificialintelligence | Jan-23-2018, 08:07:03 GMT

OpenAI's mission is to build safe artificial general intelligence (AGI) and ensure AGI's benefits are as widely and evenly distributed as possible. As a non-profit AI research company, they focus on long-term research, working on problems that require fundamental advances in AI capabilities. OpenAI runs Kubernetes for their deep learning research because Kubernetes can provide a fast iteration cycle, scalability, and a lack of boilerplate, which makes it ideal for most of OpenAI's experiments. They currently operate several Kubernetes clusters (some in the cloud and some on physical hardware), the largest of which they pushed to over 2,500 nodes. Their Kubernetes cluster runs in Azure on a combination of D15v2 and NC24 VMs.

large language model, machine learning, natural language, (7 more...)

#artificialintelligence

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Scaling Kubernetes to 2,500 Nodes

#artificialintelligence | Jan-20-2018, 08:48:20 GMT

We've been running Kubernetes for deep learning research for over two years. While our largest-scale workloads manage bare cloud VMs directly, Kubernetes provides a fast iteration cycle, reasonable scalability, and a lack of boilerplate, which makes it ideal for most of our experiments. We now operate several Kubernetes clusters (some in the cloud and some on physical hardware), the largest of which we've pushed to over 2,500 nodes. This cluster runs in Azure on a combination of D15v2 and NC24 VMs. On the path to this scale, many system components caused breakages, including etcd, the Kube masters, Docker image pulls, the network, KubeDNS, and even our machines' ARP caches.

machine learning, natural language, node, (20 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

The Data-Driven Weekly #1.6

@machinelearnbot | Jan-20-2018, 07:18:03 GMT

Right on cue, this past week brought the announcement of OpenAI, a new non-profit started by a number of tech luminaries to spearhead AI research that is publicly accessible. The motivation is that apparently these scions of capitalism lose faith in Adam Smith's invisible hand when it comes to AI R&D. Musk continues to promote the idea that AI will be humanity's largest existential threat. Challenging this view, the HBR asks if "OpenAI [is] Solving the Wrong Problem", pointing to the implied lack of trust in capitalism. This is similar to my own parry: that the biggest existential threat to humanity is humanity.

large language model, machine learning, natural language, (16 more...)

@machinelearnbot

Country: North America (0.16)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

uber-common/deep-neuroevolution

@machinelearnbot | Jan-17-2018, 21:15:16 GMT

Our code is based on code from OpenAI, whom we thank. The original code and related paper from OpenAI can be found here. The repo has been modified to run both ES and our algorithms, including our Deep Genetic Algorithm (DeepGA), locally and on AWS. Note: The Humanoid experiment depends on Mujoco. If you plan to use the mujoco env, make sure to follow mujoco-py's readme about how to install mujoco correctly. The extra folder holds the XML specification file for the Humanoid Locomotion with Deceptive Trap domain used in https://arxiv.org/abs/1712.06560.

large language model, machine learning, natural language, (5 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)
