Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
Guimaraes, Gabriel Lima; Sanchez-Lengeling, Benjamin; Outeiral, Carlos; Farias, Pedro Luis Cunha; Aspuru-Guzik, Alán
In sequence-based generative models, besides generating samples likely to have been drawn from a data distribution, it is often desirable to fine-tune the samples towards some domain-specific metrics. This work proposes a method to guide the structure and quality of samples using a combination of adversarial training and expert-based rewards with reinforcement learning. Building on SeqGAN, a sequence-based Generative Adversarial Network (GAN) framework that models the data generator as a stochastic policy in a reinforcement learning setting, we extend the training process to include domain-specific objectives in addition to the discriminator reward. The mixture of both types of rewards can be controlled via a tunable parameter. To improve training stability, we use the Wasserstein distance as the loss function for the discriminator. We demonstrate the effectiveness of this approach in two tasks: generation of molecules encoded as text sequences and generation of musical melodies. The experimental results demonstrate that the models can generate samples that maintain information originally learned from data, retain sample diversity, and show improvement in the desired metrics.
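The tunable mixture of the discriminator reward and the domain-specific objective can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the convex-combination form below is an assumption, and the names `mixed_reward`, `disc_score`, `objective_score`, and `lam` are hypothetical, chosen only for illustration.

```python
def mixed_reward(disc_score: float, objective_score: float, lam: float) -> float:
    """Blend a discriminator reward with a domain-specific objective reward.

    Assumption: the two rewards are combined as a convex combination
    weighted by `lam`; this form is not given in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # lam = 1 uses only the discriminator reward (purely adversarial);
    # lam = 0 uses only the domain-specific objective.
    return lam * disc_score + (1.0 - lam) * objective_score


# Hypothetical scores for one generated sequence, both scaled to [0, 1].
reward = mixed_reward(disc_score=0.2, objective_score=0.8, lam=0.5)
```

Under this assumption, sweeping `lam` between 0 and 1 trades off fidelity to the data distribution against improvement in the chosen metric.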
Feb-6-2018
- Genre:
- Research Report > New Finding (0.66)
- Industry:
- Leisure & Entertainment (1.00)
- Media > Music (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
- Technology: