Controlling High-Dimensional Data With Sparse Input

Iliescu, Dan Andrei, Mohan, Devang Savita Ram, Teh, Tian Huey, Hodari, Zack

Mar-14-2023–arXiv.org Artificial Intelligence

We address the problem of human-in-the-loop control for generating highly structured data. This task is challenging because existing generative models lack an efficient interface through which users can modify the output. Users must either manually explore a non-interpretable latent space or laboriously annotate the data with conditioning labels. To solve this, we introduce a novel framework in which an encoder maps a sparse, human-interpretable control space onto the latent space of a generative model. We apply this framework to the task of controlling prosody in text-to-speech synthesis. We propose a model, called Multiple-Instance CVAE (MICVAE), that is specifically designed to encode sparse prosodic features and output complete waveforms. We show empirically that MICVAE displays desirable qualities of a sparse human-in-the-loop control mechanism: efficiency, robustness, and faithfulness. With even a very small number of input values (~4), MICVAE enables users to improve the quality of the output significantly, in terms of listener preference (4:1).

artificial intelligence, machine learning, natural language

Country:
- North America
  - United States > Louisiana
    - Orleans Parish > New Orleans (0.04)
  - Canada
    - Quebec > Montreal (0.04)
    - Ontario > Toronto (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Denmark > North Jutland
    - Aalborg (0.04)
  - Czechia > South Moravian Region
    - Brno (0.04)
  - Austria
    - Vienna (0.04)
    - Styria > Graz (0.04)
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Speech (1.00)
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
