G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

Wang, Gary, Cubuk, Ekin D., Rosenberg, Andrew, Cheng, Shuyang, Weiss, Ron J., Ramabhadran, Bhuvana, Moreno, Pedro J., Le, Quoc V., Park, Daniel S.

Oct-24-2022–arXiv.org Artificial Intelligence

... become automated and more "end-to-end," the data augmentation policy (what augmentation functions to use, and how to apply them) remains hand-crafted. We present G(raph)-Augment, a technique to define the augmentation space as directed acyclic graphs (DAGs) and search over this space to optimize the augmentation policy itself. We show that given the same computational budget, policies produced by G-Augment are able to perform better than SpecAugment policies obtained by random search on fine-tuning tasks on CHiME-6 and AMI. G-Augment is also able to establish a new state-of-the-art ASR performance on the CHiME-6 evaluation set (30.7% WER). We further demonstrate that G-Augment policies show better transfer properties across ...

For example, in [16], the authors discovered that SpecAugment [11] did not compose well with multi-style training augmentation [4, 17], and found that they needed to ensemble the augmentations to benefit from both. In this work, we address this problem by a scheme we refer to as G(raph)-Augment, where a stochastic augmentation policy is parameterized by a directed acyclic graph (DAG) whose edges are labeled by sampling probabilities and augmentation parameters. By simultaneously searching for the graph structure and the parameters that label the graph, we are able to optimize not only the augmentation parameters of the individual augmentations, but how those augmentations are being applied. We utilize 17 ASR augmentations in our search space, details of which can be found in section 3.3.
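To make the idea of a DAG-parameterized stochastic policy more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes one plausible reading of the description: each edge carries a sampling probability, an augmentation name, and that augmentation's parameters, and a policy is applied by sampling a path from a source node to a sink. The `Edge` class, the `sample_and_apply` function, the `TOY_POLICY` graph, and the three toy augmentations (`gain`, `time_mask`, `identity`) are all hypothetical stand-ins; the excerpt does not specify the 17 ASR augmentations or the search procedure, and neither is modeled here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Edge:
    """One labeled edge: destination node, sampling weight, augmentation, parameters."""
    dst: int
    prob: float
    aug: str
    params: dict = field(default_factory=dict)


# Toy augmentations acting on a list of floats (hypothetical, for illustration only).
def gain(x, rng, scale=1.0):
    return [v * scale for v in x]


def time_mask(x, rng, width=1):
    """Zero out `width` consecutive positions at a random offset."""
    if width <= 0 or not x:
        return list(x)
    width = min(width, len(x))
    start = rng.randrange(len(x) - width + 1)
    return [0.0 if start <= i < start + width else v for i, v in enumerate(x)]


def identity(x, rng):
    return list(x)


AUGMENTATIONS = {"gain": gain, "time_mask": time_mask, "identity": identity}


def sample_and_apply(graph, x, rng, source=0):
    """Walk from `source` until a node with no outgoing edges is reached.

    At each node, one outgoing edge is drawn in proportion to its `prob`,
    and that edge's augmentation is applied with that edge's parameters.
    Returns the augmented signal and the list of augmentation names used.
    """
    node, path = source, []
    while graph.get(node):
        edges = graph[node]
        edge = rng.choices(edges, weights=[e.prob for e in edges], k=1)[0]
        x = AUGMENTATIONS[edge.aug](x, rng, **edge.params)
        path.append(edge.aug)
        node = edge.dst
    return x, path


# Assumed example graph: edges only point to higher-numbered nodes, so it is acyclic.
TOY_POLICY = {
    0: [Edge(1, 0.5, "gain", {"scale": 2.0}), Edge(2, 0.5, "identity")],
    1: [Edge(2, 1.0, "time_mask", {"width": 2})],
    2: [],
}

if __name__ == "__main__":
    rng = random.Random(0)
    augmented, path = sample_and_apply(TOY_POLICY, [1.0, 2.0, 3.0, 4.0], rng)
    print(path, augmented)
```

In this sketch, changing an edge's `params` adjusts an individual augmentation, while adding, removing, or re-weighting edges changes how augmentations are composed. A search procedure would vary both, which is the distinction the paragraph above draws between optimizing augmentation parameters and optimizing how the augmentations are applied.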
