A Proposal for Foley Sound Synthesis Challenge

Choi, Keunwoo; Oh, Sangshin; Kang, Minsung; McFee, Brian

Jul-21-2022–arXiv.org Artificial Intelligence

[...] during post-production to enhance its perceived acoustic properties, e.g., by simulating the sounds of footsteps, ambient environmental sounds, or visible objects on the screen. While foley is traditionally produced by foley artists, there is increasing interest in automatic or machine-assisted techniques building upon recent advances in sound synthesis and generative models. To foster more participation in this growing research area, we propose a challenge for automatic foley synthesis. Through case studies on successful previous challenges in audio and machine learning, we set the goals of the proposed challenge: rigorous, unified, and efficient evaluation of different foley synthesis systems, with an overarching goal of drawing active participation from the research community. We outline the details and design considerations of a foley sound synthesis challenge, including task definition, dataset requirements, and evaluation criteria.

We review recent machine learning challenges in audio, speech, and music research in Section 2 and existing works and datasets in Section 3. In Section 4, we provide a proposal for foley sound synthesis challenge that includes problem definition, datasets, and evaluation metrics. We conclude the paper in Section 5.

2. CASE STUDY: RESEARCH CHALLENGES

In this section, we review five existing research challenges: Blizzard Challenge, CHiME, DCASE, Music Demixing challenge, and AI Song Contest. The former three are relatively mature while the latter two started after 2020. All of them started along with the increasing popularity of the research problems and have contributed to the continued growth by defining the tasks, providing common [...]



