MicAugment: One-shot Microphone Style Transfer

Paper title

MicAugment: One-shot Microphone Style Transfer

Authors

Zalán Borsos, Yunpeng Li, Beat Gfeller, Marco Tagliasacchi

Abstract

A crucial aspect for the successful deployment of audio-based models "in-the-wild" is robustness to the transformations introduced by heterogeneous acquisition conditions. In this work, we propose a method to perform one-shot microphone style transfer. Given only a few seconds of audio recorded by a target device, MicAugment identifies the transformations associated with the input acquisition pipeline and uses the learned transformations to synthesize audio as if it were recorded under the same conditions as the target audio. We show that our method can successfully apply the style transfer to real audio and that it significantly increases model robustness when used as data augmentation in downstream tasks.

Paper title