A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation

Paper Title

A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation

Authors

Mingshuo Ding, Yinghao Ma

Abstract

Despite recent achievements in deep-learning-based automatic music generation, few approaches have been proposed to evaluate whether a single-track music excerpt was composed by an automaton or by Homo sapiens. To tackle this problem, we apply a masked language model based on ALBERT to composer classification. The aim is to obtain a model that can indicate the probability of a MIDI clip being composed, conditioned on the auto-generation hypothesis, and that is trained only on AI-composed single-track MIDI. In this paper, the number of parameters is reduced, and two data augmentation methods are proposed, along with a refined loss function to prevent overfitting. Experimental results show that our model ranks $3^{rd}$ among the $7$ teams in the CSMT (2020) data challenge. Furthermore, this method could be extended to other music information retrieval tasks that are based on small datasets.
