Paper Title
Multitrack Music Transformer
Paper Authors
Paper Abstract
Existing approaches for generating multitrack music with transformer models have been limited in the number of instruments they support, the length of the music segments they can generate, and their inference speed. This is partly due to the memory requirements of the lengthy input sequences necessitated by existing representations. In this work, we propose a new multitrack music representation that allows a diverse set of instruments while keeping the sequence length short. Our proposed Multitrack Music Transformer (MMT) achieves performance comparable to state-of-the-art systems, landing between two recently proposed models in a subjective listening test, while achieving substantial speedups and memory reductions over both, making the method attractive for real-time improvisation and near-real-time creative applications. Further, we propose a new measure for analyzing musical self-attention and show that the trained model attends more to notes that form a consonant interval with the current note and to notes that are 4N beats away from the current step.
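To make the sequence-length argument concrete, the following is a minimal sketch (not the paper's actual code) of the general idea behind a compact multitrack representation: each note becomes a single fixed-size event tuple instead of several separate tokens, so an N-note piece yields N events rather than a token stream several times longer. The field names and the 5-tokens-per-note flat baseline are assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    # Illustrative fields for a compact multitrack encoding; the exact
    # fields and vocabulary used by MMT are not specified here.
    beat: int        # which beat the note starts on
    position: int    # sub-beat position within the beat
    pitch: int       # MIDI pitch number (0-127)
    duration: int    # note length in time steps
    instrument: int  # program number identifying the track


def encode_compact(notes: List[dict]) -> List[Event]:
    """One event per note: sequence length equals the note count."""
    return [Event(n["beat"], n["position"], n["pitch"],
                  n["duration"], n["instrument"]) for n in notes]


def encode_flat(notes: List[dict]) -> List[int]:
    """A REMI-like flat token stream: several tokens per note, so the
    sequence is several times longer for the same music."""
    tokens: List[int] = []
    for n in notes:
        tokens += [n["beat"], n["position"], n["pitch"],
                   n["duration"], n["instrument"]]
    return tokens


notes = [
    {"beat": 0, "position": 0, "pitch": 60, "duration": 4, "instrument": 0},
    {"beat": 0, "position": 0, "pitch": 64, "duration": 4, "instrument": 0},
    {"beat": 1, "position": 0, "pitch": 38, "duration": 2, "instrument": 118},
]
assert len(encode_compact(notes)) == 3   # one event per note
assert len(encode_flat(notes)) == 15     # five tokens per note
```

Since memory and compute for transformer self-attention scale with sequence length, keeping one event per note rather than a multi-token stream is what enables the speedups and memory reductions the abstract reports.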