Paper Title

Generating music with sentiment using Transformer-GANs

Authors

Pedro Neves, Jose Fornari, João Florindo

Abstract

The field of Automatic Music Generation has seen significant progress thanks to the advent of Deep Learning. However, most of these results have been produced by unconditional models, which lack the ability to interact with their users and do not allow them to guide the generative process in meaningful and practical ways. Moreover, synthesizing music that remains coherent across longer timescales while still capturing the local aspects that make it sound "realistic" or "human-like" is still challenging. This is due to the large computational requirements of working with long sequences of data, and also to limitations imposed by the training schemes that are often employed. In this paper, we propose a generative model of symbolic music conditioned on data retrieved from human sentiment. The model is a Transformer-GAN trained with labels that correspond to different configurations of the valence and arousal dimensions that quantitatively represent human affective states. We try to tackle both of the problems above by employing an efficient linear version of Attention and by using a Discriminator as a tool to improve both the overall quality of the generated music and its ability to follow the conditioning signals.
