Paper Title
The Contribution of Lyrics and Acoustics to Collaborative Understanding of Mood
Paper Authors
Abstract
In this work, we study the association between song lyrics and mood through a data-driven analysis. Our data set consists of nearly one million songs, with song-mood associations derived from user playlists on the Spotify streaming platform. We take advantage of state-of-the-art transformer-based natural language processing models to learn the association between lyrics and mood. We find that a pretrained transformer-based language model in a zero-shot setting -- i.e., out of the box, with no further training on our data -- is powerful for capturing song-mood associations. Moreover, we illustrate that training on song-mood associations yields a highly accurate model that predicts these associations for unseen songs. Furthermore, by comparing the predictions of a model using lyrics with those of one using acoustic features, we observe that the relative importance of lyrics for mood prediction, compared with acoustics, depends on the specific mood. Finally, we verify whether the models capture the same information about lyrics and acoustics as humans do, through an annotation task in which we obtain human judgments of mood-song relevance based on lyrics and acoustics.
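The zero-shot setup described in the abstract can be sketched as scoring each candidate mood label by the similarity between an embedding of the lyrics and an embedding of the label, with no task-specific training. The sketch below is a minimal illustration only: it substitutes a toy bag-of-words embedding for the pretrained transformer encoder, and the lyrics and mood labels are invented, not from the paper's data set.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a pretrained transformer encoder:
    # a bag-of-words count vector (a real system would use
    # sentence embeddings from a pretrained language model).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot_moods(lyrics, moods):
    # Rank candidate mood labels by embedding similarity to the
    # lyrics -- "out of the box", with no training on labeled songs.
    lyr = embed(lyrics)
    return sorted(moods, key=lambda m: cosine(lyr, embed(m)), reverse=True)

lyrics = "dancing all night happy and free happy days"
print(zero_shot_moods(lyrics, ["happy dancing", "sad lonely", "angry"]))
# -> ['happy dancing', 'sad lonely', 'angry']
```

With a real transformer encoder, semantically related words (e.g. "joyful" and "happy") would also score highly even without lexical overlap, which is what makes the zero-shot setting effective for song-mood association.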