Jointist: Joint Learning for Multi-instrument Transcription and Its Applications

Paper Title

Jointist: Joint Learning for Multi-instrument Transcription and Its Applications

Authors

Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Amy Hung, Ju-Chiang Wang, Dorien Herremans

Abstract

In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of the instrument recognition module that conditions the other modules: the transcription module that outputs instrument-specific piano rolls, and the source separation module that utilizes instrument information and transcription results. The instrument conditioning is designed for an explicit multi-instrument functionality while the connection between the transcription and source separation modules is for better transcription performance. Our challenging problem formulation makes the model highly useful in the real world given that modern popular music typically consists of multiple instruments. However, its novelty necessitates a new perspective on how to evaluate such a model. During the experiment, we assess the model from various aspects, providing a new evaluation perspective for multi-instrument transcription. We also argue that transcription models can be utilized as a preprocessing module for other music analysis tasks. In the experiment on several downstream tasks, the symbolic representation provided by our transcription model turned out to be helpful to spectrograms in solving downbeat detection, chord recognition, and key estimation.
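
The data flow described in the abstract (instrument recognition conditions transcription, and both instrument labels and piano rolls feed source separation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, type aliases, and toy stand-in modules below are all hypothetical, and the real modules are learned neural networks rather than the hand-written rules used here.

```python
from typing import Callable, Dict, List, Tuple

Audio = List[float]          # mono samples (toy stand-in for a waveform)
PianoRoll = List[List[int]]  # frames x pitch bins, 1 = note active


def jointist_style_pipeline(
    audio: Audio,
    recognize: Callable[[Audio], List[str]],
    transcribe: Callable[[Audio, str], PianoRoll],
    separate: Callable[[Audio, str, PianoRoll], Audio],
) -> Tuple[List[str], Dict[str, PianoRoll], Dict[str, Audio]]:
    """Wire three modules together in the order the abstract describes."""
    # 1. Recognition decides which instruments to condition on.
    instruments = recognize(audio)
    # 2. Transcription runs once per instrument and yields one piano roll each.
    rolls = {name: transcribe(audio, name) for name in instruments}
    # 3. Separation uses both the instrument label and its piano roll.
    stems = {name: separate(audio, name, rolls[name]) for name in instruments}
    return instruments, rolls, stems


# --- Hypothetical toy modules, present only so the sketch runs ---
N_PITCHES = 88  # assumed number of pitch bins
FRAME = 4       # assumed samples per frame


def toy_recognize(audio: Audio) -> List[str]:
    return ["piano", "bass"]


def toy_transcribe(audio: Audio, instrument: str) -> PianoRoll:
    n_frames = len(audio) // FRAME
    pitch_bin = 60 if instrument == "piano" else 20  # arbitrary fixed bins
    roll = [[0] * N_PITCHES for _ in range(n_frames)]
    for frame in roll:
        frame[pitch_bin] = 1
    return roll


def toy_separate(audio: Audio, instrument: str, roll: PianoRoll) -> Audio:
    if not roll:
        return [0.0] * len(audio)
    stem = []
    for i, sample in enumerate(audio):
        frame = min(i // FRAME, len(roll) - 1)
        # Keep a sample only where the piano roll marks an active note.
        stem.append(sample if any(roll[frame]) else 0.0)
    return stem


audio = [0.1] * 8
instruments, rolls, stems = jointist_style_pipeline(
    audio, toy_recognize, toy_transcribe, toy_separate
)
```

The point of the sketch is the dependency order: transcription cannot run before the instrument list is known, and separation consumes the transcription output, which is the connection the abstract credits with improving transcription performance.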
