Sound Event Triage: Detecting Sound Events Considering Priority of Classes

Paper Title

Sound Event Triage: Detecting Sound Events Considering Priority of Classes

Authors

Noriyuki Tonami, Keisuke Imoto

Abstract

We propose a new task for sound event detection (SED): sound event triage (SET). The goal of SET is to detect an arbitrary number of high-priority event classes while allowing misdetections of low-priority event classes, where a priority is given for each event class. Conventional SED methods that target a specific sound event class can give priority to only a single event class. Moreover, the level of priority is not adjustable, i.e., the conventional methods can take only the type of the target event class, such as a one-hot vector, as input. To control richer information about the target event flexibly, the proposed SET exploits not only the type of each target sound but also the extent to which each target sound is detected with priority. To implement the detection of events with priority, we propose class-weighted training, in which loss functions and the network are stochastically weighted by the priority parameter of each class. As this is the first paper on SET, we particularly introduce an implementation of single target SET, which is a subtask of SET. Results of the experiments using the URBAN-SED dataset show that the proposed method of single target SET outperforms the conventional SED method by 8.70, 6.66, and 6.09 percentage points for "air_conditioner," "car_horn," and "street_music," respectively, in terms of the intersection-based F-score. For the average score of classes, the proposed methods increase the intersection-based F-score by up to 3.37 percentage points compared with the conventional SED and other target-class-conditioned models.
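The abstract describes class-weighted training only at a high level, so the following is a minimal sketch of the loss-weighting half of the idea, not the authors' implementation. It assumes a frame-wise binary cross-entropy loss, a single target class, and a priority weight drawn uniformly at random for each training step. The function names, the uniform sampling, and the `max_priority` value are all hypothetical, and the weighting of the network itself is not covered.

```python
import math
import random


def class_weighted_bce(probs, labels, weights, eps=1e-7):
    """Binary cross-entropy summed over classes and averaged over frames.

    probs, labels: list of frames, each a list of per-class values.
    weights: one non-negative weight per class (assumed priority weights).
    """
    total = 0.0
    for frame_probs, frame_labels in zip(probs, labels):
        for p, y, w in zip(frame_probs, frame_labels, weights):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def sample_priority_weights(num_classes, target, max_priority=4.0, rng=random):
    """Assumed scheme: random weight for the target class, 1.0 for the rest."""
    weights = [1.0] * num_classes
    weights[target] = rng.uniform(1.0, max_priority)
    return weights


# Two frames, three classes; class 2 is the (hypothetical) target class.
probs = [[0.9, 0.2, 0.4], [0.3, 0.8, 0.6]]
labels = [[1, 0, 1], [0, 1, 1]]
uniform_loss = class_weighted_bce(probs, labels, [1.0, 1.0, 1.0])
weighted_loss = class_weighted_bce(probs, labels, [1.0, 1.0, 3.0])
step_weights = sample_priority_weights(num_classes=3, target=2)
```

With a larger weight on the target class, errors on that class contribute more to the loss than errors on the other classes, which is the sense in which misdetections of low-priority classes are tolerated.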
