Paper Title


Learning Deep Models from Synthetic Data for Extracting Dolphin Whistle Contours

Authors

Pu Li, Xiaobai Liu, K. J. Palmer, Erica Fleishman, Douglas Gillespie, Eva-Marie Nosal, Yu Shiu, Holger Klinck, Danielle Cholewiak, Tyler Helble, Marie A. Roch

Abstract


We present a learning-based method for extracting whistles of toothed whales (Odontoceti) in hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and decomposes each spectrogram into a set of time-frequency patches. A deep neural network learns archetypical patterns (e.g., crossings, frequency modulated sweeps) from the spectrogram patches and predicts time-frequency peaks that are associated with whistles. We also developed a comprehensive method to synthesize training samples from background environments and train the network with minimal human annotation effort. We applied the proposed learn-from-synthesis method to a subset of the public Detection, Classification, Localization, and Density Estimation (DCLDE) 2011 workshop data to extract whistle confidence maps, which we then processed with an existing contour extractor to produce whistle annotations. The F1-score of our best synthesis method was 0.158 greater than our baseline whistle extraction algorithm (~25% improvement) when applied to common dolphin (Delphinus spp.) and bottlenose dolphin (Tursiops truncatus) whistles.
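The abstract outlines a pipeline of audio → spectrogram → time-frequency patches → whistle confidence map → contour extraction. Below is a minimal sketch of that flow, assuming illustrative parameters (sample rate, FFT size, patch width) that are not taken from the paper; `toy_confidence_map` is a simple peak-over-background stand-in for the trained deep network, not the authors' model.

```python
# Minimal sketch: audio -> spectrogram -> fixed-width patches -> per-pixel
# whistle confidence map. All parameters below are illustrative assumptions.
import numpy as np
from scipy.signal import spectrogram

def audio_to_patches(signal, fs, nfft=2048, hop=1024, patch_w=64):
    """Compute a log-magnitude spectrogram and cut it into fixed-width
    time-frequency patches (assumed patch width of 64 frames)."""
    freqs, times, sxx = spectrogram(signal, fs=fs, nperseg=nfft,
                                    noverlap=nfft - hop)
    log_spec = np.log1p(sxx)  # compress dynamic range
    n_patches = log_spec.shape[1] // patch_w
    patches = [log_spec[:, i * patch_w:(i + 1) * patch_w]
               for i in range(n_patches)]
    return log_spec, patches

def toy_confidence_map(log_spec):
    """Stand-in for the trained network: flag time-frequency bins that
    exceed the per-frequency median background by a margin."""
    background = np.median(log_spec, axis=1, keepdims=True)
    return (log_spec - background > 1.0).astype(float)

if __name__ == "__main__":
    fs = 96_000  # assumed hydrophone sample rate
    t = np.arange(0, 2.0, 1.0 / fs)
    # Synthetic frequency-modulated "whistle" on top of noise, echoing the
    # paper's idea of synthesizing training data from background audio.
    whistle = np.sin(2 * np.pi * (8_000 * t + 2_000 * t ** 2))
    audio = whistle + 0.5 * np.random.randn(t.size)
    log_spec, patches = audio_to_patches(audio, fs)
    conf = toy_confidence_map(log_spec)
    print(f"{len(patches)} patches, confidence map shape {conf.shape}")
```

In the paper's actual pipeline, the confidence map produced by the network is passed to an existing contour extractor to produce whistle annotations; the thresholding step above merely illustrates where that component would plug in.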
