NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

Paper Title

NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

Authors

Kalluri, Shareef Babu; Vijayasenan, Deepu; Ganapathy, Sriram; M, Ragesh Rajan; Krishnan, Prashant

Abstract

Many commercial and forensic applications of speech require extracting information about speaker characteristics, a task that falls under the broad category of speaker profiling. The characteristics needed for profiling include physical traits of the speaker, such as height, age, and gender, along with the speaker's native language. Many available datasets carry only partial information for speaker profiling. In this paper, we attempt to overcome this limitation by developing a new dataset that has speech data from five different Indian languages along with English. Metadata for speaker profiling applications, such as linguistic information, regional information, and physical characteristics of each speaker, is also collected. We call this dataset the NITK-IISc Multilingual Multi-accent Speaker Profiling (NISP) dataset. This paper describes the dataset, its potential applications, and baseline results for speaker profiling on it.
