Paper Title
SDW-ASL: A Dynamic System to Generate Large Scale Dataset for Continuous American Sign Language
Paper Authors
Paper Abstract
Despite tremendous progress in natural language processing using deep learning techniques in recent years, sign language production and comprehension have advanced very little. One critical barrier is the lack of large-scale datasets available to the public, owing to the prohibitive cost of generating labeled data. Efforts to provide public data for American Sign Language (ASL) comprehension have yielded two datasets, comprising more than a thousand video clips. These datasets are large enough to enable a meaningful start to deep learning research on sign languages but are far too small to lead to any solution that can be practically deployed. So far, there is still no suitable dataset for ASL production. We propose a system that can generate large-scale datasets for continuous ASL. It is suitable for general ASL processing and is particularly useful for ASL production. The continuous ASL dataset contains English-labeled human articulations in condensed body pose data formats. To better serve the research community, we are releasing the first version of our ASL dataset, which contains 30k sentences and 416k words, with a vocabulary of 18k words, totaling 104 hours. This is the largest continuous sign language dataset published to date in terms of video duration. We also describe a system that can evolve and expand the dataset to incorporate better data processing techniques and more content when available. It is our hope that releasing this ASL dataset and the sustainable dataset generation system to the public will propel better deep learning research in ASL natural language processing.