StandardSim: A Synthetic Dataset for Retail Environments

Paper Title

StandardSim: A Synthetic Dataset For Retail Environments

Authors

Cristina Mata, Nick Locascio, Mohammed Azeem Sheikh, Kenny Kihara, Dan Fischetti

Abstract

Autonomous checkout systems rely on visual and sensory inputs to carry out fine-grained scene understanding in retail environments. Retail environments present unique challenges compared to typical indoor scenes owing to the vast number of densely packed, unique yet similar objects. The problem becomes even more difficult when only RGB input is available, especially for data-hungry tasks such as instance segmentation. To address the lack of datasets for retail, we present StandardSim, a large-scale photorealistic synthetic dataset featuring annotations for semantic segmentation, instance segmentation, depth estimation, and object detection. Our dataset provides multiple views per scene, enabling multi-view representation learning. Further, we introduce a novel task central to autonomous checkout called change detection, requiring pixel-level classification of takes, puts and shifts in objects over time. We benchmark widely-used models for segmentation and depth estimation on our dataset, show that our test set constitutes a difficult benchmark compared to current smaller-scale datasets and that our training set provides models with crucial information for autonomous checkout tasks.
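
The change detection task in the abstract is a pixel-level classification problem, so one plausible way to score a prediction is per-class intersection over union (IoU), a common segmentation metric. The sketch below is illustrative only: the label encoding (0 = no change, 1 = take, 2 = put, 3 = shift), the function names, and the toy masks are assumptions made for this example and are not taken from the StandardSim release, and the abstract does not state which metric the authors report.

```python
from typing import List, Optional

# Hypothetical label encoding for illustration; not the dataset's actual format.
NO_CHANGE, TAKE, PUT, SHIFT = 0, 1, 2, 3
CLASS_NAMES = ["no_change", "take", "put", "shift"]

Mask = List[List[int]]  # a 2D grid of per-pixel class labels


def per_class_iou(pred: Mask, target: Mask, num_classes: int = 4) -> List[Optional[float]]:
    """Return IoU per class; None for classes absent from both masks."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts once toward intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it belongs to the union of both classes.
                union[p] += 1
                union[t] += 1
    return [
        intersection[c] / union[c] if union[c] else None
        for c in range(num_classes)
    ]


def mean_iou(ious: List[Optional[float]]) -> float:
    """Average IoU over the classes that actually occur."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present) if present else 0.0


# Toy 3x4 change maps: ground truth versus a model prediction.
target = [
    [NO_CHANGE, NO_CHANGE, TAKE, TAKE],
    [NO_CHANGE, PUT, PUT, NO_CHANGE],
    [SHIFT, SHIFT, NO_CHANGE, NO_CHANGE],
]
pred = [
    [NO_CHANGE, NO_CHANGE, TAKE, NO_CHANGE],
    [NO_CHANGE, PUT, PUT, PUT],
    [SHIFT, NO_CHANGE, NO_CHANGE, NO_CHANGE],
]

ious = per_class_iou(pred, target)
for name, value in zip(CLASS_NAMES, ious):
    print(name, value)
print("mean IoU:", mean_iou(ious))
```

Reporting the score per class rather than as a single pixel accuracy keeps the rarer change classes from being swamped by the dominant no-change background.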
