Paper Title

Real-Time Value-Driven Data Augmentation in the Era of LSST

Authors

Niharika Sravan, Dan Milisavljevic, Jack M. Reynolds, Geoffrey Lentner, Mark Linvill

Abstract

The deluge of data from time-domain surveys is rendering traditional human-guided data collection and inference techniques impractical. We propose a novel approach for conducting data collection for science inference in the era of massive large-scale surveys that uses value-based metrics to autonomously strategize and co-ordinate follow-up in real-time. We demonstrate the underlying principles in the Recommender Engine For Intelligent Transient Tracking (REFITT) that ingests live alerts from surveys and value-added inputs from data brokers to predict the future behavior of transients and design optimal data augmentation strategies given a set of scientific objectives. The prototype presented in this paper is tested to work given simulated Rubin Observatory Legacy Survey of Space and Time (LSST) core-collapse supernova (CC SN) light-curves from the PLAsTiCC dataset. CC SNe were selected for the initial development phase as they are known to be difficult to classify, with the expectation that any learning techniques for them should be at least as effective for other transients. We demonstrate the behavior of REFITT on a random LSST night given ~32000 live CC SNe of interest. The system makes good predictions for the photometric behavior of the events and uses them to plan follow-up using a simple data-driven metric. We argue that machine-directed follow-up maximizes the scientific potential of surveys and follow-up resources by reducing downtime and bias in data collection.
