Paper Title
Semi-unsupervised Learning for Time Series Classification
Authors
Abstract
Time series are ubiquitous, yet inherently hard to analyze and, ultimately, to label or cluster. With the rise of the Internet of Things (IoT) and its smart devices, data is collected in large amounts at any given second. The collected data is rich in information: one can detect accidents in real time (e.g. with cars) or assess injury or sickness over a given time span (e.g. with health devices). Due to their chaotic nature and massive number of data points, time series are hard to label manually. Furthermore, new classes within the data can emerge over time (in contrast to, e.g., handwritten digits), which would require relabeling the data. In this paper, we present SuSL4TS, a deep generative Gaussian mixture model for semi-unsupervised learning, to classify time series data. Our approach reduces the manual labeling effort, since it can detect sparsely labeled classes (semi-supervised) and identify emerging classes hidden in the data (unsupervised). We demonstrate the efficacy of our approach on established time series classification datasets from different domains.