Paper Title
On the Change of Decision Boundaries and Loss in Learning with Concept Drift
Paper Authors
Paper Abstract
The notion of concept drift refers to the phenomenon that the distribution generating the observed data changes over time. If drift is present, machine learning models may become inaccurate and need adjustment. Many techniques for learning with drift rely on the interleaved test-train error (ITTE) as a quantity that approximates the model generalization error and triggers drift detection and model updates. In this work, we investigate to what extent this procedure is mathematically justified. More precisely, we relate a change of the ITTE to the presence of real drift, i.e., a changed posterior, and to a change of the training result under the assumption of optimality. We support our theoretical findings with empirical evidence for several learning algorithms, models, and datasets.
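For readers unfamiliar with the ITTE, the following is a minimal Python sketch of the usual test-then-train loop: each sample is first used to test the current model and only afterwards to update it. It is an illustration under stated assumptions, not the procedure analysed in the paper. The class MajorityClassModel, the function interleaved_test_train_error, and the toy stream are hypothetical names and data introduced here, and the cumulative (unwindowed) average is one of several possible choices.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


class MajorityClassModel:
    """Hypothetical toy online learner: predicts the most frequent label seen so far."""

    def __init__(self) -> None:
        self.counts: Dict[Hashable, int] = {}

    def predict(self, x: object) -> Optional[Hashable]:
        # Before any training there is nothing to predict.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x: object, y: Hashable) -> None:
        self.counts[y] = self.counts.get(y, 0) + 1


def interleaved_test_train_error(
    model: MajorityClassModel,
    stream: Iterable[Tuple[object, Hashable]],
) -> List[float]:
    """Return the cumulative ITTE after each sample of the stream."""
    mistakes = 0
    errors: List[float] = []
    for t, (x, y) in enumerate(stream, start=1):
        # Test first: score the model on a sample it has not seen yet.
        mistakes += int(model.predict(x) != y)
        # Then train: update the model with the same sample.
        model.learn(x, y)
        errors.append(mistakes / t)
    return errors


# Assumed toy stream: the label flips from 0 to 1 halfway through,
# which mimics an abrupt change of the posterior (real drift).
toy_stream = [(None, 0)] * 5 + [(None, 1)] * 5
itte = interleaved_test_train_error(MajorityClassModel(), toy_stream)
print(itte[4], itte[-1])
```

On this toy stream the cumulative ITTE rises after the label flip, which is the kind of signal a drift detector would monitor. A sliding-window average would react faster than the cumulative mean used here.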