Paper title
Clustering-based Aggregations for Prediction in Event Streams
Paper authors
Paper abstract
Predicting the behaviour of shoppers provides valuable information for retailers, such as the expected spend of a shopper or the total turnover of a supermarket. The ability to make predictions on an individual level is useful, as it allows supermarkets to accurately perform targeted marketing. However, given the expected number of shoppers and their diverse behaviours, making accurate predictions on an individual level is difficult. This problem arises not only in shopper behaviour but also in various business processes, such as predicting when an invoice will be paid. In this paper we present CAPiES, a framework that focuses on the trade-off between predictive accuracy and the granularity of predictions in an online setting. By making predictions on a larger number of entities at a time, we improve the predictive accuracy, but at the potential cost of usefulness, since we can say less about the individual entities. CAPiES is developed in an online setting, where we continuously update the prediction model and make new predictions over time. We show the existence of the trade-off in an experimental evaluation in two real-world scenarios: a supermarket with over 160 000 shoppers and a paint factory with over 171 000 invoices.
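The trade-off described in the abstract can be illustrated with a small sketch. This is not the CAPiES framework: it uses synthetic data, a hypothetical predictor that knows each shopper's base spend but not the noise, and arbitrary fixed-size groups in place of clustering. The function name grouped_relative_error and all parameters are assumptions made for illustration only.

```python
import random
import statistics


def grouped_relative_error(actual, predicted, group_size):
    """Mean absolute relative error after summing entities into fixed-size groups."""
    errors = []
    for start in range(0, len(actual), group_size):
        a = sum(actual[start:start + group_size])
        p = sum(predicted[start:start + group_size])
        errors.append(abs(a - p) / a)
    return statistics.mean(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000                # hypothetical number of shoppers

# Assumption: each shopper has a stable base spend, and the observed
# spend is that base multiplied by independent noise.
base = [rng.uniform(20.0, 100.0) for _ in range(n)]
actual = [b * rng.uniform(0.5, 1.5) for b in base]

# Assumption: the predictor recovers the base spend but cannot see the noise.
predicted = base

# Larger groups: lower relative error, but less detail per shopper.
errors = {k: grouped_relative_error(actual, predicted, k) for k in (1, 10, 100)}
for k, err in errors.items():
    print(f"group size {k:>3}: mean relative error {err:.3f}")
```

Under these assumptions the independent noise partly cancels within a group, so the relative error shrinks as the group grows, while a prediction for a group of 100 says little about any single shopper in it.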