Paper Title

Better Multi-class Probability Estimates for Small Data Sets

Authors

Tuomo Alasalmi, Jaakko Suutala, Heli Koskimäki, Juha Röning

Abstract

Many classification applications require accurate probability estimates in addition to good class separation, but classifiers are often designed with only the latter in mind. Calibration is the process of improving probability estimates by post-processing, but commonly used calibration algorithms work poorly on small data sets and assume the classification task to be binary. Both of these restrictions limit their real-world applicability. The previously introduced Data Generation and Grouping algorithm alleviates the problem posed by small data sets, and in this article we demonstrate that it can also be applied to multi-class problems, which addresses the other limitation. Our experiments show that the proposed approach decreases calibration error and that the additional computational cost is acceptable.
