Paper title

A safe Hosmer-Lemeshow test

Authors

Alexander Henzi, Marius Puke, Timo Dimitriadis, Johanna Ziegel

Abstract

This article proposes an alternative to the Hosmer-Lemeshow (HL) test for evaluating the calibration of probability forecasts for binary events. The approach is based on e-values, a new tool for hypothesis testing. An e-value is a random variable with expected value less than or equal to one under a null hypothesis. Large e-values give evidence against the null hypothesis, and the multiplicative inverse of an e-value is a p-value. Our test uses online isotonic regression to estimate the calibration curve as a "betting strategy" against the null hypothesis. We show that the test has power against essentially all alternatives, which makes it theoretically superior to the HL test and at the same time resolves the well-known instability problem of the latter. A simulation study shows that a feasible version of the proposed eHL test can detect slight miscalibrations in practically relevant sample sizes, but trades its universal validity and power guarantees against a reduced empirical power compared to the HL test in a classical simulation setup. We illustrate our test on recalibrated predictions for credit card defaults during the Taiwan credit card crisis, where the classical HL test delivers equivocal results.
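
To make the e-value idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' eHL test: the betting strategy below uses a fixed, user-supplied alternative probability `q` rather than one estimated by online isotonic regression, and all function names are hypothetical. It only illustrates that a likelihood-ratio bet against a forecast `p` has expected value one when the forecast is calibrated, that such bets can be multiplied over time as long as each `q` is chosen before its outcome is seen, and that the inverse of the result can be read as a p-value.

```python
def likelihood_ratio_e_value(y, p, q):
    """E-value for one binary outcome y in {0, 1}.

    p is the forecast probability being tested (the null hypothesis),
    q is an alternative probability (an assumption of this sketch).
    Under the null, E[e] = p * (q / p) + (1 - p) * ((1 - q) / (1 - p)) = 1.
    """
    numerator = q if y == 1 else 1.0 - q
    denominator = p if y == 1 else 1.0 - p
    return numerator / denominator


def sequential_e_value(outcomes, forecasts, alternatives):
    """Multiply the one-step e-values along a sequence of forecasts.

    The product is again an e-value provided each alternative is chosen
    before the corresponding outcome is observed.
    """
    e = 1.0
    for y, p, q in zip(outcomes, forecasts, alternatives):
        e *= likelihood_ratio_e_value(y, p, q)
    return e


def e_to_p(e):
    """Convert an e-value to a p-value by inversion, capped at 1."""
    return min(1.0, 1.0 / e)


# Illustrative data only: forecasts of 0.5 for events that occur more often.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]
forecasts = [0.5] * len(outcomes)
alternatives = [0.75] * len(outcomes)

e = sequential_e_value(outcomes, forecasts, alternatives)
print("e-value:", e, "p-value:", e_to_p(e))
```

A large value of `e` counts as evidence that the forecasts are miscalibrated; for example, rejecting when `e` reaches 20 corresponds to a significance level of 0.05.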
