Paper Title

Convergence of Chao Unseen Species Estimator

Authors

Nived Rajaraman, Prafulla Chandra, Andrew Thangaraj, Ananda Theertha Suresh

Abstract

Support size estimation and the related problem of unseen species estimation have wide applications in ecology and database analysis. Perhaps the most widely used support size estimator is the Chao estimator. Despite its widespread use, little is known about its theoretical properties. We analyze the Chao estimator and show that its worst-case mean squared error (MSE) is smaller than the MSE of the plug-in estimator by a factor of $\mathcal{O}((k/n)^4)$, where $k$ is the maximum support size and $n$ is the number of samples. Our main technical contribution is a new method to analyze rational estimators for discrete distribution properties, which may be of independent interest.
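For readers unfamiliar with the two estimators being compared, the following is a minimal sketch, not the authors' code. It assumes the commonly cited bias-corrected Chao1 form, $S_{\text{obs}} + f_1(f_1 - 1) / (2(f_2 + 1))$, where $f_1$ and $f_2$ are the numbers of symbols seen exactly once and exactly twice. The exact variant analyzed in the paper may differ. The function names `plug_in_support_size` and `chao1_support_size` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def plug_in_support_size(samples):
    """Plug-in estimate: the number of distinct symbols observed."""
    return len(set(samples))


def chao1_support_size(samples):
    """Bias-corrected Chao1 estimate (an assumed variant, see the lead-in).

    f1 = number of symbols seen exactly once,
    f2 = number of symbols seen exactly twice.
    """
    counts = Counter(samples)                 # symbol -> number of occurrences
    freq_of_freq = Counter(counts.values())   # occurrence count -> number of symbols
    f1 = freq_of_freq.get(1, 0)
    f2 = freq_of_freq.get(2, 0)
    observed = len(counts)
    # The (f2 + 1) denominator keeps the estimate finite when f2 == 0.
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    samples = list("aabbbcdde")
    print(plug_in_support_size(samples))
    print(chao1_support_size(samples))
```

Both estimators are rational functions of the frequency counts. The plug-in estimate ignores symbols that were never sampled, while the Chao form adds a correction driven by the rarely seen symbols.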
