Title

A note on centering in subsample selection for linear regression

Author

Wang, HaiYing

Abstract

Centering is a commonly used technique in linear regression analysis. With centered data on both the responses and covariates, the ordinary least squares estimator of the slope parameter can be calculated from a model without the intercept. If a subsample is selected from centered full data, the subsample is typically un-centered. In this case, is it still appropriate to fit a model without the intercept? The answer is yes: we show that the least squares estimator of the slope parameter obtained from a model without the intercept is unbiased and has a smaller variance-covariance matrix in the Loewner order than that obtained from a model with the intercept. We further show that, for noninformative weighted subsampling with a weighted least squares estimator, using the full-data weighted means to relocate the subsample improves the estimation efficiency.
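The first claim in the abstract can be checked numerically. The following is a minimal Monte Carlo sketch, not the paper's own experiment: the sample sizes, true parameter values, Gaussian errors, and the covariate-only selection rule (keeping the rows with the largest first covariate) are all assumptions chosen for illustration. It holds the covariates and the selected rows fixed, redraws the responses many times, and compares the slope estimates from the no-intercept fit and the with-intercept fit on the same subsample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, reps = 500, 50, 5000          # full-data size, subsample size, replications
beta = np.array([1.0, -2.0])        # true slopes (illustrative values)
alpha, sigma = 3.0, 1.0             # true intercept and error s.d. (illustrative)

# Fixed covariates; only the responses are redrawn in each replication.
X = rng.normal(size=(n, 2))
Xc = X - X.mean(axis=0)             # covariates centered with full-data means

# Hypothetical covariate-only selection rule: rows with the largest x1.
# It never looks at the responses, so the selection is noninformative.
idx = np.argsort(X[:, 0])[-r:]
Xs = Xc[idx]                        # subsample; its own mean is not zero
Xs_int = np.column_stack([np.ones(r), Xs])

# Each column of Y is one simulated full-data response vector.
Y = alpha + (X @ beta)[:, None] + sigma * rng.normal(size=(n, reps))
Ys = (Y - Y.mean(axis=0))[idx]      # responses centered with full-data means

# Slope estimates from the subsample, one column per replication.
b_no_int = np.linalg.lstsq(Xs, Ys, rcond=None)[0]        # model without intercept
b_int = np.linalg.lstsq(Xs_int, Ys, rcond=None)[0][1:]   # model with intercept, slopes only

bias_no_int = b_no_int.mean(axis=1) - beta
var_no_int = b_no_int.var(axis=1, ddof=1)
var_int = b_int.var(axis=1, ddof=1)
print("bias (no intercept):", bias_no_int)
print("variances (no intercept):", var_no_int)
print("variances (with intercept):", var_int)
```

Under these assumptions the empirical bias of the no-intercept estimator should be close to zero and its variances should not exceed those of the with-intercept estimator. The selection rule deliberately pushes the subsample mean far from the full-data mean, which makes the gap easy to see; replacing `idx` with a uniform random draw should give the same ordering with a much smaller gap.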
