Paper Title

The Role of Time, Weather and Google Trends in Understanding and Predicting Web Survey Response

Authors

Qixiang Fang, Joep Burger, Ralph Meijers, Kees van Berkel

Abstract

In the literature on web survey methodology, significant efforts have been made to understand the role of time-invariant factors (e.g. gender, education and marital status) in (non-)response mechanisms. Time-invariant factors alone, however, cannot account for most of the variation in (non-)response, especially fluctuations in response rates over time. This observation motivates us to investigate their counterpart, time-varying factors, and the potential role they play in web survey (non-)response. Specifically, we study the effects of time, weather and societal trends (derived from Google Trends data) on the daily (non-)response patterns of the 2016 and 2017 Dutch Health Surveys. Using discrete-time survival analysis, we find, among other things, that weekends, holidays, pleasant weather, disease outbreaks and terrorism salience are associated with fewer responses. Furthermore, we show that these variables alone achieve satisfactory prediction accuracy for both daily and cumulative response rates when the trained model is applied to future unseen data. This approach has the further benefit of requiring only non-personal contextual information and thus raising no privacy issues. We discuss the implications of the study for survey research and data collection.
