Abstract
Open-ended survey answers capture rich motivations but are costly to code by hand. We study respondents’ stated reasons for (probable) refusal to share smartphone-sensor data, using two closely related Dutch questionnaires fielded in 2017–2018: a LISS panel study and a Statistics Netherlands (SN) consent survey. The LISS panel responses were coded manually, whereas no manual codes are available for the SN consent survey. We transfer an 11-category motivation taxonomy from the LISS panel to the SN consent survey via a transparent NLP pipeline that combines rule-based keyword extraction with elastic-net logistic regression. The manually coded LISS responses serve as the training set and make it possible to evaluate the pipeline's classifications. Cross-validated AUCs on the LISS data are high for core categories such as Privacy, Safety, and Due to emotions. In both surveys, privacy is the dominant reason for refusal, followed by control and brief refusals that give no reason. Broken down by task, the consent survey shows that location elicits comparatively more control/surveillance mentions, while the house photo is most often flagged as effortful. Demographic contrasts are modest but suggestive (e.g., more effort mentions among ages 50–67; more control/surveillance mentions among younger groups). For this application, we assume that the LISS-trained classification model is transferable to the SN consent survey for practical use.
