Abstract
We investigate the reliability of data from the Wage Indicator (WI), the largest online survey on earnings and working conditions. Comparing WI to nationally representative data sources for 17 countries reveals that WI participants are unlikely to be a representative draw from the respective populations. Previous literature has proposed weights based on inverse propensity scores, but this procedure was shown to leave the reweighted WI samples different from the nationally representative benchmark data. We propose a novel procedure, building on the covariate balancing propensity score, which reweights the WI data so that they replicate the structure of nationally representative samples on observable characteristics. While rebalancing ensures the match between WI and the representative benchmark data sources on these characteristics, we show that the wage schedules remain different for a large group of countries. Using the example of a Mincerian wage regression, we find that in more than a third of the cases, our proposed reweighting ensures that estimates obtained on WI data are not biased relative to nationally representative data. However, in the remaining 60 percent of the 95 analyzed data sets, systematic differences in the estimated coefficients of the Mincerian wage regression between WI and nationally representative data persist even after reweighting. We provide some intuition about the reasons behind these biases. Notably, objective factors such as access to the Internet or affluence appear to matter, but self-selection (on unobservable characteristics) among WI participants appears to constitute an important source of bias.
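The balancing idea behind such reweighting can be illustrated with a minimal sketch. It is not the authors' procedure: it uses exponential tilting (a simple calibration technique related to covariate balancing) to find weights under which the covariate means of a non-representative sample equal those of a benchmark. The function name `balancing_weights`, the synthetic data, and the target means are all hypothetical, and the sketch assumes the covariates are not collinear and the targets lie within the range of the sample.

```python
import numpy as np

def balancing_weights(X_sample, target_means, n_iter=50, tol=1e-10):
    """Weights w_i proportional to exp(x_i' beta), normalised to sum to one,
    chosen so that the weighted covariate means equal target_means.
    Hypothetical helper for illustration only."""
    X = np.asarray(X_sample, dtype=float)
    m = np.asarray(target_means, dtype=float)
    Xc = X - m  # centre on the target: balance means sum(w * Xc) = 0

    def objective(b):
        # Convex dual of the problem: log-sum-exp of the tilted scores.
        z = Xc @ b
        return z.max() + np.log(np.exp(z - z.max()).sum())

    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = Xc @ beta
        w = np.exp(z - z.max())
        w /= w.sum()
        grad = w @ Xc  # weighted mean minus target
        if np.max(np.abs(grad)) < tol:
            break
        # Hessian is the weighted covariance of the centred covariates.
        hess = (Xc * w[:, None]).T @ Xc - np.outer(grad, grad)
        step = np.linalg.solve(hess, grad)
        # Damped Newton update with simple backtracking.
        t, f0 = 1.0, objective(beta)
        while t > 1e-8 and objective(beta - t * step) > f0 - 1e-4 * t * (grad @ step):
            t /= 2
        beta = beta - t * step

    z = Xc @ beta
    w = np.exp(z - z.max())
    return w / w.sum()

# Synthetic illustration: an online sample whose covariate means differ
# from the (assumed) benchmark means.
rng = np.random.default_rng(0)
X_online = rng.normal(size=(500, 2))
benchmark_means = np.array([0.3, -0.2])
weights = balancing_weights(X_online, benchmark_means)
print(np.round(weights @ X_online, 4))  # weighted means of the online sample
```

Balancing of this kind matches the samples only on the observed covariates included in `X_sample`; it cannot correct for selection on unobservable characteristics.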
