
Abstract

The labor force surveys (LFS) of all EU countries underwent a substantial redesign in January 2021. To ensure coherent labor market time series for the main indicators in the Norwegian LFS, we model the impact of the redesign. We use a state-space model that takes explicit account of the rotating pattern of the LFS. We also include auxiliary variables related to employment and unemployment that are highly correlated with the LFS variables we consider. The results of a parallel run are also included in the model. The purpose of the article is to quantify the structural breaks due to the redesign. This article makes two contributions to the literature on the effects of redesign in surveys with a rotating panel, such as the LFS. First, we suggest a symmetric specification of the process of the wave-specific effects. Second, we account for substantial fluctuations in the labor force estimates due to the COVID-19 pandemic in the time around the LFS redesign by applying time-varying hyperparameters for both the LFS variables and the auxiliary variables. The specification with time-varying hyperparameters shows a better fit compared to the specification with time-invariant hyperparameters.

Keywords

state-space models, auxiliary information, labor market domains, level shifts, COVID-19

1. Introduction

Time series from LFS that describe the situation in the labor market are valuable for many users. These series provide important information for fiscal and monetary policy and wage bargaining in Norway, either directly or indirectly through the National Accounts. Therefore, they must be comparable over time, as it is otherwise difficult to interpret them. From time to time, it is necessary to redesign the surveys, for instance, in connection with international regulations. Such changes require correcting time series to make them comparable over time. The purpose of this article is to quantify the structural breaks in the Norwegian LFS due to a redesign.

From the beginning of 2021, the LFS in 34 European countries underwent a substantial redesign in accordance with the new regulation for integrated European social statistics (IESS). The redesign took place in all member states of the European Union, the three EFTA countries (Iceland, Norway, and Switzerland), as well as in four candidate countries (Montenegro, North Macedonia, Serbia, and Turkey). The purpose of the redesign was to standardize definitions and thereby increase comparability between countries. The redesign implied a modified questionnaire, where question sequences, formulations, and answer alternatives have changed. In addition to the changes that followed from IESS regulation, further changes were made in the Norwegian LFS. The sampling unit was changed from family to person, and indirect interviewing was ended.

How to best quantify and implement corrections related to redesign depends on the information at hand: for instance, whether one has parallel surveys or auxiliary variables at one’s disposal. One approach is to have parallel data collection where the data under the old and new survey designs are collected side by side for some period. Ideally, this should be arranged as a randomized experiment. With adequate sample sizes, estimates of structural breaks can be made by contrasting design-based sample estimates under the different designs. These estimates can be made right after the period of data collection, to get timely break estimates. This is an approach with low risk, but expensive, and requires that the survey organization manages to have two sizable surveys running at the same time.
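The contrast between two parallel design-based estimates can be illustrated with a minimal sketch. It assumes that the old-design and new-design samples are independent, so that their variances add; the function name and the numbers are hypothetical and serve only as illustration.

```python
import math


def break_estimate(y_new, se_new, y_old, se_old):
    """Contrast two design-based estimates from a parallel run.

    Returns the estimated break (new design minus old design) and its
    standard error. Assumption: the two samples are independent, so the
    variance of the difference is the sum of the two variances.
    """
    diff = y_new - y_old
    se_diff = math.sqrt(se_new ** 2 + se_old ** 2)
    return diff, se_diff


# Hypothetical figures in thousands of persons, for illustration only.
diff, se_diff = break_estimate(y_new=105.0, se_new=4.0, y_old=100.0, se_old=3.0)
print(diff, se_diff)
```

With small parallel samples the standard error of the contrast becomes large relative to the break itself, which is one motivation for combining a parallel run with a time series model.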

Time series models can also be applied in connection with repeated surveys. Repeated surveys may be either non-overlapping or overlapping. In the former case, the same observational unit is only observed once and in the latter case, it is observed more than once. If one has overlapping data at hand, one may account for the autocorrelation in survey errors stemming from the fact that an observational unit occurs more than once. The main reason for using time series models is related to small sample sizes, that is, one is not willing to trust the pure estimates based on the surveys alone but prefers to draw on extra information. Such additional information is provided by the history of the target series, data from other domains and information from auxiliary variables that are correlated with the variables of concern.

For general articles about time series modeling of repeated surveys, reference is made to Scott and Smith (1974) and Scott et al. (1977) (see also the survey by Pfeffermann 2022). These authors were the first to suggest using time series models not only for survey errors, but also for population values. Pfeffermann (1991) and Pfeffermann et al. (1998) made an important extension by decomposing the population value into trend, seasonal, and irregular components alongside modeling the autocorrelation in survey errors due to a design based on rotating panel data. A common approach is to use structural time series models utilizing the state-space form; see, for instance, Harvey (1989).

We have a narrower aim in this article since we do not intend to use time series models to estimate population values but only to obtain estimates of the structural break brought about by the redesign of the LFS. To this end, it is important to have a model that captures different properties of the time series and which also incorporates information from auxiliary time series and parallel runs for the LFS data. Intervention effects which are aimed at capturing the effect of redesigns are added to the structural time series model.

The Norwegian LFS follows a rotating design, whereby each respondent participates (in the absence of nonresponse) eight times over a two-year period, making it possible to divide the sample into eight waves. The modeling strategy follows a disaggregated approach in that the modeling is conducted for different domains. We consider four domains by distinguishing between young men and women (15–24 years), and “older” persons (25–74 years) of both sexes. The aggregated estimates are derived from this disaggregated information.

Our analysis is carried out within a structural time series framework using state-space models on monthly wave-divided data from January 2006 to October 2021. The article looks at persons aged 15 to 74, since they are the age group for which we have data both before and after the LFS redesign. We follow the tradition introduced by Pfeffermann (1991) and further developed by, for example, Van den Brakel and Krieg (2009, 2015). We model time series for both employed and unemployed persons. The modeled time series are characterized by different latent components. The time series for the eight waves are assumed to share a common trend component, a common seasonal component, and a common irregular component. Beyond the common components, two other components are added, that is, wave-specific effects and survey errors. Finally, a structural break component is included to capture the possible break due to the 2021 redesign of the LFS.
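As a reading aid, the decomposition just described can be written as a single measurement equation. This is a sketch in assumed notation: the symbols $L_{t}$, $S_{t}$, $I_{t}$, $\lambda_{t}^{j}$, $\beta^{j}$, $D_{t}$, and $e_{t}^{j}$ are introduced here for illustration, and letting the break coefficient vary by wave is likewise an assumption; the specification actually estimated is the one given in Section 3 and the Appendix.

$$
y_{t}^{j} = L_{t} + S_{t} + I_{t} + \lambda_{t}^{j} + \beta^{j} D_{t} + e_{t}^{j}, \qquad j = 1, \ldots, 8,
$$

where $y_{t}^{j}$ is the direct estimate for wave $j$ in month $t$; $L_{t}$, $S_{t}$, and $I_{t}$ are the common trend, seasonal, and irregular components; $\lambda_{t}^{j}$ is the wave-specific effect; $e_{t}^{j}$ is the survey error; and $D_{t}$ is an intervention dummy equal to 1 from 2021M1 onward and 0 before.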

Besides the wave information, we utilize auxiliary information from registers. For the LFS estimates of employment, we use registered employees as auxiliary information; and for the LFS estimates of unemployment, we derive comparable unemployment estimates from the register. The auxiliary variable is assumed to have its own trend, seasonal, and irregular component. This auxiliary information is essential for identifying the effects of the redesign of the LFS since the redesign does not influence the register data and since the time series from the LFS and the register are closely interlinked. We use an auxiliary variable in conjunction with each estimation. Thus, for each domain, we consider modeling a vector with nine elements, where eight are from the LFS, and one is from the register.

The importance of having access to auxiliary variables has long been recognized; for example, Tiller (1992) suggested improving the population estimates by applying auxiliary variables. The use of auxiliary variables was developed further by Harvey and Chung (2000), and the approach is used in, for example, Van den Brakel and Michiels (2021). Harvey and Chung (2000) find a common trend between LFS figures and claimant counts in the United Kingdom. Furthermore, results in Van den Brakel and Krieg (2015, 2016) indicate a common trend for the LFS and a register variable for the Netherlands. Schiavoni et al. (2021) apply a common trend for Google Trends search variables in a model with LFS.

In our context, the role of auxiliary variables is deemed very important since they are correlated with the LFS variables but not impacted by the redesign of the LFS. We allow the two trend components, related respectively to the LFS and the register data, to be interlinked. This assumption concerning the trend is essential because this is the only channel through which we allow the auxiliary variables to influence the estimated hyperparameters and extracted components that one ends up with for the LFS time series. The correlation needs to be sizeable, and the later empirical analysis shows that the two trend components are closely tied together for all the considered domains for both employment and unemployment. Therefore, the LFS population estimates and the register estimates approximately share a common trend.

It is important to account for wave-specific effects (also referred to in the literature as rotating group bias), as shown already by Stephan et al. (1954), Hansen et al. (1955), and Bailar (1975). Krueger et al. (2017) show that these biases can evolve much over time. Therefore, it seems suitable to let the wave-specific components be time-varying as in Van den Brakel and Krieg (2009). This allows the rotation group bias to change from one period to the next in an unsystematic way. It is customary to treat the different waves in an asymmetric way, either by assuming that one of them is not encumbered by systemic bias or by assuming that all of them are biased and treating one of them as a residual wave.

A novel contribution of our analysis is the treatment of the wave-specific effects. In contrast to Statistics Netherlands, which measures the wave-specific effects relative to the first wave (see Van den Brakel and Krieg 2009, 2015), we follow Elliott and Zong (2019) and impose that the wave-specific effects sum to zero. However, in contrast to Elliott and Zong (2019), we do this in a symmetric way in which we do not treat one wave as residual, thereby placing less weight on it. The way one models the wave-specific component may have consequences for the estimates of the structural break and potentially to a larger degree for the smoothed estimates of other latent components.

An important question is how to model the effects of the redesign. Van den Brakel et al. (2008) and Van den Brakel and Roels (2010) apply intervention analysis for estimating structural breaks within the framework of structural time series models. Van den Brakel and Krieg (2015) include auxiliary variables when estimating the structural break. They also incorporate information from a parallel survey to get a prior estimate of the structural break. Bollineni-Balabay et al. (2016) consider survey redesigns that lead to a structural break in both the level and variance. We account for the structural break in the levels but assume, implicitly, that the redesign does not impact variances. In our state-space model, we apply information from a small parallel survey carried out in the last quarter of 2020. This parallel survey helps us quantify the structural break for the first wave (i.e., for those participating in the survey for the first time), see also Van den Brakel and Krieg (2015).

The presence of COVID-19 induced us to allow for time-varying hyperparameters. It led to large fluctuations in the labor market. During the corona pandemic, unemployment in many countries increased sharply within a short period of time. This feature makes it problematic to assume that the population variables follow a time-invariant process. Van den Brakel et al. (2022) suggest allowing a more flexible trend in such periods and using it in preparing population estimates during the corona period for the Netherlands. Gonçalves et al. (2022) use the same type of flexible trend to produce population estimates for Brazil. A similar approach was also applied by Bollineni-Balabay et al. (2016), who considered breaks in both level and variance due to survey redesigns. When estimating the structural break due to the redesign, we pursue the same approach to accommodate the large fluctuations in the labor market. In contrast to both Bollineni-Balabay et al. (2016) and Van den Brakel et al. (2022), we include an auxiliary variable. We use time-varying hyperparameters for both the LFS variables and the auxiliary variable in a model for quantifying the structural break due to survey redesign. These time-varying hyperparameters are specified such that they allow for a more flexible trend through the COVID-19 pandemic. We consider this model augmentation as the second contribution of our article.

The remainder of this article is organized in the following way: Section 2 describes the redesign of the Norwegian LFS and presents the data used in the analysis. It describes how the monthly wave series are constructed. The section also presents the redesign of the survey in 2021 and provides information about the register data used. Finally, we describe a parallel survey carried out in the last quarter of 2020. Section 3 presents the time-series model we use to estimate the structural break due to the redesign. We comment on issues related to the state-space model used for estimation. This section also covers how we handle the redesign of the survey and how we take account of the extensive labor market fluctuations during the COVID-19 pandemic. In Section 4, we report our empirical results. Here, we also compare our empirical results with those from a model specification that does not account for higher fluctuations in the labor market during the COVID-19 pandemic. Section 5 provides some conclusions. In the Appendix (Section 6), we present our estimation of the autocorrelation parameters of the survey errors and provide a detailed specification of our state-space model. The Supplemental Documentation contains additional information.

2. About the Data

Statistics Norway has carried out LFS since 1972. Over the years, the survey has been subject to several changes. In this article, we will use data from January 2006 (2006M1). Therefore, the description here applies to the LFS from 2006. The last observation available when the structural breaks were estimated was from 2021M10.

Subsection 2.1 presents the Norwegian LFS from 2006 (and until 2020), while Subsection 2.2 presents the changes in the LFS from 2021. The auxiliary variables are presented in Subsection 2.3, while Subsection 2.4 discusses parallel data collection in the LFS in 2020Q4. Tables 1 and 2 show descriptive statistics for both the LFS variables and the register variables as well as correlations between them, and these tables will be referred to throughout the section.

Table 1.

Descriptive Statistics of the Time Series From LFS and Register. Employed Persons.

	All domains jointly		Males 15–24 years		Males 25–74 years		Females 15–24 years		Females 25–74 years
	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020
$\mathrm{mean}(y_{t}) / 10^{6}$	2.570	2.735	0.168	0.168	1.188	1.277	0.164	0.166	1.049	1.124
$\mathrm{mean}(y_{t}^{1}) / 10^{6}$	2.546	2.670	0.163	0.169	1.184	1.254	0.157	0.163	1.042	1.085
$\mathrm{mean}(y_{t}^{2}) / 10^{6}$	2.571	2.724	0.167	0.173	1.189	1.272	0.165	0.171	1.050	1.108
$\mathrm{mean}(y_{t}^{3}) / 10^{6}$	2.573	2.733	0.168	0.172	1.189	1.263	0.167	0.171	1.050	1.127
$\mathrm{mean}(y_{t}^{4}) / 10^{6}$	2.575	2.731	0.169	0.163	1.190	1.280	0.165	0.166	1.051	1.121
$\mathrm{mean}(y_{t}^{5}) / 10^{6}$	2.579	2.736	0.170	0.167	1.192	1.274	0.165	0.163	1.052	1.131
$\mathrm{mean}(y_{t}^{6}) / 10^{6}$	2.575	2.750	0.171	0.162	1.189	1.287	0.166	0.165	1.050	1.137
$\mathrm{mean}(y_{t}^{7}) / 10^{6}$	2.568	2.770	0.172	0.174	1.185	1.281	0.165	0.167	1.047	1.147
$\mathrm{mean}(y_{t}^{8}) / 10^{6}$	2.583	2.753	0.170	0.168	1.193	1.286	0.164	0.162	1.056	1.136
$\mathrm{mean}(x_{t}) / 10^{6}$	2.346	2.537	0.154	0.163	1.047	1.153	0.160	0.162	0.983	1.058
$\mathrm{var}(y_{t} - y_{t - 12}) / 10^{9}$	1.873	3.557	0.079	0.111	0.409	0.661	0.095	0.165	0.274	0.371
$\mathrm{var}(y_{t}^{1} - y_{t - 12}^{1}) / 10^{9}$	9.745	11.720	0.844	1.020	3.145	5.875	0.779	0.829	3.267	5.416
$\mathrm{var}(y_{t}^{2} - y_{t - 12}^{2}) / 10^{9}$	10.718	13.147	0.792	1.166	3.399	4.172	0.709	0.920	3.640	5.249
$\mathrm{var}(y_{t}^{3} - y_{t - 12}^{3}) / 10^{9}$	9.671	7.682	0.823	0.639	2.702	3.722	0.847	0.798	3.427	3.897
$\mathrm{var}(y_{t}^{4} - y_{t - 12}^{4}) / 10^{9}$	9.755	12.831	0.784	0.742	2.786	5.343	0.837	0.856	3.348	2.914
$\mathrm{var}(y_{t}^{5} - y_{t - 12}^{5}) / 10^{9}$	9.121	20.019	0.943	1.127	2.970	5.428	0.694	1.093	3.033	2.172
$\mathrm{var}(y_{t}^{6} - y_{t - 12}^{6}) / 10^{9}$	9.898	18.650	0.992	1.910	3.045	3.467	0.936	0.712	2.863	3.877
$\mathrm{var}(y_{t}^{7} - y_{t - 12}^{7}) / 10^{9}$	8.017	24.922	0.714	1.348	2.837	4.546	0.719	1.138	3.065	6.120
$\mathrm{var}(y_{t}^{8} - y_{t - 12}^{8}) / 10^{9}$	8.773	8.495	0.777	1.898	2.915	1.571	0.756	0.861	3.158	2.288
$\mathrm{var}(x_{t} - x_{t - 12}) / 10^{9}$	1.611	3.117	0.022	0.085	0.389	0.378	0.015	0.105	0.202	0.320
$\mathrm{corr}(y_{t} - y_{t - 12}, x_{t} - x_{t - 12})$	0.844	0.930	0.575	0.578	0.826	0.927	0.421	0.720	0.806	0.867

Note. $x_{t}$ is the number of employees according to register.

Table 2.

Descriptive Statistics of the Time Series From LFS and Register. Unemployed Persons.

	All domains jointly		Males 15–24 years		Males 25–74 years		Females 15–24 years		Females 25–74 years
	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020	Until 2019	From 2020
$\mathrm{mean}(y_{t}) / 10^{6}$	0.097	0.133	0.020	0.025	0.036	0.049	0.015	0.022	0.027	0.037
$\mathrm{mean}(y_{t}^{1}) / 10^{6}$	0.112	0.152	0.022	0.026	0.041	0.060	0.016	0.024	0.032	0.043
$\mathrm{mean}(y_{t}^{2}) / 10^{6}$	0.102	0.132	0.020	0.024	0.038	0.047	0.016	0.021	0.028	0.040
$\mathrm{mean}(y_{t}^{3}) / 10^{6}$	0.095	0.146	0.020	0.032	0.036	0.049	0.014	0.023	0.025	0.042
$\mathrm{mean}(y_{t}^{4}) / 10^{6}$	0.095	0.127	0.020	0.024	0.035	0.047	0.015	0.018	0.025	0.037
$\mathrm{mean}(y_{t}^{5}) / 10^{6}$	0.094	0.132	0.020	0.025	0.034	0.045	0.015	0.023	0.025	0.038
$\mathrm{mean}(y_{t}^{6}) / 10^{6}$	0.090	0.122	0.019	0.023	0.033	0.049	0.014	0.020	0.025	0.030
$\mathrm{mean}(y_{t}^{7}) / 10^{6}$	0.093	0.125	0.018	0.023	0.035	0.047	0.015	0.023	0.025	0.032
$\mathrm{mean}(y_{t}^{8}) / 10^{6}$	0.097	0.131	0.018	0.022	0.037	0.050	0.015	0.022	0.027	0.037
$\mathrm{mean}(x_{t}^{A}) / 10^{6}$	0.067	0.120	0.006	0.009	0.033	0.059	0.004	0.007	0.024	0.046
$\mathrm{mean}(x_{t}^{B}) / 10^{6}$	0.066	0.087	0.006	0.007	0.031	0.042	0.004	0.005	0.024	0.033
$\mathrm{mean}(x_{t}^{C}) / 10^{6}$	0.066	0.087	0.006	0.007	0.031	0.042	0.004	0.005	0.024	0.033
$\mathrm{var}(y_{t} - y_{t - 12}) / 10^{9}$	0.281	1.159	0.028	0.028	0.090	0.307	0.022	0.042	0.044	0.149
$\mathrm{var}(y_{t}^{1} - y_{t - 12}^{1}) / 10^{9}$	1.368	2.373	0.205	0.198	0.589	0.983	0.208	0.224	0.492	0.683
$\mathrm{var}(y_{t}^{2} - y_{t - 12}^{2}) / 10^{9}$	1.682	2.113	0.228	0.201	0.677	0.495	0.220	0.272	0.366	1.018
$\mathrm{var}(y_{t}^{3} - y_{t - 12}^{3}) / 10^{9}$	1.399	3.587	0.239	0.213	0.526	0.708	0.161	0.285	0.261	0.763
$\mathrm{var}(y_{t}^{4} - y_{t - 12}^{4}) / 10^{9}$	1.409	2.347	0.227	0.194	0.508	0.690	0.162	0.192	0.272	0.404
$\mathrm{var}(y_{t}^{5} - y_{t - 12}^{5}) / 10^{9}$	1.180	2.745	0.215	0.190	0.458	0.899	0.135	0.253	0.330	0.451
$\mathrm{var}(y_{t}^{6} - y_{t - 12}^{6}) / 10^{9}$	1.209	2.074	0.164	0.187	0.455	1.171	0.145	0.198	0.269	0.174
$\mathrm{var}(y_{t}^{7} - y_{t - 12}^{7}) / 10^{9}$	1.268	2.390	0.163	0.281	0.529	0.665	0.156	0.230	0.261	0.341
$\mathrm{var}(y_{t}^{8} - y_{t - 12}^{8}) / 10^{9}$	1.411	2.474	0.174	0.139	0.512	0.805	0.186	0.380	0.352	0.638
$\mathrm{var}(x_{t}^{A} - x_{t - 12}^{A}) / 10^{9}$	0.117	9.810	0.002	0.059	0.042	2.190	0.000	0.077	0.009	1.304
$\mathrm{var}(x_{t}^{B} - x_{t - 12}^{B}) / 10^{9}$	0.095	0.978	0.002	0.009	0.031	0.210	0.000	0.009	0.008	0.127
$\mathrm{var}(x_{t}^{C} - x_{t - 12}^{C}) / 10^{9}$	0.096	0.882	0.002	0.008	0.031	0.191	0.000	0.007	0.009	0.115
$\mathrm{corr}(y_{t} - y_{t - 12}, x_{t}^{A} - x_{t - 12}^{A})$	0.582	0.291	0.278	0.015	0.570	0.328	0.223	−0.116	0.333	0.355
$\mathrm{corr}(y_{t} - y_{t - 12}, x_{t}^{B} - x_{t - 12}^{B})$	0.577	0.720	0.273	0.145	0.568	0.722	0.220	0.089	0.341	0.734
$\mathrm{corr}(y_{t} - y_{t - 12}, x_{t}^{C} - x_{t - 12}^{C})$	0.562	0.804	0.254	0.226	0.558	0.773	0.213	0.115	0.340	0.830

Note. $x_{t}^{A}$ is the register series for unemployed persons from the Norwegian Labour and Welfare Administration (NAV), measured at the end of period t (typically the last Monday in the month); $x_{t}^{B}$ is the register series for unemployed persons from NAV adjusted for temporary layoffs less than ninety days registered at NAV, that is, $x_{t}^{B} = x_{t}^{A} - x_{t}^{\mathrm{layoffs}}$, where $x_{t}^{\mathrm{layoffs}}$ is the number of temporary layoffs less than ninety days according to NAV; $x_{t}^{C}$ is the average of $x_{t}^{B}$ near the beginning and end of the month, that is, $x_{t}^{C} = (x_{t - 1}^{B} + x_{t}^{B}) / 2$.

2.1. The Norwegian LFS

LFS measures key labor market indicators in the population, such as employment and unemployment. In our observation period, data collection has been carried out by means of telephone interviews only.

The Norwegian LFS has a rotating panel design. Since 1996, participants have been requested to respond every three months for a total of eight consecutive quarters. In each quarter, 1/8 of the sample leaves the survey after finishing the last wave, and an equivalent number of new interviewees are included for the first time. First-time interviewees constitute wave 1, those interviewed for the second time wave 2, and so on. Those interviewed for the last time are thus those in wave 8. The sample size in the survey is around 24,000 persons per quarter.
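The rotation pattern can be made concrete with a small sketch. It assumes that quarters are indexed by consecutive integers and that a cohort enters in a given quarter and then responds in eight consecutive quarters; the function names are hypothetical.

```python
N_WAVES = 8  # each participant responds in eight consecutive quarters


def wave_in_quarter(entry_quarter, quarter):
    """Return the wave number (1..8) of a cohort in a given quarter.

    Quarters are consecutive integers (an indexing assumption made here).
    Returns None if the cohort is not in the sample in that quarter.
    """
    k = quarter - entry_quarter
    return k + 1 if 0 <= k < N_WAVES else None


def cohorts_in_sample(quarter):
    """Map each wave number to the entry quarter of the cohort in that wave."""
    return {j: quarter - (j - 1) for j in range(1, N_WAVES + 1)}


# A cohort entering in quarter 10 is in wave 1 then, and in wave 8 in quarter 17.
print(wave_in_quarter(10, 10), wave_in_quarter(10, 17), wave_in_quarter(10, 18))
print(cohorts_in_sample(20))
```

In any quarter, eight cohorts are in the sample at once, one per wave, which is what makes it possible to split each estimate into eight wave-specific series.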

The nonresponse rate in the Norwegian LFS varied from 14 to 21% in the years 2016 to 2020, see Eurostat (2022), and the response rates are almost the same for all waves. Eurostat (2022, Table 4.5) reports nonresponse rates of the member states of the European Union, three EFTA countries (including Norway) and four candidate countries. The rates are not comparable as the magnitude of nonresponse is based on household units for most countries. For Norway, like Denmark, Estonia, Luxembourg, Finland, Sweden, Iceland, and Switzerland, the figures are for nonresponse at an individual level. Of these countries, Norway has the lowest nonresponse rate, while Switzerland has the second-lowest nonresponse rate at around 20%. The majority of other countries that calculate the nonresponse rate in the same way as Norway have nonresponse rates of 30 to 50%.

Before 2021, the sampling unit was the core family. The core family consists of married couples and registered same-sex couples with their children, and single parents with children. Other units were considered to be single-person families. In this period, the survey used a stratified, one-stage cluster design. Individuals were clustered into family units, and the sample was stratified using geographical areas. The interviewees were the persons aged 15 to 74 in these families, and each of them was interviewed.

The responses from the LFS participants are assigned weights based on how representative they are for the total population. These weights are assigned to the individual person (not the sampled family) and used to estimate the LFS variables. The estimation procedure for the Norwegian LFS is a one-step multiple-model calibration based on monthly LFS and register data. The method uses register data for employment status, age, sex, NUTS2 region, immigration background, education level, family size, and marital status. The method is described further in Oguz-Alper (2018); see also Nguyen and Zhang (2020).

Let $w_{i}$ be the calibrated weight for person i and let $z_{i}$ be an indicator taking the value 1 if person i has a particular labor market status, for example, is unemployed, and 0, otherwise. The direct estimate for the number of persons in a domain having this labor market status, for example being unemployed, is then given as

$$
y = \sum_{i \in s} w_{i} z_{i} \tag{1}
$$

where we have omitted subscripts for time and domain for all variables to simplify the notation, and s represents the set of persons that have responded to the survey.

We apply the same weights as for the overall estimates to generate the wave-specific estimates for employment and unemployment used in this analysis. Let $\delta_{i}^{j}$ be an indicator, where $\delta_{i}^{j} = 1$ if person i is in wave j and $\delta_{i}^{j} = 0$ otherwise. Then the direct estimate for the number of, say, unemployed persons in a domain based on the respondents in wave j only, is given by

$$
y^{j} = \frac{\sum_{i \in s} w_{i}}{\sum_{i \in s} w_{i} \delta_{i}^{j}} \sum_{i \in s} w_{i} z_{i} \delta_{i}^{j} \tag{2}
$$
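Equations (1) and (2) can be illustrated with a minimal sketch. The function names and the toy weights below are hypothetical; the code assumes that the calibrated weights, the status indicators, and the wave membership of the respondents in one domain and one month are already available as plain lists.

```python
def direct_estimate(w, z):
    """Equation (1): weighted count of respondents with the status of interest."""
    return sum(wi * zi for wi, zi in zip(w, z))


def wave_estimate(w, z, wave, j):
    """Equation (2): estimate based on the respondents in wave j only.

    The weighted count within wave j is scaled up by the ratio of the total
    weight to the weight carried by wave j.
    """
    delta = [1 if wv == j else 0 for wv in wave]
    total_weight = sum(w)
    wave_weight = sum(wi * d for wi, d in zip(w, delta))
    if wave_weight == 0:
        raise ValueError(f"no respondents in wave {j}")
    wave_count = sum(wi * zi * d for wi, zi, d in zip(w, z, delta))
    return total_weight / wave_weight * wave_count


# Toy data: four respondents, two waves (illustrative values only).
w = [100, 200, 100, 200]   # calibrated weights
z = [1, 0, 1, 1]           # 1 if the person has the status, e.g. unemployed
wave = [1, 1, 2, 2]        # wave membership

print(direct_estimate(w, z))         # 400
print(wave_estimate(w, z, wave, 1))  # 200.0
print(wave_estimate(w, z, wave, 2))  # 600.0
```

In this toy example the two waves carry equal weight, so the overall estimate of 400 is the simple average of the wave-specific estimates 200 and 600.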

The first line in Table 1 reports the mean of the number of employed persons according to the LFS given by Equation (1) for two subperiods, that is, for the period 2006M1-2019M12 and the remaining sample period 2020M1-2021M10, as we suspect the variances to be significantly higher in the second subperiod due to the presence of the COVID-19 pandemic. We also provide estimates for four domains. These domains are based on two age groups for both sexes. We distinguish between young persons aged 15 to 24 years and persons aged 25 to 74 years, which is an age classification used by both Eurostat and Statistics Norway. Most of the employed individuals of both sexes are in the “older” age group.

Similarly, the first line in Table 2 reports the mean number of unemployed persons given by Equation (1) for the same time periods and domains as in Table 1. According to the LFS estimates, most of the unemployed are also in the “older” age group. However, the unemployed are more evenly distributed among the groups than the employed. Thus, the unemployment rate, which is not reported in the table, is lower for the “older” age group.

Tables 1 and 2 also report the mean value of the wave-specific estimates according to Equation (2). The wave-specific deviation is especially pronounced for wave 1 for all domains, with lower employment and higher unemployment than the average.

In the lower parts of Tables 1 and 2, we report the empirical variance of the twelve-month growth in employment and unemployment according to the LFS for each wave and for the mean of the waves. The variance for a specific wave is substantially larger than the variance of the mean of the waves. As noted for the means above, the variances of the register variables are less than those of the LFS variables.
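The statistics in the lower parts of the tables are based on twelve-month differences. A minimal sketch of how such quantities can be computed is given below; it assumes monthly series stored as lists and uses the sample variance (divisor n − 1), which is an assumption since the divisor is not stated here. The function names and the synthetic series are hypothetical.

```python
from statistics import mean, variance


def seasonal_diff(series, lag=12):
    """Twelve-month differences y_t - y_{t-12} of a monthly series."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def corr(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Synthetic two-year monthly series, for illustration only.
y = [t ** 2 for t in range(24)]   # stand-in for an LFS series
x = [2 * v + 5 for v in y]        # stand-in for a register series

dy = seasonal_diff(y)
dx = seasonal_diff(x)
print(variance(dy), variance(dx), corr(dy, dx))
```

Differencing at lag 12 removes a stable seasonal pattern and level differences between the series, so the reported correlations reflect co-movement in year-on-year changes rather than in levels.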

2.2. The Most Important Changes in the 2021 Redesign of the Norwegian LFS

At the beginning of 2021, some changes were made in the Norwegian LFS. The main reason for the restructuring was the new EU regulation. The changes are intended to improve the quality of statistics and to increase comparability across countries and across domains in social statistics. Therefore, a similar restructuring of the LFS has taken place in all EU and associated countries. The sampling design was also changed in the Norwegian LFS, even though this was not a requirement from Eurostat. From 2021, the sampling unit was changed from core family to individual persons, and the population is stratified by combinations of region, age-group, and register-based employment status; see Jentoft (2022) for details. The direct estimates in Equations (1) and (2) still apply, where the weights are based on basically the same estimation procedure as before 2021; see Oguz-Alper (2023) for details.

The redesign also means that the target population was changed from covering all registered residents aged 15 to 74 to registered residents aged 15 to 89 in private households. This means that more age groups are included in the survey, but also that some persons are excluded from the target population as they do not live in private households. The most important examples of the latter are persons enrolled in compulsory military service and persons registered as residents in institutions. Until the beginning of 2021, persons in the same family could answer for other family members. Due to the change of the sampling unit to individual person, Statistics Norway has stopped using such proxy interviewing. This change may have led to higher nonresponse, especially from younger persons, but this should at least partly be compensated for by weighting as the weights are calibrated on, among other variables, detailed age groups and register-based employment status. See Zhang et al. (2013) for a discussion of proxy interviewing in the Norwegian LFS and reduction of nonresponse bias through weighting when good auxiliary register variables are available.

In the new questionnaire, question sequences, formulations, and response options have changed due to modernization of the language, increased international coordination, and adaptation to self-reporting as a future data collection method.

The alternative to an immediate introduction of a new questionnaire in January 2021 would have been a gradual introduction. Such a gradual introduction of the new questionnaire was done in the Netherlands, see Van den Brakel (2022). The advantage of a gradual phasing in of the new questionnaire is that one can get better estimates of the structural break, since one can compare changes in waves where the questionnaire is changed with waves where the questionnaire is not changed. But this means that both new and old LFS questionnaires are used during a transition period, and systems must be in place that can handle a double set of questionnaires. Statistics Norway therefore chose to change the questionnaires for all waves at the same time.

Before 2021, individuals who had been temporarily laid off for more than ninety days were automatically considered unemployed in the Norwegian LFS without being asked about active job search or availability. From 2021 on, persons temporarily laid off for more than ninety days are asked the usual questions about job search and availability in the LFS and can thus be classified as outside the labor force. This change in the questionnaire, combined with the fact that the Norwegian labor market at the same time was facing a situation with many temporary layoffs in connection with the COVID-19 pandemic, could reduce the number of unemployed when the new LFS design replaces the old one.

2.3. Register Data, Harmonization and Pre-Adjustment for Earlier Structural Breaks

In the time series model for employed persons according to the LFS we utilize a time series for the number of registered employees in the domain. Similarly, the time series model for LFS unemployment in a domain utilizes an auxiliary register time series for that domain from the unemployed registered at the employment office (registered unemployment). The auxiliary register variable in time series models needs to be comparable over time and should not include structural breaks, at least not at the same time as the 2021 redesign.

With respect to the LFS employment model, for the period before 2015, we use register information from the Register of Employers and Employees. In January 2015, the Register of Employers and Employees was replaced by the new a-scheme register for monthly reporting of employee and payroll information to the Norwegian Labour and Welfare Administration (NAV), the Norwegian Tax Administration and Statistics Norway. Since the auxiliary variable is based on two different sources of information in the start and the end of the sample, brought about by the transition from the Register of Employers and Employees to the new a-scheme register, it has been corrected for changed level and seasonal patterns (see also Part B of our Supplemental Documentation).

From Table 1 we see that there is a high correlation between the development in LFS employment and registered employment. This correlation has been particularly high since 2020, with a correlation coefficient of 93% for the full sample. The correlation coefficients for the domains are somewhat lower but still exceed 80% for the “older” domains for both males and females. For young males, the correlation coefficient is about 60%, and for young females, about 70%.

In the model for unemployment, we use estimates for persons registered by NAV as unemployed. In contrast to the LFS estimates, in the register from NAV, temporary layoffs are regarded as unemployed from the first day. Due to this different treatment of temporary layoffs, there is a large discrepancy in the observed relationship between LFS unemployed and the official registered unemployed estimates for the first couple of months of the COVID-19 pandemic in Norway, starting in March 2020. Therefore, “layoff-harmonized” registered unemployed estimates have been constructed by excluding temporary layoffs from the official NAV figures in the first three months. This harmonization brings the definition more into line with the definition of LFS unemployment because the LFS treats temporarily laid-off persons as employed but temporarily absent for the first ninety days. It is designed to bring about a higher correlation between the growth in the register and LFS variables.

The official NAV unemployment estimates indicate the number of registered unemployed close to the end of the month. For our auxiliary register variable to be more representative of the monthly average of unemployed according to the LFS, we use the average of the auxiliary register variables observed close to the end of the month in question and the end of the previous month. This averaging of our pre-adjusted harmonized register unemployment variable is vital in months with large changes in unemployment, such as for the start of the initial shutdown period of the COVID-19 pandemic in Norway in March 2020.
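The harmonization and averaging steps described above can be sketched as follows. This is a minimal illustration, not the pre-adjustment actually used: all figures are hypothetical, the names `official`, `layoffs`, and `two_month_average` are ours, and we assume for simplicity that every layoff in the example is within its first ninety days.

```python
import numpy as np

# Hypothetical end-of-month register counts (persons); not actual NAV figures.
official = np.array([65_000.0, 66_000.0, 300_000.0, 280_000.0, 200_000.0, 150_000.0])
layoffs = np.array([1_000.0, 1_200.0, 230_000.0, 200_000.0, 120_000.0, 70_000.0])

# Layoff harmonization (assumption: all layoffs here are within their first
# ninety days, so all of them are removed from the official count).
harmonized = official - layoffs


def two_month_average(x):
    """Average each end-of-month value with the previous end-of-month value."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, np.nan)  # the first month has no predecessor
    out[1:] = 0.5 * (x[1:] + x[:-1])
    return out


averaged = two_month_average(harmonized)
```

Averaging two end-of-month observations approximates the level in the middle of the month, which is closer to what a monthly LFS average measures when unemployment changes sharply within the month.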

Table 2 reveals the advantages of our adjustments of the registered unemployment series. When observations from 2020M1 till 2021M10 are considered for all domains together, the twelve-month growth in official NAV unemployment series shows a correlation of 29.1% with the growth in the corresponding LFS series. This correlation coefficient increases to 72.0% when we adjust for layoffs. When this adjusted unemployment according to register is measured as a two-month average, the correlation coefficient increases even further, to 80.4%. We see the same pattern for all domains. Due to our adjustments, the correlation coefficient for “older” males increases from 32.8 to 77.3%. The correlation for “older” females increases from 35.5 to 83.0%. For young males and females, the correlation between the register variables and the LFS variables is appreciably smaller. However, our adjustments increase the correlation for these domains, too.

2.4. Information From Parallel Data Collection in 2020Q4

The results of a parallel data collection may help in a time series model to produce more precise estimates of the effect due to redesigning a survey. In the last quarter of 2020, a sample of 2,626 persons was interviewed using the new questionnaire. The persons in this extra sample were only interviewed once. The results of these interviews can be compared with the results from wave 1 of the ordinary LFS interviews, which used the old questionnaire. This gives an estimate of the structural break for wave 1 together with an estimate of its variance.

The extra sample is too small for the effects of the 2021 redesign to be estimated precisely. However, the information can still be combined with a time series model to model the effects of the 2021 LFS redesign. This approach is discussed in Van den Brakel et al. (2020). Because the sample is small, we cannot use the usual calibration model. Therefore, a simplified version of the calibration model is used for deriving weights. A more detailed description of the parallel survey and this calibration model is given in Part G of the Supplemental Documentation.

3. Time Series Model for Estimating Possible Overall Structural Breaks Due to the 2021 LFS-Redesign

In Subsection 3.1, we outline the basic model for the Norwegian LFS. Subsection 3.2 presents our first contribution, which is the symmetric treatment of wave-specific effects. In Subsection 3.3, the model is extended to include a structural break and an auxiliary variable. The article’s second contribution is presented in Subsection 3.4, where we allow for a time-varying hyperparameter for the trends for both the LFS variables and the auxiliary variables in order to account for the substantial fluctuations in the labor force during the pandemic when estimating the structural break.

3.1. The Basic State-Space Model of the Norwegian LFS

In this section, we reasonably assume that all eight waves follow the same trend, have the same seasonal pattern and irregularities, and have an autocorrelated survey error component because of the rotating design. Pfeffermann (1991) derives a model for such a repeated survey.

We define $y_{t}^{i}$ , where $i = 1, 2, \dots, 8$ , as the unemployment estimate (or the employment estimate) based on the observations in wave i of the LFS survey. Furthermore, let $Y_{t} = {(y_{t}^{1}, y_{t}^{2}, \dots, y_{t}^{8})}^{'}$ be the vector of the estimates of all eight waves. The model we use as a starting point is

Y_{t} = 1_{8} θ_{t} + λ_{t} + e_{t},

(3)

where $1_{8}$ is a column vector of 8 ones, $θ_{t}$ is an estimate of the “true” LFS unemployment (or employment), the vector $λ_{t} = {(λ_{t}^{1}, λ_{t}^{2}, \dots, λ_{t}^{8})}^{'}$ represents the time-varying wave-specific effects, and $e_{t} = {(e_{t}^{1}, e_{t}^{2}, \dots, e_{t}^{8})}^{'}$ is the vector of wave-specific survey errors. Furthermore, the “true” LFS estimate is decomposed as

θ_{t} = L_{t} + S_{t} + I_{t}

(4)

where $L_{t}$ is the trend, $S_{t}$ the seasonal, and $I_{t}$ the irregular component. Below, we describe the processes for these three components and the wave-specific survey errors. The process for the wave-specific effects is presented in the next section.

The trend is generally assumed to follow a local level model, a local linear trend model, or a smooth trend model; see for example, Harvey (1989) and Durbin and Koopman (2012). We follow Van den Brakel and Krieg (2009) and apply the smooth trend model

L_{t} = L_{t - 1} + R_{t - 1}, R_{t} = R_{t - 1} + w_{t - 1}, w_{t} ~ N (0, σ_{R}^{2}),

(5)

where $σ_{R}^{2}$ is the hyperparameter of the slope of the trend. A high value of this hyperparameter implies a flexible trend.

The seasonal component, $S_{t}$ , is often specified as a deterministic seasonal model, a dummy seasonal model, or a trigonometric seasonal model; see among others Harvey (1989, 41–43), Durbin and Koopman (2012), and Hindrayanto et al. (2013). With monthly data, the trigonometric seasonal model is given as

\begin{array}{ll} S_{t} = \sum_{j = 1}^{6} γ_{j, t}, & \\ γ_{j, t} = γ_{j, t - 1} \cos (π j / 6) + γ_{j, t - 1}^{*} \sin (π j / 6) + ω_{j, t}, & ω_{j, t} \sim N (0, σ_{ω}^{2}), \\ γ_{j, t}^{*} = γ_{j, t - 1}^{*} \cos (π j / 6) - γ_{j, t - 1} \sin (π j / 6) + ω_{j, t}^{*}, & ω_{j, t}^{*} \sim N (0, σ_{ω}^{2}), \\ j = 1, 2, \dots, 6. & \end{array}

(6)

The first frequency of π/6, that is, the fundamental frequency, corresponds to a period of twelve months, whereas the five other frequencies are harmonics. We note that this process depends on only one hyperparameter, as the variance $σ_{ω}^{2}$ is assumed to be common to all disturbance terms. This is a restriction commonly used for these hyperparameters; see for example, Harvey (1989, 43). If this hyperparameter is small, the seasonal pattern does not change much from one year to the next.
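To illustrate the recursion in Equation (6), the sketch below simulates the trigonometric seasonal component. It is only an illustration: the function name, the starting values, and the seed are our own assumptions. With $σ_{ω}^{2} = 0$ the pattern repeats exactly every twelve months and sums to zero over a year.

```python
import numpy as np


def simulate_trig_seasonal(gamma0, gamma_star0, n_periods, sigma_omega=0.0, seed=0):
    """Simulate S_t from the trigonometric seasonal model with monthly data."""
    rng = np.random.default_rng(seed)
    freqs = np.pi * np.arange(1, 7) / 6.0  # fundamental frequency and harmonics
    c, s = np.cos(freqs), np.sin(freqs)
    g = np.asarray(gamma0, dtype=float).copy()
    gs = np.asarray(gamma_star0, dtype=float).copy()
    S = np.empty(n_periods)
    for t in range(n_periods):
        S[t] = g.sum()  # S_t is the sum of the six gamma_{j,t}
        omega = rng.normal(0.0, sigma_omega, size=6)
        omega_star = rng.normal(0.0, sigma_omega, size=6)
        # Both updates use the values from period t - 1 (tuple assignment).
        g, gs = g * c + gs * s + omega, gs * c - g * s + omega_star
    return S


# Hypothetical starting values for the twelve seasonal states.
gamma0 = [3.0, -1.0, 0.5, 0.2, -0.4, 0.1]
gamma_star0 = [1.0, 2.0, -0.3, 0.0, 0.6, 0.0]
S_fixed = simulate_trig_seasonal(gamma0, gamma_star0, n_periods=36)
```

Setting `sigma_omega` above zero lets the seasonal pattern drift from one year to the next, which is the role of the single hyperparameter $σ_{ω}^{2}$.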

The irregular component $I_{t}$ is given by

I_{t} ~ N (0, σ_{I}^{2}),

(7)

where $σ_{I}^{2}$ is a hyperparameter.

The interviewees in the first wave are interviewed for the first time, whereas the interviewees in the other waves have been interviewed before. The variance of the wave-specific survey errors is also time-dependent, partly due to variation in the number of persons interviewed each month; see Binder and Dick (1990) and Van den Brakel and Krieg (2009). Let $k_{t}^{j} = \sqrt{\widehat{\mathrm{Var}} [y_{t}^{j}]}$ be an estimate of the standard error of the survey error for wave j in period t. The survey errors are modeled as:

e_{t}^{j} = k_{t}^{j} {\tilde{e}}_{t}^{j}, \quad \text{where} \quad {\tilde{e}}_{t}^{1} = ε_{t}^{1} \quad \text{with} \quad ε_{t}^{1} \sim N (0, σ_{e_{1}}^{2}),

\text{and} \quad {\tilde{e}}_{t}^{j} = ϕ {\tilde{e}}_{t - 3}^{j - 1} + ε_{t}^{j} \quad \text{with} \quad ε_{t}^{j} \sim N (0, σ_{e}^{2}) \quad \text{for} \quad j = 2, 3, \dots, 8.

(8)

If $k_{t}^{j}$ is a “good” estimate of the standard deviation of the survey error, ${\tilde{e}}_{t}^{j}$ will have an estimated variance in the vicinity of one. However, we do not restrict the variances to be equal to unity. Instead, we impose the weaker restriction that $\mathrm{Var} ({\tilde{e}}_{t}^{2}) = \dots = \mathrm{Var} ({\tilde{e}}_{t}^{8})$.

Based on Equation (2) and assuming equal weights, we apply a rough approximate estimate of the variance of survey error for the wave-specific monthly LFS-estimates given by

\widehat{\mathrm{Var}} [y_{t}^{j}] = N_{t}^{2} {\hat{p}}_{t} (1 - {\hat{p}}_{t}) / n_{t}^{j}

(9)

as an estimate for the square of $k_{t}^{j}$, where $n_{t}^{j}$ is the number of persons who have responded from wave j, $N_{t}$ is the population size, and ${\hat{p}}_{t} = (\frac{1}{8} \sum_{j = 1}^{8} y_{t}^{j}) / N_{t}$ is the estimated proportion for an LFS variable based on information from all eight waves.
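The approximation in Equation (9) is simple to compute. The following sketch uses hypothetical inputs (wave estimates, respondent counts, and population size are invented for illustration, and the function name is ours); it returns the scaling factors $k_{t}^{j}$ for one month.

```python
import numpy as np


def wave_std_errors(y_waves, n_resp, pop_size):
    """Approximate standard errors k_t^j of the eight wave estimates in one month.

    y_waves  : wave-specific estimates y_t^j (persons)
    n_resp   : number of respondents n_t^j in each wave
    pop_size : population size N_t
    """
    y_waves = np.asarray(y_waves, dtype=float)
    n_resp = np.asarray(n_resp, dtype=float)
    p_hat = y_waves.mean() / pop_size  # proportion pooled over all eight waves
    var_hat = pop_size**2 * p_hat * (1.0 - p_hat) / n_resp
    return np.sqrt(var_hat)


# Hypothetical month: every wave estimates 100,000 persons in a population of
# 1,000,000, with 2,500 respondents per wave.
k = wave_std_errors([100_000.0] * 8, [2_500] * 8, 1_000_000.0)
```

Because the pooled proportion is shared by all waves, the factors differ across waves only through the number of respondents.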

The autocorrelation coefficient in Equation (8), $ϕ$ , is estimated based on pseudo errors, which was also the starting point of Pfeffermann et al. (1998). The procedure for estimating the autocorrelation coefficient is outlined in the Appendix (Subsection 6.1). The estimate obtained is plugged into our state-space model before the remaining parameters are estimated.

3.2. Symmetric Treatment of the Wave-Specific Effects

Investigating the responses from the US current population survey (which corresponds to the LFS in many other countries), Bailar (1975) shows that the number of persons reporting as unemployed is much higher for those participating in the survey for the first time. Similar results for the US current population survey are also found in Stephan et al. (1954, A-80) and Hansen et al. (1955, 710), and in Kumar and Lee (1983) for the Canadian LFS. Pfeffermann (1991) takes account of this in his model for repeated surveys by including wave-specific effects. However, the model only takes account of time-invariant wave-specific effects (although he mentions that the model can be extended to allow for time-varying wave-specific effects). Van den Brakel and Krieg (2009) extend the model to include time-varying wave-specific effects.

For both the trend component in $θ_{t}$ and the wave-specific effects to be identifiable, a restriction must be imposed on the wave-specific effects. Van den Brakel and Krieg (2009) assume that the estimate of the unemployment rate from the first wave is unbiased. Thus, they apply the restriction $λ_{t}^{1} = 0$ for all t.

In contrast, we apply the restriction $1_{8}^{'} λ_{t} = 0$ , that is, the sum of the wave-specific effects is zero in every period. Elliott and Zong (2019) also apply such a restriction and justify why they believe this implies $θ_{t}$ to be an unbiased estimate of the “true” LFS figure. We have no evidence that indicates to us that we should place more emphasis on one wave over the others. That is why we place equal emphasis on all waves. However, we do not assume that the average of the waves gives an unbiased estimate of the “true” LFS figure. Instead, we only assume this bias to be time-invariant, except for the possible break in 2021. The 2021 redesign should not change the “true” LFS figure, but it may lead to a change in the observed LFS figures. This implies a change in the bias, so if there is a structural break in 2021 the average of the waves cannot be an unbiased estimate of the “true” LFS figure both before and after 2021 (See also a similar discussion in Bollineni-Balabay et al. 2016).

The restriction $1_{8}^{'} λ_{t} = 0$ is usually imposed by restricting one of the components in $λ_{t}$, for example, the last one, to be equal to the negative sum of the others and allowing the remaining ones to follow independent random walks (see, e.g., Elliott and Zong 2019). However, this will often lead to a large variance in the wave-specific effect for the wave that ensures that the restriction holds. For example, if we have $λ_{t}^{j} = λ_{t - 1}^{j} + η_{t}^{j}$ with $η_{t}^{j} \sim N (0, σ_{λ}^{2})$ for $j = 1, 2, \dots, 7$, and $λ_{t}^{8} = - \sum_{j = 1}^{7} λ_{t}^{j}$, then $\mathrm{Var} (λ_{t}^{j} - λ_{t - 1}^{j}) = σ_{λ}^{2}$ for $j = 1, 2, \dots, 7$ but $\mathrm{Var} (λ_{t}^{8} - λ_{t - 1}^{8}) = 7 σ_{λ}^{2}$. Thus, with this specification, the variance of the increment of the wave-specific effect for the residual wave is seven times as high as for each of the seven other waves. Generally, with m waves, the variance for the residual wave will be m − 1 times that of each of the other m − 1 waves. This specification will imply a smaller weight for this wave in the estimate of the change in $θ_{t}$.
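The variance inflation for the residual wave can be verified numerically. In this short sketch the matrix name `A` and the unit value of $σ_{λ}^{2}$ are illustrative choices; `A` maps the seven independent increments to all eight waves, with the last wave defined as the negative sum of the others.

```python
import numpy as np

m = 8                  # number of waves
sigma2_lambda = 1.0    # illustrative value of the hyperparameter

# Rows 1..7 pick out the independent increments; row 8 is minus their sum.
A = np.vstack([np.eye(m - 1), -np.ones((1, m - 1))])

# Covariance matrix of the eight increments under the asymmetric specification.
cov_increments = sigma2_lambda * A @ A.T
increment_variances = np.diag(cov_increments)
```

The first seven diagonal elements equal $σ_{λ}^{2}$, whereas the eighth equals $7 σ_{λ}^{2}$, and every column of the covariance matrix sums to zero, so the restriction holds.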

To avoid one of the wave-specific effects having a higher variance than the others, we apply a symmetric approach:

λ_{t} = λ_{t - 1} + η_{t}, η_{t} \sim N (0_{8}, (I_{8} - \frac{1}{8} 1_{8} 1_{8}^{'}) σ_{λ}^{2}), 1_{8}^{'} λ_{0} = 0

(10)

where $I_{8}$ is the identity matrix of dimension 8. Note that $η_{t}$ has a singular covariance matrix. The specification in Equation (10) ensures that $1_{8}^{'} λ_{t} = 0$ for all t. Furthermore, the specification implies that the wave-specific effects follow random walks that now are dependent due to the negative correlations between elements in $η_{t}$ . The representation in Equation (10) is similar to the representation for seasonal effects in Harrison and Stevens (1976); see also Proietti (2000) and Harvey (2006). Proietti (2000) discusses the similarity between the trigonometric seasonal model in Equation (6) and a seasonal model in the form of Equation (10).

The specification of the wave-specific effects in Equation (10) might not be easy to implement in a software program for state-space models. The restriction $1_{8}^{'} λ_{t} = 0$ implies that there are only seven free variables in $λ_{t}$. Therefore, we introduce the $8 \times 7$ matrix $J^{*}$ and the seven-dimensional vector $λ_{t}^{*}$ of these seven free variables, such that we have $λ_{t} = J^{*} λ_{t}^{*}$. The process of these seven variables can be formulated as seven independent random walks:

λ_{t}^{*} = λ_{t - 1}^{*} + η_{t}^{*}, η_{t}^{*} \sim N (0_{7}, I_{7} σ_{λ}^{2})

(11)

Note that if we premultiply (11) with $J^{*}$ , we get (10) if $J^{*} {J^{*}}^{'} = I_{8} - \frac{1}{8} 1_{8} 1_{8}^{'}$ . This will be the case if we choose $J^{*} = J {(J^{'} J)}^{- 1 / 2}$ with $J = {(I_{7}, - 1_{7})}^{'}$ . The hyperparameter $σ_{λ}^{2}$ indicates how flexible the wave-specific effects are; if $σ_{λ}^{2} = 0$ we have time-invariant (i.e., constant) wave-specific effects.

To derive $J^{*}$, we note that $S = J^{'} J$ is both symmetric and positive definite. A symmetric matrix can be decomposed using eigen-decomposition as $S = V Λ V^{'}$, where $Λ$ is a diagonal matrix holding the eigenvalues and $V$ is an orthogonal matrix whose columns are the corresponding eigenvectors. As S is also positive definite, we have $S^{n} = V Λ^{n} V^{'}$ where n is any real number. Here we apply this for $n = - 1 / 2$.
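The construction of $J^{*}$ can be checked numerically. The sketch below is written in Python with NumPy purely for illustration (the function name is ours); it builds $J^{*} = J {(J^{'} J)}^{- 1 / 2}$ through the eigen-decomposition and confirms that $J^{*} {J^{*}}^{'} = I_{8} - \frac{1}{8} 1_{8} 1_{8}^{'}$.

```python
import numpy as np


def symmetric_loading(m=8):
    """Return the m x (m - 1) matrix J* = J (J'J)^(-1/2) with J = (I, -1)'."""
    J = np.vstack([np.eye(m - 1), -np.ones((1, m - 1))])
    S = J.T @ J                      # symmetric and positive definite
    eigvals, V = np.linalg.eigh(S)   # S = V diag(eigvals) V'
    S_inv_sqrt = V @ np.diag(eigvals ** -0.5) @ V.T
    return J @ S_inv_sqrt


J_star = symmetric_loading(8)
target = np.eye(8) - np.ones((8, 8)) / 8.0
implied_cov = J_star @ J_star.T      # covariance of eta_t for unit sigma_lambda^2
```

All eight diagonal elements of `implied_cov` equal 7/8, so no wave is singled out, and the columns of `J_star` sum to zero, which enforces the zero-sum restriction on $λ_{t}$.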

3.3. Structural Break and Auxiliary Variables

We now extend our model to allow for a possible structural break following Harvey and Durbin (1986). When a structural break is included, (3) changes to

Y_{t} = 1_{8} θ_{t} + λ_{t} + β 1_{t \geq 2021 M 1} + e_{t}

(12)

In Equation (12), $1_{t \geq 2021 M 1}$ is a dummy variable that changes from zero to one when the survey changes from the old to the new design in January 2021. The eight-dimensional vector of regression coefficients, $β = {(β^{1}, β^{2}, \dots, β^{8})}^{'}$, represents the structural break for each wave. When preliminary break estimates for wave 1 are available from a parallel survey (see Subsection 2.4), this information can be used in the initialization of $β^{1}$; see also the Appendix (Subsection 6.2).

Without a gradual phasing in of the new questionnaire, the use of auxiliary variables becomes even more important. We include auxiliary variables in the models to quantify the structural breaks more precisely. If $X_{t}$ is such a variable (e.g., unemployment information from a register, or employment information from a register), we can decompose it as:

X_{t} = θ_{t}^{X} = L_{t}^{X} + S_{t}^{X} + I_{t}^{X},

(13)

where $L_{t}^{X}$ , $S_{t}^{X}$ , $I_{t}^{X}$ are scalars and denote the trend, seasonal, and irregular components of the auxiliary variable, respectively. They are modeled similarly to the corresponding components of the LFS variables in Equations (5) to (7). Van den Brakel and Krieg (2015) suggest constructing a model in which the vector $Y_{t}$ and the scalar $X_{t}$ are modeled jointly. This joint system can be formulated as

(\begin{matrix} Y_{t} \\ X_{t} \end{matrix}) = (\begin{matrix} 1_{8} θ_{t}^{L F S} \\ θ_{t}^{X} \end{matrix}) + (\begin{matrix} λ_{t} \\ 0 \end{matrix}) + (\begin{matrix} β \\ 0 \end{matrix}) 1_{t \geq 2021 M 1} + (\begin{matrix} e_{t} \\ 0 \end{matrix}),

(14)

where the superscript LFS is included to emphasize that these latent processes and parameters are related to the LFS.

For it to be advantageous to model the LFS variables (LFS unemployment or LFS employment) and the register variable jointly, there must be a correlation between them. Therefore, we allow the disturbance terms of the trend slope components of the LFS variables and the auxiliary variable, the X-variable, to be correlated. The covariance between the disturbance terms of the slopes of LFS variable trend and the register variable trend is given by

Cov (w_{t}^{L F S}, w_{t}^{X}) = ρ_{R}^{L F S, X} σ_{R}^{L F S} σ_{R}^{X},

(15)

where $σ_{R}^{L F S}$ is the square root of $σ_{R}^{2}$ in Equation (5), which is the variance parameter related to the slope of the LFS trend, $σ_{R}^{X}$ is the square root of the variance parameter of the slope of the register variable trend, and $ρ_{R}^{L F S, X}$ is the correlation between the slope disturbances of the two trends.

3.4. Larger Fluctuation in the Trend During COVID-19

The COVID-19 pandemic led to large fluctuations in the labor market. The model we have outlined above does not allow for larger fluctuations in the labor market during the pandemic. The structural break estimates may be severely biased if this increased variation in the LFS and register time series is neglected.

In the Netherlands, the Labor Force Survey estimates are improved by applying a state-space model; see Van den Brakel and Krieg (2009). During the COVID-19 pandemic, this state-space model had to be modified to account for the more rapid changes in the labor market; see Van den Brakel et al. (2022). This was done by allowing for a time-varying hyperparameter for the slope of the trend. Gonçalves et al. (2022) applied the same approach for the LFS in Brazil. However, neither Van den Brakel et al. (2022) nor Gonçalves et al. (2022) considered including an auxiliary variable. Bollineni-Balabay et al. (2016) applied a similar approach when considering both level and variance breaks due to survey redesign, but they did not include an auxiliary variable either. Here, we use similar modeling of both the LFS and the register trend:

L_{t}^{i} = L_{t - 1}^{i} + R_{t - 1}^{i}, R_{t}^{i} = R_{t - 1}^{i} + ψ_{t - 1}^{1 / 2} w_{t - 1}^{i}, w_{t}^{i} ~ N (0, {(σ_{R}^{i})}^{2}), i = L F S, X .

(16)

The specification in Equation (16) implies that the slope-variance hyperparameter is time-varying and given by $ψ_{t} {(σ_{R}^{i})}^{2}$ . Furthermore, combined with Equation (15), our model implies that the correlation between the slope of LFS trend and the register trend is time-invariant even if the variances are time-varying. In this way, we can have flexible trends with larger fluctuations during the COVID-19 pandemic, while at the same time drawing strength from the auxiliary variable by restricting the correlation between the disturbances of the trend slopes to be constant. Such a restriction of time-invariant correlation contributes to the identification of the structural break. Without such a restriction, many observations are needed to estimate a new correlation structure, which can be problematic since the corona period is short. See also Part C of the Supplemental Documentation for a further discussion of Equation (15) with time-varying hyperparameters.
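A small numerical sketch may make the time-invariant correlation concrete. The function names and parameter values below are illustrative assumptions, not estimates from our models: scaling both slope disturbances by the same $ψ_{t}$ multiplies the whole covariance matrix by $ψ_{t}$ and leaves the correlation unchanged.

```python
import numpy as np


def slope_disturbance_cov(psi_t, sigma_lfs, sigma_x, rho):
    """Covariance matrix of the scaled slope disturbances (LFS, register)."""
    base = np.array([
        [sigma_lfs**2, rho * sigma_lfs * sigma_x],
        [rho * sigma_lfs * sigma_x, sigma_x**2],
    ])
    return psi_t * base  # the same scaling applies to both trends


def implied_correlation(cov):
    """Correlation implied by a 2 x 2 covariance matrix."""
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])


# Illustrative (hypothetical) values of the standard deviations and correlation.
sigma_lfs, sigma_x, rho = 0.5, 0.7, 0.9
correlations = [
    implied_correlation(slope_disturbance_cov(psi, sigma_lfs, sigma_x, rho))
    for psi in (1.0, 16.0, 49.0)
]
```

Only the variances change across the three regimes, so no additional correlation parameter has to be estimated from the short pandemic period.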

We have divided our sample into three parts. The first is the pre-corona part, defined as the period up to and including 2019M12. In this period, we apply $ψ_{t} = 1$, such that $σ_{R}^{2}$ is the variance of the slope disturbance in the pre-corona period. The second is the initial shutdown part of the COVID-19 pandemic, with large fluctuations in labor force figures. Due to the dynamic specification of the trend in Equation (16), in which a slope disturbance reaches the trend level with a lag of two months, the hyperparameter for the slope must increase at least two months before the shock occurs. In Norway, the COVID-19 shutdown took place in March 2020, so we must allow for the change in the hyperparameter in January 2020. Results in Van den Brakel et al. (2022) support that the hyperparameter for the slope should be increased two months before the COVID-19 shutdown. Furthermore, Gonçalves et al. (2022) find that an increased hyperparameter for the slope in the first six months of 2020 improves the fit of a model for the LFS in Brazil. For Norway, the first half of 2020, that is, 2020M1-2020M6, partly covers the shutdown period of the pandemic. For this period, we restrict $ψ_{t}$ to take the same value in all months, that is, $ψ_{t} = ψ^{1}$ for $t = 2020 M 1, 2020 M 2, \dots, 2020 M 6$. The last part of the sample, the recovery period, starts in 2020M7. In this period, there were still larger fluctuations than before the COVID-19 pandemic, but not as large as when the pandemic first hit the Norwegian economy. For this period, which applies to the remainder of our sample, we also restrict $ψ_{t}$ to take the same value in all months, that is, $ψ_{t} = ψ^{2}$ for $t = 2020 M 7, 2020 M 8, \dots, 2021 M 10$. We apply a grid search to estimate the parameters $ψ^{1}$ and $ψ^{2}$; the grid is specified at the beginning of Section 4.
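The three-regime definition of $ψ_{t}$ amounts to a simple step function of the calendar month. A minimal sketch, in which the function name and the example values of $ψ^{1}$ and $ψ^{2}$ are illustrative:

```python
def psi_schedule(year, month, psi1, psi2):
    """Return psi_t for a given calendar month.

    Pre-corona period (up to 2019M12): 1
    Initial shutdown period (2020M1-2020M6): psi1
    Recovery period (from 2020M7): psi2
    """
    if (year, month) <= (2019, 12):
        return 1.0
    if (year, month) <= (2020, 6):
        return float(psi1)
    return float(psi2)


# Example with illustrative values psi1 = 25 and psi2 = 13.
schedule = {
    (y, m): psi_schedule(y, m, psi1=25, psi2=13)
    for y in (2019, 2020, 2021)
    for m in range(1, 13)
}
```

The slope-variance hyperparameter in month t is then the schedule value multiplied by ${(σ_{R}^{i})}^{2}$.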

3.5. Estimation and Statistical Inference

We cast our (parsimonious) models in state-space form and estimate their hyperparameters by maximizing the diffuse log-likelihood function using the BFGS algorithm. The formal specification of the state-space model with all the underlying assumptions is given in the Appendix (Subsection 6.2). Special features of our state-space models are that there are no measurement errors in the measurement (vector) equation, the transition matrices are always time-invariant, and the selection matrices of the transition equations are potentially time-varying, see also the Appendix (Subsection 6.2). The main purpose of our article is to investigate whether the redesign of the LFS survey impacts employment and unemployment. The intervention effects are assumed to be wave-specific and constant. Technically, they are represented by elements in the state vector that are without disturbances.

An essential part of the estimation algorithm is to run the Kalman filter during the recursions in order to update the state vector estimate. KFAS, see Helske (2017), utilizes a complete univariate approach for filtering and smoothing provided by Koopman and Durbin (2003); see also Anderson and Moore (1979) for sequential processing. This constitutes a way of implementing so-called exact diffuse initialization. Such a procedure makes the results less prone to numerical error than when uninformative diffuse priors are used. An important aspect of our study is to compare model specifications with time-invariant hyperparameters with model specifications that allow for time-varying hyperparameters. To this end, we use likelihood ratio tests.

After obtaining the maximum likelihood estimates of our unknown hyperparameters, we obtain (final) smoothed estimates of the state vectors. Diagnostics of the state space model can be constructed from the standardized one-step-ahead prediction errors.

4. Empirical Results

This section presents the estimated hyperparameters and the structural break estimates due to the 2021 LFS-redesign. The models are estimated on monthly data from 2006M1 to 2021M10. Apart from the pre-estimation of the autocorrelation parameters related to the survey error component, all other inference has been carried out using the R package KFAS, see Helske (2017).

Following Pfeffermann et al. (1998), we estimate the autocorrelation coefficient of the survey errors in a separate system; see the Appendix (Subsection 6.1). By doing so, we can treat the coefficient as “known” when estimating the remaining parameters of the state-space model. We also apply a grid search to estimate $ψ^{1}$ and $ψ^{2}$ with the following values: $ψ^{1} = 16, 25, 49$ and $ψ^{2} = (ψ^{1} + ψ^{s t e p} - 1) / ψ^{s t e p}$, where $ψ^{s t e p} = 2, 4, 6, 8, 12, 24, 48$. This specification ensures that $1 < ψ^{2} < ψ^{1}$. For each pair of values for $ψ^{1}$ and $ψ^{2}$, we estimate the remaining parameters of the state-space model and calculate the log-likelihood value. The estimates of $ψ^{1}$ and $ψ^{2}$ are given by the pair of values that leads to the highest log-likelihood value. Because the plug-in estimate of $ϕ$ and the grid-search estimates of the two scaling parameters $ψ^{1}$ and $ψ^{2}$ are treated as known, the standard errors of the estimates of the structural break parameters are underestimated.
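The grid can be enumerated directly from the formula above. In the sketch below, `psi_grid` and `best_pair` are illustrative names, and `loglik` is a hypothetical callable standing in for the step that re-estimates the remaining hyperparameters for a given pair; the actual estimation is carried out in KFAS and is not reproduced here.

```python
from fractions import Fraction

PSI1_VALUES = (16, 25, 49)
PSI_STEPS = (2, 4, 6, 8, 12, 24, 48)


def psi_grid():
    """Return all (psi1, psi2) pairs with psi2 = (psi1 + step - 1) / step."""
    return [
        (Fraction(p1), Fraction(p1 + step - 1, step))
        for p1 in PSI1_VALUES
        for step in PSI_STEPS
    ]


def best_pair(loglik):
    """Pick the grid pair with the highest log-likelihood.

    `loglik` is a hypothetical callable (psi1, psi2) -> float.
    """
    return max(psi_grid(), key=lambda pair: loglik(*pair))


grid = psi_grid()
```

Since $ψ^{2} = 1 + (ψ^{1} - 1) / ψ^{s t e p}$ with $ψ^{s t e p} \geq 2$ and $ψ^{1} > 1$, every pair in the grid satisfies $1 < ψ^{2} < ψ^{1}$; exact fractions are used so that values such as 2.875 and 1.3125 are represented without rounding.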

4.1. Estimated Hyperparameters and Other Results

Tables 3 and 4 provide an overview of the maximum likelihood estimates of the hyperparameters for employment and unemployment, respectively. In the tables, we consider both the case with estimated parameters $ψ^{1}$ and $ψ^{2}$ and the case where these parameters are fixed a priori at 1. The former specification allows the variance of the disturbances of the slope component of the trend to be time-varying. For both employment and unemployment, we have tested the joint hypothesis of time-invariant hyperparameters, that is, $ψ^{1} = ψ^{2} = 1$, using likelihood ratio tests. The hypothesis is clearly rejected for all domains of both employment and unemployment: all eight p-values are less than .0001.
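The mechanics of such a likelihood ratio test can be sketched as follows. Two points are assumptions of this illustration rather than statements about our procedure: the log-likelihood values are hypothetical, and the statistic is compared with a chi-square distribution with two degrees of freedom (one per restriction), whose upper tail probability is $\exp (- x / 2)$. Since $ψ^{1}$ and $ψ^{2}$ are chosen on a grid, this reference distribution should be regarded as approximate.

```python
import math


def lr_test_two_restrictions(loglik_unrestricted, loglik_restricted):
    """Likelihood ratio statistic and chi-square(2) p-value.

    Assumes two restrictions (psi1 = psi2 = 1) and the usual asymptotic
    chi-square reference distribution.
    """
    lr = max(2.0 * (loglik_unrestricted - loglik_restricted), 0.0)
    p_value = math.exp(-lr / 2.0)  # survival function of chi-square(2)
    return lr, p_value


# Hypothetical log-likelihood values, for illustration only.
lr_stat, p_value = lr_test_two_restrictions(-1000.0, -1012.0)
```

With two degrees of freedom, a p-value below .0001 requires a likelihood ratio statistic above roughly 18.4, that is, a log-likelihood gain of more than about 9.2.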

Table 3.

Estimated Hyperparameters. Employed Persons.

Hyperparameters	With allowance for time-varying variances for the disturbances of the slope components				Without allowance for time-varying variances for the disturbances of the slope components
Hyperparameters	Male 15–24	Male 25–74	Female 15–24	Female 25–74	Male 15–24	Male 25–74	Female 15–24	Female 25–74
$ψ^{1}$	25	16	25	49	1	1	1	1
$ψ^{2}$	13	2.875	13	13	1	1	1	1
${(σ_{R}^{L F S})}^{2} / 10^{6}$	0.017	0.291	0.023	0.033	0.306	1.500	0.294	1.969
${(σ_{ω}^{L F S})}^{2} / 10^{3}$	15.225	12.694	5.549	0.011	16.223	12.245	3.749	0.000
${(σ_{I}^{L F S})}^{2} / 10^{6}$	1.218	0.000	4.532	1.380	1.405	0.000	4.595	0.318
${(σ_{λ}^{L F S})}^{2}$	1.033	2.846	9.269	42.109	138.436	0.000	32.223	0.000
$σ_{e_{1}}^{2}$	1.150	1.181	1.058	1.314	1.151	1.177	1.048	1.309
$σ_{e}^{2}$	0.713	0.538	0.694	0.455	0.713	0.541	0.689	0.455
${(σ_{R}^{X})}^{2} / 10^{6}$	0.023	0.453	0.036	0.047	0.441	2.666	0.680	2.495
${(σ_{ω}^{X})}^{2} / 10^{3}$	5.531	0.013	4.465	2.958	5.104	0.023	5.869	0.000
${(σ_{I}^{X})}^{2} / 10^{6}$	0.211	0.730	0.062	0.365	0.264	0.701	0.018	0.453
$ρ_{R}^{L F S, X}$	1.000	0.999	1.000	1.000	1.000	1.000	0.986	1.000
$ϕ$	0.577	0.723	0.539	0.770	0.577	0.723	0.539	0.770

Table 4.

Estimated Hyperparameters. Unemployed Persons.

Hyperparameters	With allowance for time-varying variances for the disturbances of the slope components				Without allowance for time-varying variances for the disturbances of the slope components
Hyperparameters	Male 15–24	Male 25–74	Female 15–24	Female 25–74	Male 15–24	Male 25–74	Female 15–24	Female 25–74
$ψ^{1}$	16	16	49	25	1	1	1	1
$ψ^{2}$	1.3125	2.25	5	4	1	1	1	1
${(σ_{R}^{L F S})}^{2} / 10^{6}$	0.004	0.105	0.001	0.036	0.033	0.411	0.000	0.355
${(σ_{ω}^{L F S})}^{2} / 10^{3}$	1.059	0.059	1.252	6.053	0.935	0.024	3.097	5.281
${(σ_{I}^{L F S})}^{2} / 10^{6}$	0.000	0.000	2.380	0.000	0.006	0.016	2.786	0.001
${(σ_{λ}^{L F S})}^{2}$	0.000	14.698	0.069	6.013	0.122	68.734	385.470	36.661
$σ_{e_{1}}^{2}$	1.224	1.579	1.309	1.830	1.221	1.585	1.317	1.836
$σ_{e}^{2}$	1.148	1.350	1.142	1.118	1.147	1.342	1.145	1.115
${(σ_{R}^{X})}^{2} / 10^{6}$	0.007	0.158	0.002	0.046	0.048	0.560	0.156	0.421
${(σ_{ω}^{X})}^{2} / 10^{3}$	0.003	0.003	0.000	0.006	0.024	0.316	0.007	0.203
${(σ_{I}^{X})}^{2} / 10^{6}$	0.000	0.000	0.000	0.000	0.001	0.374	0.000	0.232
$ρ_{R}^{L F S, X}$	0.991	1.000	0.999	1.000	0.998	1.000	0.957	1.000
$ϕ$	0.106	0.259	0.081	0.267	0.106	0.259	0.081	0.267

The estimate of $ψ^{1}$ is quite large for both employment and unemployment. In both cases, it ranges from 16 to 49 across the different domains. This implies that the variance of the disturbances related to the slope of the trend component is 16 to 49 times as high during the first part of the COVID-19 pandemic, as in the pre-pandemic period. For employment, the estimate of the scaling parameter $ψ^{2}$ ranges from 3 to 13 across the different domains. It is somewhat smaller for unemployment, ranging from about 1.3 to 5. In Part H of the Supplemental Documentation, we note, both in conjunction with employment and unemployment, that the estimates of the structural breaks are quite robust with respect to the different values of $ψ^{1}$ and $ψ^{2}$ considered in our grid.

For all the domains of employment, the estimate of the hyperparameter for the slope of the trend, ${(σ_{R}^{L F S})}^{2}$, is lower when we allow for a time-varying variance than when we do not. This implies that the disturbance of the slope of the trend has a lower variance before 2020. With one exception, the same feature also appears for unemployment. For young women, the estimate of ${(σ_{R}^{L F S})}^{2}$ increases when allowing for time-varying variance. However, the estimated variance for this domain is low in both cases, and according to Table 4 the estimates of most of the other hyperparameters for this domain decrease when we allow for time-varying variance for the disturbance of the slope.

Tables 3 and 4 also reveal a high correlation between the disturbances of the slope of the LFS trend and the register trend. For all domains, the estimated correlation between the disturbances of the two trend-slopes is equal or very close to 1. This strong correlation is advantageous for estimating possible structural breaks due to the 2021 LFS-redesign. A correlation equal to 1 implies that the LFS-variables and the register variable follow a common stochastic trend and are thus cointegrated; see Engle and Granger (1987).

In the bottom row of Tables 3 and 4, we report the estimate of the autocorrelation parameter of the survey errors, $\hat{ϕ}$, for the different domains. The estimated autocorrelation parameters are higher for employment than for unemployment in all domains. Because labor market status is normally more stable over time for persons aged 25 to 74 than for persons aged 15 to 24, the estimated autocorrelation of the survey errors is higher for the older age group for both males and females and for both employment and unemployment.

Diagnostics related to the behavior of the disturbances in the state vector can be derived from the one-step-ahead standardized prediction errors. (In Part E of our Supplemental Documentation, we report some diagnostics based on the one-step-ahead prediction errors and some graphs based on these errors and auxiliary residuals.) The standardized innovations seem to be reasonably well-behaved, although some of the test diagnostics for some of the waves are significant at the 5% level. Auxiliary residuals are smoothed estimates of the disturbances associated with the unobserved components, and Harvey and Koopman (1992) show them to be useful for detecting outliers and structural changes. In the case with time-varying hyperparameters, the auxiliary residuals seem to behave better during the years 2020 and 2021 than in the case with time-invariant hyperparameters; see Part I in the Supplemental Documentation. We interpret this as evidence that our specification of the stepwise shifts in the variances of the disturbances of the slope components of the trends captures quite well the excess residual fluctuations in the last part of the sample that are present in the case with time-invariant hyperparameters.

Tables 3 and 4 also reveal that the estimated hyperparameters of the wave-specific effects are small. With such low values, imposing time-independent wave-specific effects (which occur when these hyperparameters are zero) would presumably have been innocuous. Thus, in this case, the symmetric specification of the time-varying wave-specific effects is probably not that important for obtaining a precise estimate of the structural break.

4.2. Structural Break Estimates

Table 5 reports the structural break estimates for employment in the four domains, both with and without time-varying hyperparameters. For comparison, we also report the number of employed persons in 2019, the year before the COVID-19 pandemic, according to the LFS. When each wave is considered separately, there is considerable uncertainty in the structural break estimates. Therefore, the domain-specific structural break estimates are given as the average of the estimates of the structural break parameters for the eight waves. The estimated total effect of the structural break in employment is 21,864 persons when we allow for time-varying hyperparameters and 24,307 persons when we do not. Measured relative to employment according to the LFS in 2019, the estimated structural break constitutes about 0.8 to 0.9% of employment and 0.6% of the population. The estimate for males aged 25 to 74 is the one most affected by allowing for time-varying hyperparameters: with time-invariant hyperparameters, the structural break estimate for this domain is 5,002 persons, but it changes to −381 when time-varying hyperparameters are allowed for the disturbances of the slopes. The estimate for young females also changes, from 5,442 to 8,115. The difference between the two specifications is still relatively small compared with the number of employed persons in these two domains. For males aged 25 to 74, the difference is 5,383 persons, which corresponds to 0.4% of the number of employed persons in this domain in 2019. For females aged 15 to 24, the difference corresponds to 1.6% of the number of employed persons in this domain.

Table 5.

Structural Break Estimates for Employed Persons, by Sex and Age.^a

Sex and age	Employment in 2019 (in thousands)	Optimal time-varying hyperparameters: parameter estimate	Optimal time-varying hyperparameters: standard error	Time-invariant hyperparameters: parameter estimate	Time-invariant hyperparameters: standard error
Males aged 15–24	171	−1,608	2,252	−1,925	2,249
Males aged 25–74	1,271	−381	4,025	5,002	3,697
Females aged 15–24	163	8,115	2,143	5,442	2,389
Females aged 25–74	1,119	15,738	3,534	15,788	3,566
Total: aggregate of the four domains	2,724	21,864	6,295	24,307	6,095
Total for those aged 15–24: aggregate of the two sex groups	334	6,507	3,106	324	5,943
Total for those aged 25–74: aggregate of the two sex groups	2,390	15,357	5,311	20,790	5,137
Total for males: aggregate of the two age groups	1,442	−1,989	4,558	3,077	4,327
Total for females: aggregate of the two age groups	1,282	23,853	4,133	21,230	4,292

^a The period of the analysis is 2006M1-2021M10. The standard errors of the 2021-redesign level shift parameter estimates are based on the assumption that $ψ^{1}$, $ψ^{2}$, and $ϕ$ are known. Employment in 2019 is according to the LFS.

When time-varying variances for the disturbances of the slopes are allowed for, the structural break estimates for males are small and insignificant when measured either individually or jointly. The estimates for women are all positive and significant. Thus, our analysis implies that the redesign of the Norwegian LFS led to an increase in measured employment for women.
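As a rough illustration of these significance statements, the sketch below divides each estimate in the time-varying columns of Table 5 by its standard error and compares the ratio with a two-sided 5% normal critical value. The critical value of 1.96 and the helper names `z_ratio` and `is_significant` are assumptions made for this example; the joint assessment in the article may rest on a different test.

```python
# Time-varying-hyperparameter columns of Table 5: (estimate, standard error), in persons.
table5 = {
    "Males aged 15-24": (-1608, 2252),
    "Males aged 25-74": (-381, 4025),
    "Males, total": (-1989, 4558),
    "Females aged 15-24": (8115, 2143),
    "Females aged 25-74": (15738, 3534),
    "Females, total": (23853, 4133),
}

CRITICAL_VALUE = 1.96  # assumed two-sided 5% critical value of the standard normal


def z_ratio(estimate, std_error):
    """Return the estimate divided by its standard error."""
    return estimate / std_error


def is_significant(estimate, std_error, critical=CRITICAL_VALUE):
    """Return True when the absolute ratio exceeds the critical value."""
    return abs(z_ratio(estimate, std_error)) > critical


for label, (est, se) in table5.items():
    flag = is_significant(est, se)
    print(f"{label:20s} z = {z_ratio(est, se):6.2f}  significant at 5%: {flag}")
```

With these inputs, the ratios for the three male rows fall well inside the critical bounds, while those for the three female rows exceed them.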

Table 6 reports estimates of the structural break for unemployment in the four domains. For comparison, we also report the number of unemployed persons in 2019 according to the LFS. When the hyperparameters of the disturbances of the two trend slopes are allowed to be time-varying, the total estimated effect of the structural break on unemployment is 5,371 persons. This corresponds to 5.1% of LFS unemployment in 2019, but just 0.1% of the population. When the hyperparameters related to the slopes are assumed to be time-invariant for all domains, the estimated total effect of the structural break is 7,841 persons, or 7.4% of LFS unemployment in 2019 (and about 0.2% of the LFS population). The estimates for unemployed females are virtually unaltered by allowing for time-varying hyperparameters. Therefore, the change in the total structural break estimate when time-varying hyperparameters of the slopes are allowed for is due to the change in the estimates for males. The overall structural break estimate for males is reduced by more than 2,000 persons (from 3,847 to 1,723) when time-varying variances are allowed for the disturbances of the trend slopes.

Table 6.

Structural Break Estimates for Unemployed Persons, by Sex and Age.^a

Sex and age	Unemployment in 2019 (in thousands)	Optimal time-varying hyperparameters: parameter estimate	Optimal time-varying hyperparameters: standard error	Time-invariant hyperparameters: parameter estimate	Time-invariant hyperparameters: standard error
Males aged 15–24	20	3,275	1,670	4,799	1,475
Males aged 25–74	40	−1,552	2,346	−952	2,198
Females aged 15–24	17	4,940	1,442	5,163	1,342
Females aged 25–74	29	−1,292	1,737	−1,169	1,739
Total: aggregate of the four domains	106	5,371	3,659	7,841	3,440
Total for those aged 15–24: aggregate of the two sex groups	37	8,215	2,206	9,962	1,994
Total for those aged 25–74: aggregate of the two sex groups	69	−2,844	2,919	−2,121	2,803
Total for males: aggregate of the two age groups	60	1,723	2,880	3,847	2,647
Total for females: aggregate of the two age groups	46	3,648	2,258	3,994	2,197

^a The period of the analysis is 2006M1-2021M10. The standard errors of the 2021-redesign level shift parameter estimates are based on the assumption that $ψ^{1}$, $ψ^{2}$, and $ϕ$ are known. Unemployment in 2019 is according to the LFS.
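The arithmetic behind the aggregates and relative sizes quoted for unemployment can be reproduced with a few lines of Python. This is only a check on the figures in Table 6; the variable names and the helper `share_percent` are introduced here for illustration.

```python
# Domain-level break estimates from Table 6 (time-varying hyperparameters), in persons.
domains = {
    "Males aged 15-24": 3275,
    "Males aged 25-74": -1552,
    "Females aged 15-24": 4940,
    "Females aged 25-74": -1292,
}
reported_total = 5371            # Table 6, time-varying hyperparameters
reported_total_invariant = 7841  # Table 6, time-invariant hyperparameters
unemployment_2019 = 106_000      # Table 6 reports 106 thousand


def share_percent(break_estimate, level):
    """Express a break estimate as a percentage of a reference level."""
    return 100.0 * break_estimate / level


total = sum(domains.values())
males = domains["Males aged 15-24"] + domains["Males aged 25-74"]
females = domains["Females aged 15-24"] + domains["Females aged 25-74"]

print("Sum of the four domains:", total, "(reported:", reported_total, ")")
print("Males:", males, "Females:", females)
print("Time-varying share of 2019 unemployment (%):",
      round(share_percent(reported_total, unemployment_2019), 1))
print("Time-invariant share of 2019 unemployment (%):",
      round(share_percent(reported_total_invariant, unemployment_2019), 1))
```

The domain estimates add up to the reported aggregates, and the two shares round to the 5.1% and 7.4% quoted in the text.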

The ten months of LFS observations available after the redesign may be too few to estimate the structural breaks precisely. In Part F of the Supplemental Documentation, we therefore report recursive break estimates, that is, the break estimate with only one month of observations after the redesign, the break estimate with two months of observations after the redesign, and so on. For most of the domains, the break estimate seems to have converged. However, this is not the case for employment among young men, where the estimate is adjusted by over 2,000 persons when the figures for 2021M10 are included. The break estimate is nevertheless non-significant.

5. Conclusions

In 2021, the Norwegian LFS underwent a substantial redesign in accordance with the new regulation for integrated European social statistics. To ensure coherent labor market time series for the main indicators, we have modeled the impact of the redesign. The estimated structural breaks can be used to adjust the LFS series published for the period before 2021 so that they are comparable to the LFS figures after the redesign. The adjustment can be made either by shifting the historical figures by the estimated structural break, or by scaling the break with relative changes in the population of the individual domain. Statistics Norway has chosen the latter approach.
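The two adjustment options can be sketched as follows. The scaling rule used here (the break multiplied by the ratio of the domain population in each period to a reference population) is an assumption made for illustration, not a description of the exact procedure used by Statistics Norway. The function names and all series and population figures are hypothetical; only the break estimate of 15,738 is taken from Table 5 (females aged 25 to 74).

```python
def adjust_additive(pre_series, break_estimate):
    """Shift every pre-redesign figure by the full break estimate."""
    return [y + break_estimate for y in pre_series]


def adjust_population_scaled(pre_series, population, break_estimate,
                             reference_population):
    """Shift each pre-redesign figure by the break scaled with relative population.

    Assumed rule: adjusted_t = y_t + break * population_t / reference_population.
    """
    if len(pre_series) != len(population):
        raise ValueError("pre_series and population must have the same length")
    return [
        y + break_estimate * pop / reference_population
        for y, pop in zip(pre_series, population)
    ]


# Hypothetical pre-redesign employment figures and domain populations.
pre_series = [1_100_000, 1_110_000]
population = [1_700_000, 1_750_000]
reference_population = 1_800_000  # hypothetical population at the time of the redesign
break_estimate = 15_738           # Table 5, females aged 25-74, time-varying case

print(adjust_additive(pre_series, break_estimate))
print(adjust_population_scaled(pre_series, population, break_estimate,
                               reference_population))
```

Under this assumed rule, the correction shrinks for earlier periods in which the domain population was smaller, whereas the additive variant applies the same correction to every period.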

In this article, we have pursued a structural time series approach in the tradition of Pfeffermann (1991), Van den Brakel and Krieg (2009, 2015), and Elliott and Zong (2019). Structural breaks were estimated for the numbers of employed and unemployed persons in different domains.

We obtained a structural break estimate of about 22,000 employed and 5,000 unemployed persons aged 15 to 74 when allowing for time-varying hyperparameters for the disturbances of the slopes of the two trend variables. When no such allowance was made, the estimated breaks for employment and unemployment were about 2,000 to 3,000 higher. Both likelihood ratio tests and examination of the auxiliary residuals indicate that the hyperparameters related to the slopes are time-varying with higher values during the COVID-19 pandemic.

The structural break estimates identified for Norway have the same sign as those found for the Netherlands; see Van den Brakel (2022). However, our estimates are much smaller. Van den Brakel (2022) identifies a structural break estimate in employment that corresponds to more than 1.5% of the population in the LFS, and a structural break estimate in unemployment that exceeds 1% of the population. For Norway, the estimates of the structural break imply a positive shift in employment of slightly less than 0.6% and in unemployment of just over 0.1%, measured in relation to the LFS population.

In the analysis, we have followed Van den Brakel and Krieg (2015) and Gonçalves et al. (2022), among others, by only allowing correlation between the disturbances of the slope of the trend component of the LFS variable and the trend component of the register variable. We could have extended this specification by also allowing for correlations between the disturbances of the seasonal components and between the disturbances of the irregular components; see, for example, Van den Brakel and Krieg (2016) for specifying such correlations between domains. Elliott and Zong (2019) allow for correlation between the disturbances of the three common components across the LFS and the register from the outset but end up with a specification which only includes a correlation between the disturbances of the two trend components.

Four different domains are considered in this analysis. However, we have not considered correlations between these domains. Allowing for correlations between domains implies borrowing strength between domains. Pfeffermann and Burck (1990), Rao and Yu (1994), Datta et al. (1999), Pfeffermann and Tiller (2006), and Boonstra and Van den Brakel (2022), among others, have considered such correlations between domains.

In this analysis, we assumed that a single break takes place in all waves simultaneously. However, there could be delayed effects of the new sampling system because, in each quarter from the beginning of 2021, a new wave is included according to the new sample design. The first quarter in which all waves have been subject only to the new sampling system is 2022Q4. We would have needed at least one more year of observations to analyze this possible delayed effect, so such effects could not be estimated with our data set. Investigating them could be interesting for a follow-up analysis. The new LFS design could also lead to a structural break in the seasonal pattern. Analyzing such a possible structural break in the seasonal pattern would have required longer data series.

Supplemental Material

Supplemental material, sj-pdf-1-jof-10.1177_0282423X241235267 for Structural Break in the Norwegian Labor Force Survey Due to a Redesign During a Pandemic by Håvard Hungnes, Terje Skjerpen, Jørn Ivar Hamre, Xiaoming Chen Jansen, Dinh Quang Pham, and Ole Sandvik in Journal of Official Statistics

Footnotes

6. Appendix

Acknowledgements

We would like to thank the associate editor and the referees for constructive comments and suggestions. Furthermore, we thank Susie Jentoft, Melike Oguz-Alper, Arvid Raknerud, and Ole Villund for valuable comments. Many thanks also to Jan van den Brakel, who visited Statistics Norway in February 2020 and gave a course in time series analysis of survey redesigns. Any remaining errors and shortcomings are the sole responsibility of the authors.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Eurostat partially funded this article under grant agreement: 826605—2018-NO-LFS-QUALITY BREAKS.

Supplemental Material

Supplemental material for this article is available online.

Received: August 2022

Accepted: October 2023

References

Anderson, B. D. O., and J. B. Moore. 1979. Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, Inc.

Bailar, B. A. 1975. “The Effects of Rotation Group Bias on Estimates from Panel Surveys.” Journal of the American Statistical Association 70 (349): 23–30. DOI: https://doi.org/10.1080/01621459.1975.10480255.

Binder, D. A., and Dick. 1990. “A Method for the Analysis of Seasonal ARIMA Models.” Survey Methodology 16 (2): 239–53. Available at: https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X199000214533 (accessed March 2024).

Bollineni-Balabay, van den Brakel, J. A., and Palm. 2016. “Multivariate State-Space Approach to Variance Reduction in Series with Level and Variance Breaks Due to Sampling Redesigns.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 179 (2): 377–402. DOI: https://doi.org/10.1111/rssa.12117.

Boonstra, H. J., and J. A. van den Brakel. 2022. “Multilevel Time Series Models for Small Area Estimation at Different Frequencies and Domain Levels.” Annals of Applied Statistics 16 (4): 2314–38. DOI: https://doi.org/10.1214/21-AOAS1592.

Datta, Lahiri, and Maiti. 1999. “Hierarchical Bayes Estimation of Unemployment Rates for the States of the U.S.” Journal of the American Statistical Association 94 (448): 1074–82. DOI: https://doi.org/10.1080/01621459.1999.10473860.

Durbin and S. J. Koopman. 2012. Time Series Analysis by State Space Methods. 2nd ed. Oxford: Oxford University Press.

Elliott, D. J., and Zong. 2019. “Improving Timeliness and Accuracy of Estimates from the UK Labour Force Survey.” Statistical Theory and Related Fields 3 (2): 186–98. DOI: https://doi.org/10.1080/24754269.2019.1676034.

Engle, R. F., and C. W. J. Granger. 1987. “Co-Integration and Error Correction: Representation, Estimation, and Testing.” Econometrica 55 (2): 251–76. DOI: https://doi.org/10.2307/1913236.

Eurostat. 2022. Quality Report of the European Union Labour Force Survey 2020. Available at: https://ec.europa.eu/eurostat/documents/7870049/14455112/KS-FT-22-003-EN-N.pdf (accessed September 2023).

Gonçalves, C., Hidalgo, L., Silva, D., and J. A. van den Brakel. 2022. “Single-Month Unemployment Rate Estimates for the Brazilian Labour Force Survey Using State-Space Models.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 185 (4): 1707–32. DOI: https://doi.org/10.1111/rssa.12914.

Hansen, M. H., Hurwitz, W. N., Nisselson, and Steinberg. 1955. “The Redesign of the Census Current Population Survey.” Journal of the American Statistical Association 50 (271): 701–19. DOI: https://doi.org/10.1080/01621459.1955.10501962.

Harrison, P. J., and C. F. Stevens. 1976. “Bayesian Forecasting.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 38 (3): 205–28. DOI: https://doi.org/10.1111/j.2517-6161.1976.tb01586.x.

Harvey, A. C. 1989. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press.

Harvey, A. C. 2006. “Seasonality and Unobserved Component Models: An Overview.” Conference on Seasonality, Seasonal Adjustment and Their Implications for Short-Term Analysis and Forecasting, Luxembourg, May 10–12. Available at: https://ec.europa.eu/eurostat/web/products-statistical-working-papers/-/KS-DT-06-019 (accessed September 2023).

Harvey, A. C., and C. H. Chung. 2000. “Estimating the Underlying Change in Unemployment in the UK.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 163 (3): 303–39. DOI: https://doi.org/10.1111/1467-985X.00171.

Harvey, A. C., and Durbin. 1986. “The Effects of Seat Belt Legislation on British Road Casualties: A Case Study in Structural Time Series Modelling.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 149 (3): 187–210. DOI: https://doi.org/10.2307/2981553.

Harvey, A. C., and S. J. Koopman. 1992. “Diagnostic Checking of Unobserved-Components Time Series Models.” Journal of Business and Economic Statistics 10 (4): 377–89. DOI: https://doi.org/10.1080/07350015.1992.10509913.

Helske. 2017. “KFAS: Exponential Family State Space Models in R.” Journal of Statistical Software 78 (10): 1–39. DOI: https://doi.org/10.18637/jss.v078.i10.

Henningsen and J. D. Hamann. 2007. “Systemfit: A Package for Estimating Systems of Simultaneous Equations in R.” Journal of Statistical Software 23 (4): 1–40. DOI: https://doi.org/10.18637/jss.v023.i04.

Hindrayanto, Aston, J. A. D., Koopman, S. J., and Ooms. 2013. “Modelling Trigonometric Seasonal Components for Monthly Economic Time Series.” Applied Economics 45 (21): 3024–34. DOI: https://doi.org/10.1080/00036846.2012.690937.

Jentoft. 2022. “The Norwegian Labour Force Survey Sampling Design.” Documents 2022/34, Statistics Norway. Available at: https://www.ssb.no/arbeid-og-lonn/sysselsetting/artikler/the-norwegian-labour-force-survey-sampling-design (accessed September 2023).

Koopman, S. J. 1997. “Exact Initial Kalman Filtering and Smoothing for Nonstationary Time Series Models.” Journal of the American Statistical Association 92 (440): 1630–8. DOI: https://doi.org/10.1080/01621459.1997.10473685.

Koopman, S. J., and Durbin. 2000. “Fast Filtering and Smoothing for Multivariate State Space Models.” Journal of Time Series Analysis 21 (3): 281–96. DOI: https://doi.org/10.1111/1467-9892.00186.

Koopman, S. J., and Durbin. 2003. “Filtering and Smoothing of State Vector for State-Space Models.” Journal of Time Series Analysis 24 (1): 85–98. DOI: https://doi.org/10.1111/1467-9892.00294.

Krueger, A. B., Mas, and Niu. 2017. “The Evolution of Rotation Group Bias: Will the Real Unemployment Rate Please Stand Up?” Review of Economics and Statistics 99 (2): 258–64. DOI: https://doi.org/10.1162/REST_a_00630.

Kumar and Lee. 1983. “Evaluation of Composite Estimation for the Canadian Labour Force Survey.” Survey Methodology 9 (2): 178–201.

Nguyen, N. D., and L.-C. Zhang. 2020. “An Appraisal of Common Reweighting Methods for Nonresponse in Household Surveys Based on the Norwegian Labour Force Survey and the Statistics on Income and Living Conditions Survey.” Journal of Official Statistics 36 (1): 151–72. DOI: https://doi.org/10.2478/jos-2020-0008.

Oguz-Alper. 2018. New Estimation Methodology for the Norwegian LFS. Documents 2018/16, Statistics Norway. Available at: https://www.ssb.no/en/arbeid-og-lonn/artikler-og-publikasjoner/new-estimation-methodology-for-the-norwegian-labour-force-survey (accessed September 2023).

Oguz-Alper. 2023. Weighting Methodology for the Norwegian Labour Force Survey from 2021 Onwards. Documents 2023/51, Statistics Norway. Available at: https://www.ssb.no/en/arbeid-og-lonn/sysselsetting/artikler/weighting-methodology-for-the-norwegian-labour-force-survey-from-2021-onwards (accessed March 2024).

Pfeffermann. 1991. “Estimation and Seasonal Adjustment of Population Means Using Data from Repeated Surveys.” Journal of Business and Economic Statistics 9 (2): 163–75. DOI: https://doi.org/10.1080/07350015.1991.10509840.

Pfeffermann. 2022. “Time Series Modelling of Repeated Survey Data for Estimation of Finite Population Parameters.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 185 (4): 1757–77. DOI: https://doi.org/10.1111/rssa.12950.

Pfeffermann and Burck. 1990. “Robust Small Area Estimation Combining Time Series and Cross-Sectional Data.” Survey Methodology 16 (2): 217–37. Available at: https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X199000214534 (accessed March 2024).

Pfeffermann, Feder, and Signorelli. 1998. “Estimation of Autocorrelations of Survey Errors with Application to Trend Estimation in Small Areas.” Journal of Business and Economic Statistics 16 (3): 339–48. DOI: https://doi.org/10.1080/07350015.1998.10524773.

Pfeffermann and Tiller. 2006. “Small-Area Estimation with State-Space Models Subject to Benchmark Constraints.” Journal of the American Statistical Association 101 (476): 1387–97. DOI: https://doi.org/10.1198/016214506000000591.

Proietti. 2000. “Comparing Seasonal Components for Structural Time Series Models.” International Journal of Forecasting 16 (2): 247–60. DOI: https://doi.org/10.1016/S0169-2070(00)00037-6.

Rao, J. N. K., and Yu. 1994. “Small-Area Estimation by Combining Time-Series and Cross-Sectional Data.” Canadian Journal of Statistics 22 (4): 511–28. DOI: https://doi.org/10.2307/3315407.

Schiavoni, Palm, Smeekes, and J. A. van den Brakel. 2021. “A Dynamic Factor Model Approach to Incorporate Big Data in State Space Models for Official Statistics.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 184 (1): 324–53. DOI: https://doi.org/10.1111/rssa.12626.

Scott, A. J., and T. M. F. Smith. 1974. “Analysis of Repeated Surveys Using Time Series Methods.” Journal of the American Statistical Association 69 (347): 674–8. DOI: https://doi.org/10.1080/01621459.1974.10480187.

Scott, A. J., Smith, T. M. F., and R. G. Jones. 1977. “The Application of Time Series Methods to the Analysis of Repeated Surveys.” International Statistical Review 45 (1): 13–28. DOI: https://doi.org/10.2307/1403000.

Srivastava, V. K., and T. D. Dwivedi. 1979. “Estimation of Seemingly Unrelated Regression Equations: A Brief Survey.” Journal of Econometrics 10 (1): 15–32. DOI: https://doi.org/10.1016/0304-4076(79)90061-7.

Stephan, F. F., Frankel, L. R., and Teper. 1954. The Measurement of Employment and Unemployment by the Bureau of the Census in Its Current Population Survey. Report of the Special Advisory Committee on Employment Statistics. Available at: https://books.google.no/books/about/The_Measurement_of_Employment_and_Unempl.html?id=uWfuRtR9c60C (accessed September 2023).

Tiller, R. B. 1992. “Time Series Modelling of Sample Survey Data from the U.S. Current Population Survey.” Journal of Official Statistics 8 (2): 149–66. Available at: https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/time-series-modeling-of-sample-survey-data-from-the-u.s.-current-population-survey.pdf (accessed March 2024).

Van den Brakel, J. A. 2022. “Monthly Labour Force Figures During the 2021 Redesign of the Dutch Labour Force Survey.” Discussion Paper, Statistics Netherlands. Available at: https://www.cbs.nl/-/media/_pdf/2022/03/lfs-redesign-2021.pdf (accessed September 2023).

Van den Brakel, J. A., and Krieg. 2009. “Estimation of the Monthly Unemployment Rate Through Structural Time Series Modelling in a Rotating Panel Design.” Survey Methodology 35 (2): 177–90. Available at: https://www150.statcan.gc.ca/n1/pub/12-001-x/2009002/article/11040-eng.pdf (accessed September 2023).

Van den Brakel, J. A., and Krieg. 2015. “Dealing with Small Sample Sizes, Rotation Group Bias and Discontinuities in a Rotating Panel Design.” Survey Methodology 41 (2): 267–96. Available at: https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201500214231 (accessed March 2024).

Van den Brakel, J. A., and Krieg. 2016. “Small Area Estimation with State-Space Common Factor Models for Rotating Panels.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 179 (3): 763–91. DOI: https://doi.org/10.1111/rssa.12158.

Van den Brakel, J. A., and Michiels. 2021. “Nowcasting Register Labour Force Participation Rates in Municipal Districts Using Survey Data.” Journal of Official Statistics 37 (4): 1009–45. DOI: https://doi.org/10.2478/jos-2021-0043.

Van den Brakel, J. A., and Roels. 2010. “Intervention Analysis with State-Space Models to Estimate Discontinuities Due to a Survey Redesign.” Annals of Applied Statistics 4 (2): 1105–38. DOI: https://doi.org/10.1214/09-AOAS305.

Van den Brakel, J. A., Smith, P. A., and Compton. 2008. “Quality Procedures for Survey Transitions—Experiments, Time Series and Discontinuities.” Survey Research Methods 2 (3): 123–41. DOI: https://doi.org/10.18148/srm/2008.v2i3.68.

Van den Brakel, J. A., Souren, and Krieg. 2022. “Estimating Monthly Labour Force Figures During the COVID-19 Pandemic in the Netherlands.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 185 (4): 1560–83. DOI: https://doi.org/10.1111/rssa.12869.

Van den Brakel, J. A., Zhang, and S.-M. Tam. 2020. “Measuring Discontinuities in Time Series Obtained with Repeated Sample Surveys.” International Statistical Review 88 (1): 155–75. DOI: https://doi.org/10.1111/insr.12347.

Zhang, L.-C., Thomsen, and Ø. Kleven. 2013. “On the Use of Auxiliary and Paradata for Dealing with Non-Sampling Errors in Household Surveys.” International Statistical Review 81 (2): 270–88. DOI: https://doi.org/10.1111/insr.12009.
