Abstract
The Gulf Cooperation Council is a regional cooperation of six Middle Eastern countries—Saudi Arabia, Kuwait, the United Arab Emirates, Qatar, Bahrain, and Oman. A common feature of these countries is the existence of many group quarters, usually called labor camps, a term used to refer to housing accommodations for unskilled migrants where nonrelated people live together. The camp size ranges from a few people to a few thousand people from many different countries who speak dozens of languages. Also, the camp size and the composition of residents inside the camps change relatively quickly as people move in and out of the camps as their labor contracts expire or project needs change. This article presents one way to subsample this dynamic population inside such labor camps. The technique was used in one survey conducted in Qatar, where more than half of the country’s population resides in labor camps.
Introduction
The Gulf Cooperation Council (GCC), established in 1981, is a regional cooperation of six Middle Eastern countries. Its member countries are Bahrain, Kuwait, Oman, Qatar, Saudi Arabia, and the United Arab Emirates. In the last three decades, there has been a large influx of migrants into these countries in response to the increase in the price of oil and the subsequent plans of these countries for rapid development. These plans require bringing in a very large number of foreign workers since the indigenous labor forces are small and do not have the variety of skills required for the development of infrastructure and other projects (Dito 2010). According to recent statistics, migrants outnumber nationals in terms of the labor force in all GCC countries. Migrants also outnumber nationals in terms of population in four of the six countries (Baldwin-Edwards 2011).
A representative survey that studies the living and working conditions, as well as the attitudes and opinions of the population in these countries, would have to take into account these migrants. However, the special housing arrangement for the migrants in these countries poses some issues about the sample design of the survey. While some migrants live in ordinary household units that can be sampled by the common household sample design, many migrants live in group quarters (GQs), usually called labor camps, which may require a different sample design.
These camps are usually provided by employers and are concentrated at certain places away from ordinary residential household units. Inside the camp, the number of migrants varies significantly, ranging from a few to a few thousand, and they come from various countries and speak different languages. For financial and legal reasons, these migrants cannot bring their families with them to any country in the GCC, so there is usually no household unit in the camp. Unrelated migrants share rooms, and people from the same country tend to live in the same or adjacent rooms.
As migrants come to GCC countries with two- or three-year labor contracts, the camp size and the composition of residents inside the camps change relatively quickly. People move in and out of the camps every month when their labor contracts expire or when they follow construction projects to a new place. Thus, there is rarely an updated list of people in the camp; instead, the sample listing inside the camp has to be conducted during the fieldwork.
The large number of people from a variety of countries speaking different languages means that the selection of persons to be interviewed inside the camp plays an important part in the sampling process. If interviewers were allowed to select respondents in the camp, they would tend to select persons who live in the same room and close to the entrance gate for convenience. More importantly, interviewers would select only migrants who speak their own language; this gives zero chance of selection to those who speak a language different from the one spoken by the interviewer.
In this study, we first discuss sampling methods used in previous GQ surveys conducted in the United States and GCC countries. Based on these existing methods and our knowledge of the camps in Qatar, we present a method to sample migrants inside labor camps. The sampling method was then used in one labor camp survey in Qatar in 2014. We conclude the study with a discussion of the results and implications for future studies on this topic.
GQ Sampling in Previous Surveys
Survey sampling of populations in GQs is generally designed with two or more stages of sampling: sampling of GQs and then subsampling of residents within a GQ. The reasons for multistage sampling are twofold. First, most often there is not a complete sampling frame of residents in the GQ. Instead, the project begins with only a list of GQs with perhaps some auxiliary information such as the estimated population within each GQ. Collecting resident-level information from each GQ in order to sample residents directly would be very burdensome in time and cost. Second, it is more cost and time efficient to sample more residents within a few GQs than to sample fewer residents within many GQs, due to travel and administration barriers. Of course, for a fixed number of interviews, concentrating the sample in fewer GQs increases the variance of the estimates, but proper sample design can strike a good balance between reducing interview costs and maintaining acceptable levels of variance (Kish 1965).
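The cost and variance trade-off described above is commonly summarized by Kish's approximate design effect, deff = 1 + (m - 1)ρ, where m is the number of residents interviewed per GQ and ρ is the intra-cluster correlation. The following is a minimal sketch of that approximation, assuming equal cluster sizes; the function names and the example values of m and ρ are hypothetical and are not taken from any of the surveys discussed here.

```python
def design_effect(m, rho):
    """Kish's approximate design effect for clusters of equal size m."""
    return 1.0 + (m - 1) * rho


def effective_sample_size(n_interviews, m, rho):
    """Size of a simple random sample that would give similar precision."""
    return n_interviews / design_effect(m, rho)


# Hypothetical comparison: 600 interviews taken 5 or 20 per GQ, rho = 0.10.
for per_gq in (5, 20):
    deff = design_effect(per_gq, 0.10)
    n_eff = effective_sample_size(600, per_gq, 0.10)
    print(per_gq, round(deff, 2), round(n_eff, 1))
```

Under these assumed values, taking more interviews per GQ lowers travel costs but also lowers the effective sample size, which is the balance a design has to strike.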
The American Community Survey (ACS 2010), which has been conducting GQ interviews since 2006, draws a sample of GQs each month from a GQ-only frame. Examples of GQs included in the frame are college residence dormitories, residential treatment centers, nursing homes, military barracks, correctional facilities, and homeless shelters. The list of quarters is stratified by population size: small facilities of 15 residents or fewer and large facilities of more than 15. GQs in the small stratum are selected with equal probability, and the subsampling rate within each facility is one (interview all residents). The large stratum is list ordered by facility type and then by geographic location. The facilities are represented once in the list for each group of 10 residents they house. For example, a facility with 480 residents will be listed 48 times for selection. This method implies that more than one group of 10 can be randomly chosen from a facility.
In the ACS, the subsampling of people inside GQ facilities is implemented in the field with two visits. First, a field representative visits the facility and obtains the resident list from a GQ representative. Then, he or she identifies the residents to be interviewed through a systematic sampling procedure, using the current resident list, a predetermined interview count (all in quarters, or 10 times the number of first-stage selections as mentioned above), and a computer-generated random start. An interviewer conducts the second visit to the GQ to administer the survey to the preselected residents. If a roster of residents is not available, then the field representative asks for bed locations and creates a listing from this.
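The roster-based subsampling described above can be sketched as a systematic sample with a random start. This is only an illustration under stated assumptions: the fractional-interval rule, the function name `systematic_sample`, and the example roster are hypothetical, and the ACS field software may differ in its details.

```python
import random


def systematic_sample(residents, n_interviews, rng=None):
    """Pick n_interviews residents at a fixed interval from a random start."""
    rng = rng or random.Random()
    n_total = len(residents)
    if n_interviews >= n_total:
        return list(residents)  # small facility: interview everyone
    interval = n_total / n_interviews
    start = rng.random() * interval
    picks = []
    for i in range(n_interviews):
        # Guard against a floating-point result landing on n_total.
        index = min(int(start + i * interval), n_total - 1)
        picks.append(residents[index])
    return picks


# Hypothetical facility of 480 residents selected twice at the first stage,
# so the predetermined interview count is 2 * 10 = 20.
roster = [f"resident_{i:03d}" for i in range(1, 481)]
sample = systematic_sample(roster, 2 * 10, random.Random(2010))
print(len(sample), sample[:3])
```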
In some surveys, where the GQ has a large population, interviewing a random selection of residents can be burdensome. For this reason, an entire cluster of respondents is usually interviewed at once. Surveys of American schools, such as the Monitoring the Future study (http://monitoringthefuture.org) or the Youth Risk Behavior Surveillance System (http://www.cdc.gov/healthyyouth/yrbs), employ a three-stage sampling design: (1) geographic area; (2) school; and (3) classroom, where the final stage of sampling selects the classrooms within the schools. All students in the class are selected for interviews. Disrupting fewer classrooms for survey administration reduces the burden on both the study team and the school’s administration.
In a few cases, traditional subsampling is hindered when the GQ cannot provide an accurate list of residents nor an accurate number of occupied beds. This challenge, in some regard, is similar to the challenge of sampling households in a large geographic range. Both situations have a finite geographic area within which the potential respondents dwell, but an accurate, up-to-date list of elements (residents or households) is not available. The effort to list all the elements within the area would be too excessive. Instead, in the case of the household sampling, designs often break the geographic area into smaller areas called segments or blocks and sample from them. This is known as creating an area frame, where the frame is a grid of geographic segments that cover the entire geographic area of interest. Sampling segments or blocks, before listing and subsampling households within those areas, not only eliminates the need to identify all the elements in an area frame but also reduces the travel costs for interviewers conducting face-to-face interviews.
In GCC countries, there are several surveys on the population living inside the camps. However, there are few publicly available documents about the sample designs for these surveys. Through informal channels, we could obtain only the sample design used in Qatar by the Qatar Statistical Authority (QSA). In its 2012 Labor Force Survey, QSA used different sampling procedures for small camps (six persons or fewer) and large camps (more than six persons). The small camp sample was chosen using a two-stage probability proportional to estimated size design, where the primary sampling units (PSUs) were created by combining adjacent census blocks. The PSUs had an average of 60 small camps each, and 22 camps were selected for interview from each selected PSU. The large camp sample was chosen using a stratified two-stage sampling process. In this case, the PSUs were the individual camps, and they were stratified into three groups by estimated size: 7–500 residents, 501–2,500 residents, and more than 2,500 residents. The PSUs in stratum 1 (7–500 residents) were selected with probability proportional to size, and then five persons were selected in each camp for interview. For strata 2 and 3, all camps were selected with certainty (195 in total). For these two strata, 25 persons were sampled from each camp in stratum 2, and 50 persons were sampled from each camp in stratum 3. The documents from QSA did not specify the subsampling procedures, that is, how the camp residents were randomly selected for interview (QSA 2012).
In the following, we present our sample design for a labor camp survey conducted in Qatar in 2014. The design is based on previous designs and our knowledge of the labor camp structure in Qatar. We especially focus on the subsampling of respondents inside the labor camps. The large number of people from a variety of countries speaking different languages complicates the subsampling process. In addition, as people move in and out of the camps frequently, the camp size and the composition of residents in the camps change quickly; thus, a roster of residents or beds (used in the ACS subsampling) is usually not available. Our sampling method is designed to address these issues.
Sample Design
Qatar is the richest country in the GCC in terms of Gross Domestic Product (GDP) per capita. Qatar is also the country with the highest dependence on migrants in terms of the labor force. According to the latest census in 2010, migrants account for more than 95% of the labor force and about 90% of the total population. Of these migrants, about 30% live in ordinary household units, while the other 70% live in labor camps. This means that about 63% of the total population in Qatar (0.90 × 0.70 = 0.63) lives in labor camps. Since the labor camp migrant population represents such a large proportion of the country’s population, it is essential that the sampling design for this population be precise and unbiased in its estimates, while keeping data collection costs to a reasonable level.
In this design, the sampling frame of labor camps, provided by the sole water and electricity company in Qatar, is stratified by the size of camps, and then the selection of respondents is based on a two-stage process. First, the labor camps, or PSUs, in each stratum are randomly selected with probability proportionate to their size (PPS). The number of residents sampled per camp is uniform within strata but varies across them: larger clusters are selected from larger camps. The second stage of sample selection is the subsampling of people inside the camp with two visits. Each stage is described in more detail below.
Stage 1: Labor Camp Sampling
Our sample is drawn from a frame that was developed by the Social and Economic Survey Research Institute, with assistance from the water and electricity company, Kahramaa, the only company providing water and electricity services in Qatar. In this frame, all labor camps are listed with information about the address and the number of persons living inside. Table 1 presents the number of labor camps by municipalities in this frame. The table shows that there is a large number of camps located in Doha, the capital of Qatar, with a good number of ongoing construction projects related to Qatar hosting the Fédération Internationale de Football Association (FIFA) World Cup in 2022.
Table 1. Number of Labor Camps by Municipalities in the Frame.
Following the QSA sampling procedure, the frame is divided into strata based on size, as presented in Table 2. However, the size categories differ from the QSA groupings, as we opt to separate the small camps more finely and lump more of the larger camps together in one stratum. According to Table 2, the very small stratum, with fewer than seven persons in each camp, accounts for 5.3% of the migrant population. Meanwhile, the very large stratum, with 200 persons or more, makes up 36.5% of the migrant population. We use proportionate allocation to ensure that these proportions in the frame are preserved in the sample. The benefit of stratification is to increase the precision of statistical estimates (i.e., a decrease in the standard error); the larger the difference between strata on demographic characteristics and variables of interest, the larger the increase in precision. The characteristics of people are expected to vary with camp size. People in larger camps usually have lower income and lower education than those in smaller camps. This expectation will be verified later in the Survey Results section.
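Proportionate allocation, as used above, can be illustrated with a short sketch that splits a total number of interviews across strata in proportion to their share of the frame population. Only the very small (5.3%) and very large (36.5%) shares and the total of 645 interviews come from this article; the three middle shares, the stratum labels, and the largest-remainder rounding rule are assumptions made for illustration. The actual design allocates whole camps with a fixed number of persons per camp, which this sketch does not model.

```python
def proportionate_allocation(stratum_population, total_sample):
    """Split total_sample across strata in proportion to population."""
    total_pop = sum(stratum_population.values())
    exact = {s: total_sample * p / total_pop
             for s, p in stratum_population.items()}
    alloc = {s: int(v) for s, v in exact.items()}
    # Hand out the interviews lost to rounding, largest remainder first.
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s],
                          reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


# Shares per 1,000 persons; the three middle values are hypothetical.
shares = {"very_small": 53, "small": 150, "medium": 200,
          "large": 232, "very_large": 365}
allocation = proportionate_allocation(shares, 645)
print(allocation)
```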
Table 2. Number of Camps and Persons by Strata.
The last column of Table 2 shows the number of persons to be selected in each camp for the different strata. We selected one person per camp in the very small stratum, two persons in the small stratum, and so on. The decision to sample more persons in larger camps is based on the expectation that larger camps have more variation (or lower correlation) within their population than smaller camps. The additional interviews should capture the increased level of variation. We will show the variation across strata in some key variables in the Survey Results section.
Having stratified the frame, the camps within each stratum can be selected with PPS. Considering that a fixed number of people is selected per camp in each stratum, the PPS method helps equalize the chance of selection of labor migrants within each stratum as well as in the whole sample, owing to the proportionate allocation across strata. In other words, if the frame sizes were accurate, the data would be self-weighting and there would be no need to calculate sampling weights. However, the camp size changes so quickly that the actual camp size collected during the fieldwork sometimes differs from the one in the frame. For example, a camp that is considered small in the frame and selected through the PPS method can be found to have significantly increased in size by the time of data collection. For this camp, a sampling weight is required to offset the increase in camp size. The opposite problem occurs for very large camps that are found to have shrunk in size. Therefore, sampling weights are needed to account for the changing camp size.
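The self-weighting property and its breakdown when camp sizes change can be sketched as follows. This is a simplified illustration, not the weighting procedure used in the survey: it assumes systematic PPS selection of camps and an equal chance of selection for every resident inside a selected camp, and all names and numbers are hypothetical.

```python
import random


def pps_systematic(frame_sizes, n_camps, rng=None):
    """Select camp indices with probability proportional to frame size.

    A camp larger than the sampling step can be hit more than once.
    """
    rng = rng or random.Random()
    step = sum(frame_sizes) / n_camps
    start = rng.random() * step
    targets = [start + i * step for i in range(n_camps)]
    chosen, cumulative, t = [], 0.0, 0
    for index, size in enumerate(frame_sizes):
        cumulative += size
        while t < n_camps and targets[t] < cumulative:
            chosen.append(index)
            t += 1
    return chosen


def person_weight(frame_size, field_size, n_camps, stratum_total,
                  persons_per_camp):
    """Inverse of the overall selection probability of one resident."""
    p_camp = min(1.0, n_camps * frame_size / stratum_total)
    p_person = min(1.0, persons_per_camp / field_size)
    return 1.0 / (p_camp * p_person)


# Hypothetical stratum: 1,000 residents in the frame, 5 camps, 4 persons each.
print(person_weight(50, 50, 5, 1000, 4))    # frame size was accurate
print(person_weight(100, 100, 5, 1000, 4))  # same weight: self-weighting
print(person_weight(50, 100, 5, 1000, 4))   # camp doubled: weight doubles
```

When the field size equals the frame size, the camp size cancels out and every resident in the stratum has the same weight; when a camp has grown or shrunk, the weight scales with the ratio of field size to frame size.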
Stage 2: Subsampling Inside the Camps
As mentioned above, the subsampling of people inside the camp is an important part in the sampling process due to the large number of people from various countries speaking different languages. In the following, we describe the sampling method in general, followed by the specific steps used in the field.
Sampling method
In the ACS (2010), the subsampling inside GQ is made easier by the list of resident names living inside the quarters. However, in our labor camp survey, this list is usually not available. Furthermore, the number of residents inside the camp changes quickly, preventing the camp from tracking which and how many beds are occupied on any particular day. To tackle these issues, we take inspiration from the sampling procedures of the American school surveys and area-based frames. Instead of conducting a full listing of the camp population with potentially thousands of residents, we introduce an intermediary sampling stage—the room. The following describes the selection of the room and then bed numbers inside the camp.
First, the selection of rooms is conducted with circular systematic sampling. The systematic sampling procedure stipulates that rooms are chosen by taking every kth room in the camp, where k is called the sampling step (the ratio between the number of rooms in the camp and the number of rooms to be selected, rounded to the nearest whole number). For instance, if there are 13 rooms in a camp and four rooms need to be selected, then the sampling step to be used is the whole number part of 13/4, which is three. Next, a random number from one to 13 is generated, say number five. The selected room numbers are five, eight, 11, and one; the count wraps around after room 13, so the fourth selection, 14, becomes room one. As labor migrants from the same country tend to live in adjacent rooms, the selection of rooms by systematic sampling helps reduce the chance of selecting people from only one country, hence increasing the variation in sampled people’s characteristics.
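The room selection in the example above can be reproduced with a short sketch of circular systematic sampling. The function name and signature are hypothetical, and the step is taken as the whole-number part of the ratio, as in the worked example; a design that rounds to the nearest whole number could give a different step for other camp sizes.

```python
import random


def circular_systematic_sample(n_rooms, n_select, start=None, rng=None):
    """Return n_select room numbers (1-based), wrapping around the camp."""
    if not 1 <= n_select <= n_rooms:
        raise ValueError("n_select must be between 1 and n_rooms")
    step = n_rooms // n_select  # whole-number part of the ratio
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, n_rooms)  # random start from 1 to n_rooms
    return [((start - 1 + i * step) % n_rooms) + 1 for i in range(n_select)]


# The example from the text: 13 rooms, 4 to select, random start of 5.
print(circular_systematic_sample(13, 4, start=5))  # [5, 8, 11, 1]
```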
This step mimics the area-based sampling method of creating segments or blocks. Rooms are permanent, clear divides of the camp population, and rooms are assumed to house approximately the same number of migrants within a camp. Plus, each migrant is assigned one and only one room. This is similar to the aim of drawing area segments with recognizable, permanent boundaries and with approximately the same number of households in each.
Second, one person in each selected room is randomly chosen by his bed number. For example, if there are 10 occupied beds in the room (empty beds are not counted), the computer will randomly select one number from one to 10, say four. Then, the person in bed number four is selected for the survey. An alternative way to select a person in the room is to ask for the names of everyone in the room. However, this method is very time consuming, as some rooms can have dozens of people inside.
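The bed selection step, and the within-camp selection probability it implies, can be sketched as below. The function names and the example room counts are hypothetical. The probability calculation assumes that every room has the same chance of being chosen (rooms selected divided by rooms in the camp), which holds for circular systematic sampling with a uniformly random start and no repeated rooms.

```python
import random


def select_beds(occupied_beds_by_room, rng=None):
    """Pick one occupied bed number (1-based) in each selected room."""
    rng = rng or random.Random()
    return {room: rng.randint(1, beds)
            for room, beds in occupied_beds_by_room.items()}


def within_camp_probability(n_rooms, n_rooms_selected, occupied_beds_in_room):
    """Chance that a given resident is selected once the camp is chosen."""
    p_room = n_rooms_selected / n_rooms
    p_bed = 1.0 / occupied_beds_in_room
    return p_room * p_bed


# Hypothetical occupied-bed counts reported for four selected rooms.
beds_reported = {5: 10, 8: 6, 11: 12, 1: 8}
print(select_beds(beds_reported, random.Random(7)))
print(within_camp_probability(13, 4, 10))
```

Because rooms do not all hold the same number of residents, a person in a crowded room has a lower chance of selection than a person in a sparsely occupied one, which is one reason weights may still be needed within a camp.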
Sampling in the field
The sampling method described above is based on information on the number and location of rooms as well as the bed number. However, this information is not available in the frame, so the selection of rooms and the person inside the room has to be done during the fieldwork in two visits as follows.
First, a supervisor (with a computer) is sent to the selected camp. On arrival, he asks for the number of occupied rooms in the camp. Then, the computer (using systematic sampling) shows the room numbers to be selected. Since rooms in the camp are not usually numbered, the supervisor is instructed to count rooms from left to right, starting from the room closest to the main entrance gate. Having selected the rooms, the supervisor asks for the number of occupied beds in each selected room, and the computer randomly selects a number from one to the number of occupied beds. As with rooms, the beds are not numbered, so the supervisor counts the beds from left to right and selects the bed with the number generated by the computer. Next, the supervisor asks for the name and language spoken by the person in the selected bed. Note that he can do this with anyone who is available in the room, not necessarily with the selected person. The supervisor then leaves the camp without interviewing the selected person. Before leaving, he puts a sticker on the doors of the selected rooms.
Second, interviewers with the appropriate language skills are assigned to visit the camp to conduct the interviews with the selected persons in the camp. The interviewers locate the selected rooms in the camp with the stickers and then conduct the interview with the selected person in the room.
The main reason for the two visits to the camp (one by the supervisor and one by the interviewer) is to resolve the language issue. Without information about the language of the selected persons, we would not be able to send the right interviewer(s)—interviewer(s) with the proper language skills to conduct the interview(s)—to the camp. The quality of the data could be hampered if interviewers and respondents do not fully understand each other due to language differences. Another reason for the two visits is about the gatekeeper issue. Having a supervisor who is better trained and more experienced is sometimes necessary to gain access to the camps. Overall, the two visits increase the field cost but are needed to ensure the survey quality.
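The language matching that motivates the two visits can be sketched as a simple assignment step between the supervisor's records and the interviewer pool. Everything in this sketch is hypothetical: the record fields, the interviewer names, and the listed languages are placeholders, and the survey's actual assignment procedure is not described in this article.

```python
def assign_interviewers(selections, interviewer_languages):
    """Match each selected person to an interviewer who speaks the language.

    Returns (assignments, unmatched), where assignments maps an
    interviewer name to the list of selections they should visit.
    """
    assignments = {name: [] for name in interviewer_languages}
    unmatched = []
    for selection in selections:
        match = next((name for name, langs in interviewer_languages.items()
                      if selection["language"] in langs), None)
        if match is None:
            unmatched.append(selection)  # needs a new interviewer
        else:
            assignments[match].append(selection)
    return assignments, unmatched


# Hypothetical records collected by the supervisor on the first visit.
first_visit = [
    {"camp": "C014", "room": 5, "bed": 4, "language": "Nepali"},
    {"camp": "C014", "room": 8, "bed": 2, "language": "Hindi"},
    {"camp": "C014", "room": 11, "bed": 7, "language": "Tagalog"},
]
pool = {"interviewer_a": {"Nepali", "Hindi"}, "interviewer_b": {"Bengali"}}
assigned, unmatched = assign_interviewers(first_visit, pool)
print(assigned, unmatched)
```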
Survey Results
The 2014 Omnibus Survey of Qatar implemented the sampling method described in this article to select a sample of migrants living in labor camps. Table 3 shows the number of camps and respondents interviewed in each stratum. A total of 645 respondents from 133 labor camps were interviewed. The last column shows the proportion of respondents across strata. Approximately one-fifth of the population lives in a very small or small camp; another one-fifth lives in a medium-sized camp; one-quarter lives in a large camp; and over one-third of the population lives in a very large camp. These proportions are similar to those in the frame (see Table 2), as a result of the proportionate allocation to strata.
Table 3. Distribution of Camps and Respondents across Strata.
Table 4. Camp Size Change from One Stratum to Another Stratum.
Camp Size: Frame Information and Field Observation
The sampling frame provided by Kahramaa includes estimates of the number of migrant workers living in each labor camp. The estimates often do not match the numbers reported by the camp representatives when the field supervisor conducts the first visit to the camps. Camp sizes change rapidly due to current project needs and workers constantly arriving from and leaving for their home countries. This results in some camps being miscategorized in the strata. In some instances, a camp that had been placed in the stratum for small-sized camps (seven–15 residents), based on initial size information from the sample frame, may actually have more than 15 residents when the field supervisor first visits the camp for the survey. This means the camp should have been placed in a different stratum if information about the true camp size was known during sample design. Table 4 shows how often this situation occurred in the 2014 Omnibus Survey fieldwork. The numbers in the diagonal show the number of camps with no change from the frame to the field, while the numbers off the diagonal show the difference between the frame and the field. For example, 17 camps were placed in the stratum for very small camps; 11 of the 17 camps were found to actually have two to six residents. However, five labor camps were found to have increased in size to have between seven and 19 residents, and one camp had increased to have 20–49 residents. For each stratum, we do observe a large proportion of camps increasing or decreasing their numbers to the extent that they are changing their camp size stratum.
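A cross-tabulation of frame stratum against field stratum, of the kind reported in Table 4, can be built with a few lines of code. The sketch below is illustrative only: the stratum boundaries (2–6, 7–19, 20–49, 50–199, and 200 or more residents) are assumptions pieced together from the size ranges mentioned in the text, and the example camps are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Assumed lower bounds of strata 2 to 5; stratum 1 is anything below 7.
LOWER_BOUNDS = [7, 20, 50, 200]


def stratum(size):
    """Return the stratum number (1 to 5) for a camp of the given size."""
    return bisect_right(LOWER_BOUNDS, size) + 1


def transition_table(camps):
    """Count camps by (frame stratum, field stratum).

    camps is a list of (frame_size, field_size) pairs; cells with equal
    strata lie on the diagonal and indicate no change of stratum.
    """
    return Counter((stratum(frame), stratum(field)) for frame, field in camps)


# Hypothetical camps: (size in the frame, size reported on the first visit).
example_camps = [(4, 5), (6, 12), (5, 30), (15, 14), (60, 250), (300, 180)]
table = transition_table(example_camps)
stayed = sum(n for (f, a), n in table.items() if f == a)
print(dict(table), stayed)
```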
Migrant Worker Demographic Characteristics by Strata
The extent to which camps are miscategorized into strata, as demonstrated by the previous table, leads us to question whether the stratification process is still worthwhile. Note that the main goal of stratification is to increase the precision of the estimates, and this goal can be achieved only if there are significant differences in population characteristics across strata. Table 5 provides evidence that the respondents in each stratum are significantly different from each other on several demographic characteristics.
Table 5. Demographic Differences across Strata.
Note: ANOVA = analysis of variance.
Respondents in strata 1 and 2 (very small and small camps, as estimated in the frame) are generally older, by four to five years, than respondents in other strata. The respondents in stratum 1 are more likely to have completed some postsecondary education than those in strata 2–5. In general, respondents’ level of income decreases from stratum 1 to stratum 5. Marriage rates of respondents differed across strata but not in a linear pattern like the other characteristics. We use analysis of variance to test for differences across strata. The p values of the tests are presented in the last row. Marital status is statistically significant at the 5% level, while the other demographics (age, education, and income) are all significant at the 1% level. These data suggest that the stratification, although flawed, is still useful in the sampling process.
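The test statistic behind a one-way analysis of variance across strata can be computed directly, as in the minimal sketch below. The function name and the toy data are hypothetical and do not reproduce the survey results; obtaining a p value additionally requires the F distribution, for which a statistics library such as SciPy (`scipy.stats.f_oneway`) is a common choice.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of stratum means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of respondents around their own stratum mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data: a numeric characteristic for respondents in three strata.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```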
Respondent Nationality
Table 6 displays the tabulations of respondents’ home countries. One-third (33%) of the sample is from Nepal. India is the second most common home country (28.5%). Approximately one in six labor migrants (15.5%) is from Bangladesh. Other common home countries are Sri Lanka (5.9%), Egypt (4.5%), Pakistan (4.1%), and the Philippines (3.0%). The variety of nationalities shows the importance of matching the interviewer’s language to the respondent’s language. This justifies the use of two visits during the fieldwork, whereby the respondent’s language is identified in the first visit and an interviewer with the right language can be selected for the second visit.
Table 6. Respondent Distribution by Nationalities.
Camp and Respondent Response Rates by Strata
Camps had an overall response rate of 83%, and once inside a cooperating camp, selected residents responded overall at a rate of 97%. Overall and stratum-specific response rates are reported in Table 7. The response rates for camps were highest for very small and small camps (93% and 92%, respectively), while the medium, large, and very large camps responded at a lower rate of 72%, 76%, and 75%, respectively. Respondent response rates were very high for all groups, with the lowest response rate of 94%, from the very large camp.
Table 7. Response Rates by Strata.
Intra-camp Correlation
In our sample design, the number of selected persons in each camp varies across strata. For example, in the very small stratum, only one person is selected from each camp, while 16 are chosen in the very large stratum. The justification for this difference is based on our expectation that there is more variation in the big camps than the small camps. To check this expectation, we look at the intra-cluster correlation (ICC) coefficients across strata for some demographics and key variables of interest (see Table 8).
Table 8. Intra-cluster Correlation Coefficients.
For age of respondent, the ICC for each stratum is relatively small (ρ = .08, .25, .03, .07). For level of education of respondents, the ICC decreased in value from stratum 2 to stratum 5. The level of variation in education level within a camp is greater in large camps than in small ones, which is what the design team assumed when determining subsampling rates. The ICC for respondents’ level of income followed a similar pattern to age, in that the ICC for each stratum was generally low (ρ = .17, .28, .10, .17), while the overall ICC was higher at 0.47. Thus, demographic variables tend to follow one of two patterns: Either the ICC is high among smaller labor camps and decreases with labor camp size (in the case of education level); or the ICC is relatively stable within any stratum, but intra-strata correlation is evident (in the cases of age and income). Table 8 includes ICC values for two variables of substantive interest. Job satisfaction and work treatment satisfaction (both on a scale of 1–5 with 5 = “very satisfied”) followed the first pattern described above. The ICC was higher in the smaller camps, and the coefficients decrease in value as the camp size grows.
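One common way to estimate an ICC from clustered data is the one-way ANOVA estimator, sketched below with an adjustment for unequal cluster sizes. This article does not state which estimator was used for Table 8, so the choice here is an assumption, and the function name and toy data are hypothetical. Note that camps with a single respondent carry no within-camp information, so the estimator needs at least some camps with two or more respondents.

```python
def icc_oneway(clusters):
    """ANOVA estimator of the intra-cluster correlation.

    clusters is a list of lists, one list of numeric responses per camp.
    """
    clusters = [c for c in clusters if len(c) > 0]
    k = len(clusters)
    n_total = sum(len(c) for c in clusters)
    if k < 2 or n_total <= k:
        raise ValueError("need 2+ clusters and some clusters of size 2+")
    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]
    ss_between = sum(len(c) * (m - grand_mean) ** 2
                     for c, m in zip(clusters, means))
    ss_within = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Average cluster size adjusted for unequal cluster sizes.
    n0 = (n_total - sum(len(c) ** 2 for c in clusters) / n_total) / (k - 1)
    denominator = ms_between + (n0 - 1) * ms_within
    if denominator == 0:
        raise ValueError("no variation in the data")
    return (ms_between - ms_within) / denominator


# Toy data: residents of the same camp give identical answers.
print(icc_oneway([[1, 1], [2, 2], [3, 3]]))
```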
Discussion
Migrant workers residing in labor camps represent a large and growing segment of the population in Qatar. In light of the criticism about the living and working conditions for the 2022 World Cup construction workers, the demand to study and understand the population has also grown. It is a unique population to sample and interview because they are very diverse ethnically and linguistically as well as rapidly changing in size and location.
This article presents one way to sample migrant laborers using a stratified two-stage selection approach. The camps, or PSUs, are proportionately stratified by camp size, and then the selection of camps is conducted with the probability proportional to size method. Subsampling is conducted in the field by a supervisor, since updated lists of camp residents often do not exist. The supervisor systematically selects room(s) in the camp and then randomly selects one occupied bed in each selected room. An interviewer with the needed language skills visits the camp at another time to conduct the interview with the chosen respondent.
Stratification is valuable, although camps may drastically change in size between the time the size is recorded in the frame and the time the field team conducts the interviews. However, the changes do create problems elsewhere in the sampling process. Foremost among these is the effect on the sampling weights. A camp that is believed to be small, is selected through the PPS method, and is then found to have doubled or tripled in size by the time of data collection will yield an extremely large sampling weight; the opposite problem occurs for very large camps that are found to have rapidly shrunk in size. The extremely high and extremely low weight values inflate the survey’s variance estimates. Although weight trimming helps mitigate this problem, more work needs to be done on how to obtain more accurate camp sizes for the frame or to find methods that mitigate the effects of the rapid changes.
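The effect of extreme weights on variance, and the way trimming reduces it, can be illustrated with a small sketch. The trimming rule shown here (capping weights at a multiple of the median and rescaling to preserve the weight total) is an assumption chosen for illustration and is not the rule used in the survey; the variance inflation is approximated with Kish's formula, n Σw² / (Σw)². All names and numbers are hypothetical.

```python
from statistics import median


def trim_weights(weights, cap_multiple=3.0):
    """Cap weights at cap_multiple times the median, then rescale.

    Rescaling keeps the sum of weights unchanged; this single pass can
    push the largest weights slightly above the cap again.
    """
    cap = cap_multiple * median(weights)
    trimmed = [min(w, cap) for w in weights]
    scale = sum(weights) / sum(trimmed)
    return [w * scale for w in trimmed]


def kish_weight_deff(weights):
    """Kish's approximate variance inflation due to unequal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical weights: one camp grew sharply after the frame was built.
raw = [1.0, 1.0, 1.0, 1.0, 10.0]
trimmed = trim_weights(raw)
print(trimmed, round(kish_weight_deff(raw), 3),
      round(kish_weight_deff(trimmed), 3))
```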
The analysis of ICC overall and within each stratum revealed two relationship patterns. First, just as the design team had estimated, the ICC values are highest in the very small camp strata, and the values gradually decrease as the strata’s average camp size increases. When the ICC values change across strata, an efficient sample design will vary the number of elements selected in each cluster, just as the design does now.
Most subsampling procedures require a full listing of elements. The proposed sample design approach takes methods from multiple-stage sampling designs and area listing frames to subsample within labor camps lacking up-to-date resident lists. Choosing rooms in a camp with a systematic random procedure overcomes the problem while maintaining the ability to select a subsample as diverse as a simple random subsample.
We hope that this proposed sampling design, together with the limitations noted above, will add to the discussion of the unique challenges in sampling diverse and dynamic populations in GQs, such as the migrant workers residing in labor camps in Qatar or in other GCC countries.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
