Module 7. Selecting households and participants

General principles of mapping and listing

Definitions

How to segment primary sampling units (PSUs)

How to select households in a cluster

Identifying and selecting eligible individuals in the household

This module provides examples and information on sampling. It is essential that an experienced sampling statistician be included on the team to develop and implement the sampling plan, and to undertake quality control measures as the survey progresses.

Definitions

Clear definitions of survey-related terms are essential because they may vary from country to country. In particular, the terms “household” and “dwelling” must be well defined to ensure that survey teams operate consistently when identifying and selecting households in the field. The Demographic and Health Survey (DHS) and the Multiple Indicator Cluster Survey (MICS) are good resources to check for meanings that are culturally appropriate, especially in societies where the family structure and marital arrangements (for example, polygamy) may add complexity to definitions.

This manual uses the following definitions:1

  • A household consists of a person or a group of persons who live together in the same dwelling unit, who share common living arrangements, who eat together, who acknowledge the same person as the head of household, and who are considered as one unit.
  • The head of household is a resident member of the household who is acknowledged by the other members as the primary decision-maker.
  • A dwelling unit is a room, or a group of rooms, normally intended as a place of residence for one household. This could be a single house, an apartment, or a group of rooms in a house. However, a dwelling unit can also be shared by more than one household.
  • A structure is a free-standing building that can have one or more dwellings for residential or commercial use. A residential structure can have one or more dwelling units; for example, it may be a single house or an apartment building.
  1. Definitions from: MICS manual for mapping and household listing (25 February 2019). New York: United Nations Children’s Fund; 2019 

General principles of mapping and listing

After having selected the clusters, the next steps are to map them and to list, select and identify households to visit for data collection. This process is essential to ensure the random selection of the survey sample, and must be included in the survey budget and timeline despite its significant field expense.

Mapping and listing can be done slightly in advance of the survey, by specialist teams that do not include survey fieldwork team members. Alternatively, they can be done immediately before data collection by the survey teams themselves. Advance mapping and household listing is the recommended option, as it has been found to be more reliable. The national statistics office usually has staff with training and experience in mapping and household listing.

Some countries have electronically available maps; in other countries the maps are hand-drawn. Maps may or may not be up to date. The household listing team needs to establish the level of accuracy before beginning segmentation or household listing from any pre-existing maps. If the available information is more than one year old, a new household listing should be conducted.

The household listing team usually consists of two to three listers. Supervisors generally oversee several teams.

The listing team should identify the physical boundaries of the PSU, commonly with the help of a local guide. If the PSU is large it may require segmentation (see below) before the team can begin to map and list. Any problems encountered during the mapping and listing process should be communicated to the supervisor.

The materials needed for the household listing activity include:

  • A manual describing all procedures for mapping and household listing
  • Felt-tipped pens (alternatively, marker or chalk) to be used in numbering structures
  • A notebook
  • Pencils and erasers
  • Maps of the selected clusters
  • A cluster information form
  • A household listing guide and form
  • A segmentation form

How to segment primary sampling units (PSUs)

Selected PSUs can be made up of a single cluster or multiple clusters. PSUs may vary widely in the number of households or population size. It would not be cost- or time-efficient to conduct a full listing of all households in large PSUs. Instead, it is more useful to subdivide them into smaller geographic areas, called segments (sometimes referred to as quadrants). Only one of these segments will be selected as the cluster eligible for sampling; it will then be mapped and listed. Box 7.1 presents the main rationale for PSU segmentation.

Box 7.1. Rationale for PSU Segmentation

Large PSUs may need to be segmented to accommodate numerous clusters within a single PSU (an example is provided in Box 5.1). When PSUs are split into discrete segments, it is important to adopt segment boundaries that are easily identifiable. If clearly identifiable boundaries are not present, it may not be appropriate to subdivide the area.

Segmentation of large PSUs into multiple clusters is also advantageous for survey logistics. A micronutrient survey generally requires the transport of a large amount of equipment for conducting anthropometric measurements and for collecting and storing samples and specimens. In some surveys, a central laboratory is set up in each cluster, and eligible participants are asked to go to this location for measurement and for collection of samples or specimens. It therefore makes sense to have smaller cluster sizes. Ideally the cluster should not contain more than three to five times the number of households that need to be selected. Thus, if 30 households are needed, a range of 90–150 households in the cluster is ideal. Larger clusters will have more variation in outcomes, decreasing the intra-cluster correlation and the design effect.

Upon arrival in a large PSU that may need segmentation, the listing team should first tour the PSU and do a quick count to estimate the number of households. In general, any PSU with well over 100 households should be subdivided into segments of approximately equal size, ideally around 90–150 households each.

Each listing team should have segmentation forms available to them in the field. These may be paper or electronic forms. Instructions and examples of forms are provided in the Mapping household listing and segmentation online tool. Segmentation and selection of a sample segment will be carried out as described in Box 7.2.

Box 7.2. How to segment and select a sample segment

  1. Draw a location map of the entire PSU (see Fig. 7.1).
  2. Conduct a quick approximate count of the number of dwellings in the whole area. Where there are large multi-dwelling structures that are likely to include many households, such as an apartment block, information should be obtained about the likely number of households in the structure.
  3. Using clear boundaries, such as roads, paths, or streams, divide the PSU into segments that contain roughly equal numbers of dwellings. There may be considerable differences in geographic size between segments, depending on population density.
  4. Indicate the boundaries of the newly created segments on the location map.
  5. Number the segments sequentially.
  6. For each segment, do a quick approximate count of the number of dwellings and likely number of households.
  7. If the segment is still too large (well over 100 households), then divide it further into smaller areas, called sub-segments.
  8. Using the “Mapping household listing and segmentation” online form, record the PSU number and locality, and indicate the number of dwellings, percentage and cumulative percentage per segment in the appropriate columns.
  9. Using a random number table or generator, select a random number between 1 and the total number of segments. A random number table is available in the online tools.
  10. Select the segment with this number as the cluster to be surveyed.
  11. Draw a full sketch map of the selected segment and list all the households found (see Fig. 7.2); a sample listing form is available in the Household Listing online tool.
  12. Select households using the systematic random sampling (SRS) process described in Module 4: Survey design.

For the purpose of segmenting a cluster, the initial count of households does not have to be precise. A close approximate count of dwellings and likely number of households is sufficient. It is acceptable to have a slightly unequal number of households per segment in order to create segments with clearly identifiable boundaries.
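The record-keeping and random selection in steps 8 to 10 of Box 7.2 can be sketched in a few lines of Python. This is only an illustration: the quick counts, the function names and the fixed seed are hypothetical, and the online segmentation form remains the reference for what is actually recorded.

```python
import random


def segmentation_table(dwelling_counts):
    """Return one row per segment: number, dwellings, percentage, cumulative percentage."""
    total = sum(dwelling_counts)
    rows = []
    cumulative = 0.0
    for number, dwellings in enumerate(dwelling_counts, start=1):
        percentage = 100.0 * dwellings / total
        cumulative += percentage
        rows.append({
            "segment": number,
            "dwellings": dwellings,
            "percentage": round(percentage, 1),
            "cumulative_percentage": round(cumulative, 1),
        })
    return rows


def select_segment(number_of_segments, rng=None):
    """Pick a segment number between 1 and number_of_segments, each equally likely."""
    if rng is None:
        rng = random.Random()
    return rng.randint(1, number_of_segments)


# Hypothetical quick counts of dwellings for a PSU divided into four segments.
quick_counts = [110, 95, 120, 105]
table = segmentation_table(quick_counts)
for row in table:
    print(row)

# The seed is fixed here only so that the illustration is repeatable.
chosen = select_segment(len(table), rng=random.Random(2020))
print("Selected segment:", chosen)
```

In the field the same draw can be made with the random number table; the code simply makes the "between 1 and the total number of segments" rule explicit.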

Fig. 7.1 shows an example of a cluster location map, while Fig. 7.2 shows a detailed sketch map of the cluster.

Fig. 7.1. Example map showing the location of the selected cluster Ngaku, Code EA01009 a

a MICS manual for mapping and household listing (25 February 2019). New York: United Nations Children’s Fund; 2019 (http://mics.unicef.org/files?job=W1siZiIsIjIwMTkvMDIvMjYvMTkvMzEvMzAvOTU5L01JQ1NfTWFudWFsX2Zvcl9NYXBwaW5nX2FuZF9Ib3VzZWhvbGRfTGlzdGluZ18yMDE5MDIyNS56aXAiXV0&sha=1822015d5f32e1e5; accessed 17 June 2020).

Fig. 7.2. Sketch map showing structures within cluster Ngaku, Code EA01009 a

a MICS manual for mapping and household listing (25 February 2019). New York: United Nations Children’s Fund; 2019 (http://mics.unicef.org/files?job=W1siZiIsIjIwMTkvMDIvMjYvMTkvMzEvMzAvOTU5L01JQ1NfTWFudWFsX2Zvcl9NYXBwaW5nX2FuZF9Ib3VzZWhvbGRfTGlzdGluZ18yMDE5MDIyNS56aXAiXV0&sha=1822015d5f32e1e5; accessed 17 June 2020).

Segmenting urban areas may be easier than segmenting rural areas. Cities and towns are usually organized into blocks or other similar units. When using census enumeration areas (EAs) in cities and larger towns, maps are often available that show streets and blocks. If they are not available, such maps can be easily drawn. A quick drive through the area will provide a sense of whether there are an approximately equal number of dwellings of similar size per block. If so, the cluster could be segmented by block or parts of blocks.

For example, in an urban PSU that includes 18 very similar blocks, where the number of dwellings per block would also be expected to be similar, estimate the number of dwellings per block. If each block contains approximately 50 dwellings, the total number would be 900. This could be divided to give seven segments of approximately 125 dwellings (or 2.5 blocks) per segment. If the number of dwellings per block or the expected number of households per dwelling varies considerably, the number in each would need to be estimated before dividing into approximately equal segments.
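The arithmetic in this urban example can be checked with a short sketch. The function name and the choice to round to the nearest whole number of segments are assumptions made for illustration; with 18 blocks of about 50 dwellings and a target of about 125 dwellings per segment, it gives seven segments averaging roughly 129 dwellings (about 2.6 blocks), in line with the approximate figures above.

```python
def plan_block_segments(n_blocks, dwellings_per_block, target_segment_size):
    """Estimate how many roughly equal segments a PSU of similar blocks yields."""
    total_dwellings = n_blocks * dwellings_per_block
    # Assumption: round to the nearest whole number of segments, with at least one.
    n_segments = max(1, round(total_dwellings / target_segment_size))
    return {
        "total_dwellings": total_dwellings,
        "segments": n_segments,
        "dwellings_per_segment": total_dwellings / n_segments,
        "blocks_per_segment": n_blocks / n_segments,
    }


plan = plan_block_segments(n_blocks=18, dwellings_per_block=50, target_segment_size=125)
print(plan)
```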

In rural areas, it is likely that several households may exist in the same compound and that the number in each compound may need to be estimated. Large clusters, such as cities, may already have political subdivisions with an estimated number of households or population size. These can be used to define segments.

Fig. 7.3 illustrates the standard decisions and steps to take depending on the original PSU size. In this example, it is assumed that the average household size is 7.4 individuals, the number of households to be assessed is 30, and the goal is to identify a segment with approximately 100 households (in other words, a segment with around three times the number of households that will be included in the sample).

There are several advantages and disadvantages to be weighed in determining the segment size. For example, if households are selected systematically over a large area, the amount of time required for the teams to get from one household to the next may make supervision more difficult. However, if a cluster size is very small, the diversity of the selected sample is reduced and the design effect (DEFF) is increased.

If the number of households in the PSU is not known, but an estimate of the population size is available, then this information can be used to determine the expected number of segments required, based on the average household size in the country and the desired number of households per segment. For example, if there is an average of 7.4 persons per household, and the aim is to identify segments of around 100 households, then the expected total population of these households would be 740. If it is known that the PSU has a population of approximately 1350, the number of segments would be:

1350 ÷ 740 = 1.8 (this will be rounded up to 2)

In this example, the PSU should be divided into two segments of approximately equal size, one of which should be randomly selected and its households listed.
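The same calculation can be written as a small helper. This is a minimal sketch: the function and parameter names are hypothetical, and always rounding up to a whole number of segments is an assumption based on the single worked example above.

```python
import math


def segments_from_population(psu_population, avg_household_size, households_per_segment):
    """Estimate the number of segments needed when only the PSU population is known."""
    # Expected population of one segment, for example 7.4 x 100 = 740 persons.
    segment_population = avg_household_size * households_per_segment
    # Assumption: round up, and never return fewer than one segment.
    return max(1, math.ceil(psu_population / segment_population))


n_segments = segments_from_population(
    psu_population=1350, avg_household_size=7.4, households_per_segment=100
)
print(n_segments)
```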

If the total estimated population of the PSU is less than 450, segmentation is not usually necessary unless the average household size is below 2.5, equivalent to approximately 180 households. It should be possible to count the total number of households in a PSU that has fewer than 150 households and select the required number of households to be surveyed.

Fig. 7.3. Example of PSU size (number of households or population size) and approaches to determining the segment to be surveyed a

a This example assumes that 30 households are to be selected in each cluster and that there are approximately 7.4 individuals per household.

How to select households in a cluster

All households in the selected cluster must be identified, and each should have an equal probability of being selected. To ensure this, the listing teams should follow the instructions and complete the household listing form (electronic or paper-based) provided in the Generic household listing form online tool. In some situations, only households with a specific population group (for example children 6–23 months of age) will be selected for the survey, in which case information to identify these households needs to be included in the household listing. Only those households will be included on the list used for the selection procedures described below.

As mentioned in Module 6: Selecting clusters, most cluster surveys sample the same number of households in each cluster. This is generally conducted using systematic random sampling of the predetermined number of households from an ordered household list to obtain wide geographic distribution.
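As an illustration of systematic random sampling from an ordered household list, the sketch below uses a fractional sampling interval and a random start. It is not the MICS template: the function name, the example figures (127 listed households, 30 to be selected) and the fixed seed are assumptions made for the example.

```python
import random


def systematic_sample(n_listed, n_needed, rng=None):
    """Return the 1-based list positions of households chosen systematically."""
    if n_needed >= n_listed:
        # Fewer households listed than required: take them all.
        return list(range(1, n_listed + 1))
    if rng is None:
        rng = random.Random()
    interval = n_listed / n_needed        # sampling interval (may be fractional)
    start = rng.random() * interval       # random start within the first interval
    positions = []
    for i in range(n_needed):
        # Truncate to a list index, then convert to a 1-based household number.
        index = min(int(start + i * interval), n_listed - 1)
        positions.append(index + 1)
    return positions


# Hypothetical cluster with 127 numbered households, 30 to be selected.
selected = systematic_sample(n_listed=127, n_needed=30, rng=random.Random(7))
print(selected)
```

Because the interval is applied across the whole ordered list, the selected households are spread over the cluster rather than concentrated in one part of it.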

If the cluster mapping and household listing have been conducted in advance, then household selection needs to be done centrally once the listing forms are completed. If it has not been done in advance, the same process will be conducted in the field using the listing developed immediately prior to the survey sample collection.

This is an example from a MICS survey:

Step 1: In the final column of the Household listing form, “Survey HH number,” start with the number 1 and assign a sequential number to each household listed that meets one of the following three criteria:

  • Occupied residential dwellings;
  • Households that refused to cooperate during household listing; or
  • Households whose occupants were temporarily absent during household listing.

Leave the cell blank if the dwelling unit is not occupied or the structure is not a residential structure. For each cluster, the number assigned to the last household on the list corresponds to the total number of households for that cluster.
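Where the listing is held electronically, the numbering rule in Step 1 could be applied as in the sketch below. The status codes and field names are hypothetical and would need to match those of the listing form actually used.

```python
# Hypothetical status codes for the three criteria that receive a survey number.
NUMBERED_STATUSES = {"occupied", "refused", "temporarily_absent"}


def assign_survey_numbers(listing):
    """Add a sequential 'survey_hh_number' to eligible records; others stay blank (None)."""
    numbered = []
    next_number = 1
    for record in listing:
        survey_number = None
        if record["status"] in NUMBERED_STATUSES:
            survey_number = next_number
            next_number += 1
        numbered.append({**record, "survey_hh_number": survey_number})
    # The last number assigned equals the total number of households in the cluster.
    return numbered, next_number - 1


listing = [
    {"structure": 1, "status": "occupied"},
    {"structure": 2, "status": "vacant"},
    {"structure": 3, "status": "refused"},
    {"structure": 4, "status": "non_residential"},
    {"structure": 5, "status": "temporarily_absent"},
    {"structure": 6, "status": "occupied"},
]
numbered, total_households = assign_survey_numbers(listing)
for record in numbered:
    print(record)
print("Total households:", total_households)
```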

Step 2: Record this total number of households in the MICS template for systematic random selection of HHs (“MICS Household Selection Template”).1 If the cluster was selected after segmentation, record the proportion that the selected segment represents in the PSU/EA in the “Proportion of the selected segment” column. See the Mapping household listing and segmentation online tool for this information. If no segmentation was carried out, leave the value of 1 in the column “Proportion of the selected segment”.

Step 3: The MICS template for systematic random selection of HHs will automatically generate the numbers for those households to be interviewed in the survey for each cluster. The selected households should be indicated on the Household listing online form by circling the corresponding number in the “Survey HH number” column. If household selection is done in the field and if it is culturally acceptable, where possible, mark the number on the door frame of the structures selected, using a marker or chalk.

Other approaches are sometimes used to identify survey households, such as having a random starting household and then selecting households in a specified direction, or using the “next nearest household,” as is frequently done in Expanded Program on Immunization (EPI) surveys. This is not recommended for micronutrient surveys because it results in a less-dispersed random selection and increases the design effect (specifically, it decreases the likely diversity of selected households).

  1. The http://mics.unicef.org/tools#survey-design site has a useful Excel sheet for the random selection of households. 

Identifying and selecting eligible individuals in the household

Rules for selecting individuals who meet different eligibility criteria need to be defined in the survey protocol. Methods include selecting all eligible individuals in each selected household, or randomly selecting one or more of the eligible individuals in a selected household, or selecting all or a random selection of individuals who meet certain eligibility requirements in every nth household, for example every 2nd or 3rd household. The agreed approach must be systematic and applied in the same way in all clusters. For example, if non-pregnant women of reproductive age is a population group of interest and there are three women meeting this criterion in the household, all three could be requested to participate. Alternatively, one of the three women may be randomly selected and requested to participate. (Note that in this case there would not be substitution by another woman if the first one refuses consent.) This methodology may be applied in all selected households or in a pre-determined number of households that are systematically selected from the household listing.
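Two of the selection rules described above (take all eligible individuals, or randomly select one) can be sketched as follows. The rule names and the example household are hypothetical; the survey protocol defines the rule that actually applies.

```python
import random


def select_participants(eligible, rule, rng=None):
    """Apply a household-level selection rule to a list of eligible individuals."""
    if rng is None:
        rng = random.Random()
    if not eligible:
        return []
    if rule == "all":
        return list(eligible)
    if rule == "one_random":
        # One individual is drawn; there is no substitution if she or he refuses.
        return [rng.choice(list(eligible))]
    raise ValueError(f"Unknown selection rule: {rule}")


# Hypothetical household with three non-pregnant women of reproductive age.
women = ["woman A", "woman B", "woman C"]
print(select_participants(women, "all"))
print(select_participants(women, "one_random", rng=random.Random(1)))
```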

The decision on selection methodology is usually based on the target sample size for a specific population group, the proportion of this group in the population and the total number of selected households per cluster, as described in Module 5: Sample size. For example, in the 2015–2016 Malawi Micronutrient Survey,1 where 22 households were selected per cluster, survey participants from different eligibility groups were enrolled according to the following schematic (see Fig. 7.4):

  • All preschool-age children from all households.
  • All non-pregnant women of reproductive age (WRA) from each of nine households.
  • All school-age children from each of six households.
  • All men from each of four households.

The required number of households for each eligible group was selected by systematic sampling from the 22 selected households. This schematic enabled the survey to meet the sample sizes required to achieve the survey objectives for each population group of interest, while ensuring that there was no unnecessary burden of purchasing and transporting survey supplies.
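A schematic of this kind could be generated as in the sketch below, which draws a systematic subsample of the 22 selected households for each group. The household counts per group come from the list above; the function names, the random starts and the seed are assumptions, so the output does not reproduce the household positions used in the Malawi survey.

```python
import random

# Households per cluster in which each group is enrolled (from the list above).
HOUSEHOLDS_PER_GROUP = {
    "preschool-age children": 22,
    "non-pregnant women of reproductive age": 9,
    "school-age children": 6,
    "men": 4,
}


def systematic_subsample(household_numbers, n_needed, rng):
    """Pick n_needed households from an ordered list using a fractional interval."""
    n_listed = len(household_numbers)
    if n_needed >= n_listed:
        return list(household_numbers)
    interval = n_listed / n_needed
    start = rng.random() * interval
    return [
        household_numbers[min(int(start + i * interval), n_listed - 1)]
        for i in range(n_needed)
    ]


def build_schematic(n_households=22, seed=None):
    """Return, for each group, the household numbers in which that group is enrolled."""
    rng = random.Random(seed)
    households = list(range(1, n_households + 1))
    return {
        group: systematic_subsample(households, n_needed, rng)
        for group, n_needed in HOUSEHOLDS_PER_GROUP.items()
    }


schematic = build_schematic(seed=2015)
for group, households in schematic.items():
    print(f"{group}: {households}")
```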

Fig. 7.4. Malawi 2015–16 data collection schematic indicating demographic group eligibility per cluster

Population group: number of the 22 selected households per cluster in which the group is enrolled
Preschool-age children: 22 (all households)
School-age children: 6
Women of reproductive age: 9
Men: 4
  1. National Statistical Office, Community Health Sciences Unit [Malawi], US Centers for Disease Control and Prevention (CDC), Emory University. Malawi Micronutrient Survey 2015-16. Atlanta: US Centers for Disease Control and Prevention; 2017 (https://dhsprogram.com/pubs/pdf/FR319/FR319.m.final.pdf, accessed 17 June 2020).