Lack of census data and use of electoral roll-based sampling frame in specific studies

Selecting households and individuals is one of the most important tasks in developing a sampling design. Traditionally, in developing countries, where population data is mostly available through government censuses, this information forms the basis of the house-listing carried out to select households for a sample survey. In India as well, researchers mostly depend on census data to construct robust and representative sampling frames, using a house-listing exercise to implement a particular sampling technique, unlike in the USA and similar countries where sampling frames are readily available. House-listing is cumbersome and demands considerable time and financial resources, which may at times deter researchers from using probability-based sampling techniques and lead them to resort to purposive sampling, which is often not representative and may not yield unbiased data.

In this context, IWWAGE explored using electoral rolls as an alternative to the population census as a sampling frame for selecting households or individuals in some of our recent studies. Using electoral rolls for studies aimed at policy-making is a relatively new trend. In the absence of updated census data, and to minimise time and resource requirements, electoral rolls may be useful as sampling frames, although their use for household surveys remains relatively sparse in India (Vaishnav, 2021; Joshi et al. 2020). At IWWAGE, we have used an electoral roll sampling frame to select individuals for the surveys in two of our studies on labour force participation. Data collection for the first study, ‘Women’s Labour Force Participation in Select States in India’, was conducted between November 2021 and January 2022; the other is an ongoing study, ‘Capturing women’s work to measure better’.

The completed study primarily aimed to unpack the enablers of and barriers to women’s labour force participation and to suggest actionable points based on the findings. For this study, approximately 5000 females and 1000 males were interviewed across five states of India: Jharkhand, Karnataka, Delhi, Rajasthan, and Madhya Pradesh. The main objectives of the ongoing study are to capture women’s work comprehensively by identifying the varied, yet major, forms of paid and unpaid activities, listing activities by category of work, and developing a mechanism for estimating the simultaneity of women’s engagement. Approximately 4000 females and 800 males have been surveyed in Jharkhand and Karnataka for this study.

It is worth mentioning that, in a multi-stage cluster sampling context, selecting individuals (or households) from an electoral roll frame entails selecting polling booths in the preceding stage of sample selection, as opposed to the more traditional approach of selecting villages in rural areas and census enumeration blocks (CEB) or urban frame survey (UFS) blocks in urban areas. In this note, we outline the advantages and challenges of using electoral rolls to select individuals, based on our experience of conducting the two studies mentioned above.
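The two-stage idea can be illustrated with a minimal sketch. It assumes the digitized rolls are held as a mapping from polling booth to a list of voters, and it selects booths by simple random sampling; the function name, the booth and voter labels, and the sample sizes are all hypothetical, and an actual design may select booths differently (for example, with probability proportional to size).

```python
import random


def two_stage_sample(rolls, n_booths, n_per_booth, seed=0):
    """Select polling booths first, then individuals within each selected booth.

    rolls: dict mapping a booth id to the list of voters on that booth's roll.
    Booths are chosen by simple random sampling (an assumption of this sketch).
    """
    rng = random.Random(seed)
    # Stage 1: draw the polling booths.
    booths = rng.sample(sorted(rolls), n_booths)
    # Stage 2: draw individuals from each selected booth's electoral roll.
    sample = {}
    for booth in booths:
        voters = rolls[booth]
        sample[booth] = rng.sample(voters, min(n_per_booth, len(voters)))
    return sample


# Hypothetical toy frame: 10 booths with 50 listed voters each.
rolls = {f"booth_{i}": [f"voter_{i}_{j}" for j in range(50)] for i in range(10)}
selected = two_stage_sample(rolls, n_booths=3, n_per_booth=5)
for booth, voters in selected.items():
    print(booth, voters)
```

Fixing the seed keeps the draw reproducible, which helps when the selected list has to be audited or re-created later.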

Advantages of using electoral rolls in constructing the sampling frame

In addition to being time- and cost-efficient, electoral rolls offered several other advantages as an alternative sampling frame, particularly in our studies. They give direct access to individual-level information such as age and gender, which enables selection of a random sample stratified on individual characteristics. They also allow us to minimise respondent bias by interviewing each selected individual directly, rather than eliciting information from a single member of the sample household who responds ‘on behalf’ of others and may introduce certain biases.

Electoral roll-based sampling is also a better alternative to non-probabilistic sampling methods, in which sample selection often relies on ease of access to respondents and thus leads to a non-random, non-representative sample.
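As an illustration of stratifying on individual characteristics, here is a minimal sketch that groups the voters of one booth by gender and age band and draws an equal number from each group. The record fields, the age bands, and the per-stratum sample size are assumptions made for this example, not the allocation used in our studies.

```python
import random
from collections import defaultdict


def age_band(age):
    """Hypothetical two-band split; real studies may use finer age groups."""
    return "18-24" if age <= 24 else "25+"


def stratified_sample(voters, n_per_stratum, seed=0):
    """Draw a simple random sample within each (gender, age band) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for voter in voters:
        strata[(voter["gender"], age_band(voter["age"]))].append(voter)
    sample = {}
    for key in sorted(strata):
        members = strata[key]
        # Take every member if the stratum is smaller than the requested size.
        sample[key] = rng.sample(members, min(n_per_stratum, len(members)))
    return sample


# Hypothetical roll of 200 voters for a single polling booth.
voters = [
    {"name": f"v{i}", "gender": "F" if i % 2 == 0 else "M", "age": 18 + (i % 40)}
    for i in range(200)
]
sample = stratified_sample(voters, n_per_stratum=5)
for stratum, members in sample.items():
    print(stratum, [m["name"] for m in members])
```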

Challenges of using electoral rolls in constructing the sampling frame

However, electoral roll-based sampling poses a few challenges, as described below.

  1. Categorizing polling booths into rural and urban centers: For a few states, a list of polling booths classified as rural or urban is not directly available anywhere. In those cases, each polling booth has to be located on Geographic Information System (GIS) software maps and categorized on the basis of the information provided in the software. For example, in Karnataka, to determine the rural/urban location of a polling booth, it has to be located on the GIS map and then classified depending on whether it falls under a hobli (indicating a rural area) or a town (indicating an urban area).
  2. Translation from local language: In a few states (for example, Karnataka), the list of polling booths and the electoral rolls are available in local languages only. Translating them into English and digitizing them increases the risk of errors and requires robust monitoring and quality checks.
  3. Unavailability of voter rolls in convertible PDFs: Voter rolls are sometimes available online as standard PDFs, but in many cases they are available only as scanned copies of voter lists. These are difficult to convert into Excel files, so the entries for listed individuals sometimes have to be made manually, which increases the cost and time of digitization. It also adds the cost of manual supervision once the entries are completed in the Excel file. For a large-scale or nationally representative study, making entries manually will be challenging.


Challenges arising for the electoral roll-based sampling method while implementing the survey

  1. Less frequent updating of the voter rolls: Infrequent updating of the voter rolls makes it difficult to locate respondents, especially in urban areas with high intra-city or inter-city out-migration. Across our two studies, respondents could not be located due to out-migration in about 20-30% of instances. However, as the geographical area of the survey expands, this percentage comes down.
  2. Difficulty in locating respondents in dense settlements: In densely populated urban areas, houses located near the boundaries of a polling booth often get excluded from the electoral roll of their own polling booth and are enrolled in the electoral rolls of adjacent polling booths. This happens because there is a cap on the number of voters in a polling booth; once the limit is reached, the remaining voters are enrolled in a neighbouring polling booth.
  3. Difficulty in locating women in the younger age cohort: We also found that locating women aged 18-24 years is far more difficult than locating others. Younger women are much less likely to be listed in the voter rolls than other individuals, and they more often relocate after marriage, which makes them untraceable in that particular polling booth.
  4. False entries: Sometimes names or other details, such as age, do not match exactly, leading to minor mismatches between the electoral roll entry and the individual's actual information. False entries are also found in the voter lists.
  5. Voters not residing in the delimited area of a particular polling booth: In some cases, most of the respondents selected from the voter list of a particular booth actually reside in a village far from the area demarcated for that polling booth. This is because voters in a particular area are sometimes assigned to polling booths other than the one officially demarcated for the area.

To tackle non-response and the difficulty of locating respondents, two mitigating measures help: digitizing the data of extra polling booths as buffers, and preparing, for each group of respondents in each polling booth, a list that contains more names than the required sample size.
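One simple way to size such a buffer list, offered here only as a rough sketch, is to inflate the required sample by the expected share of respondents who cannot be located. The inflation rule and the function name below are assumptions for illustration; the 20-30% range is the not-located share reported above, and the target of 20 interviews per booth is hypothetical.

```python
def draw_size(target, not_located_pct):
    """Number of names to list so that about `target` respondents remain
    after `not_located_pct` percent of them cannot be located.

    Uses integer arithmetic to round up: ceil(target * 100 / (100 - pct)).
    """
    if not 0 <= not_located_pct < 100:
        raise ValueError("not_located_pct must be in the range [0, 100)")
    return -(-target * 100 // (100 - not_located_pct))


# Hypothetical target of 20 completed interviews per polling booth.
print(draw_size(20, 20))  # 25 names when 20% cannot be located
print(draw_size(20, 30))  # 29 names when 30% cannot be located
```

Planning for the upper end of the range is the more cautious choice, since a short list means returning to the roll mid-survey.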

Suggestions to facilitate a more convenient use of electoral rolls in constructing the sampling frame

Below are a few suggestions, drawn from our experience of using electoral rolls to construct sampling frames, that would make the process more efficient:

  • providing the list of polling booths and electoral rolls in English;
  • indicating the rural/urban location of the polling booths on the Chief Electoral Officer’s website;
  • making the electoral rolls available in convertible PDFs; and
  • more frequent updating of the electoral rolls.

Lastly, an important limitation of using electoral rolls pertains to age cohorts. Since the electoral rolls include only eligible voters, the sampling frame covers only individuals aged 18 years and above. It is therefore relevant mainly for surveys whose target population lies above that age threshold.


This blog has been authored by Dr. Sona Mitra, Director - Policy and Research, IWWAGE; Dr. Bidisha Mondal, Research Fellow, IWWAGE; Prakriti Sharma, Senior Research Associate, IWWAGE; and Aneek Chowdhury, Research Associate, IWWAGE.

We are very thankful to Dr. Santanu Pramanick for his guidance through the process of developing a sampling frame. We are also grateful to Shri P C Mohanan for his comments in both phases of using the electoral rolls for our purpose.
