Consumer expenditure survey: Its new methodology is superior

T. C. A. Anant, 12 Jun 2024, The Mint


Its data quality has improved even if changes in how Indian households are polled reduce its comparability with the past.

India’s ministry of statistics and programme implementation (Mospi) recently released a full report on the results of the Household Consumer Expenditure Survey (HCES) along with unit-level data. A factsheet highlighting some key results had been released earlier this year. This survey has made a number of changes in methodology, which have caused some confusion among the commentariat. We will briefly examine these changes and their rationale.

Changes in Schedules of Inquiry: The HCES 2022-23 canvassed household consumer expenditure through four sub-schedules, three covering consumption, i.e. Food (FDQ), Consumables and Services (CSQ) and Durables (DGQ), and one covering Household Characteristics (HCQ).

The HCQ schedule was canvassed in the first visit, along with one of the three consumption schedules, the remaining two being canvassed over two separate visits later. As a consequence, each household was visited thrice over a three-month period. The sequence in which the schedules were canvassed was randomized to eliminate any bias on account of schedule ordering.
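The visit plan described above can be sketched in a few lines of Python. This is only an illustration: the article does not say how the randomization was actually carried out, so the uniform shuffle, the function name `visit_plan` and the seed are all assumptions made for the example.

```python
import random

# The three consumption schedules named in the article.
CONSUMPTION_SCHEDULES = ["FDQ", "CSQ", "DGQ"]

def visit_plan(rng):
    """Return a hypothetical three-visit plan for one household.

    Assumption: the order of the consumption schedules is a uniform
    random shuffle. HCQ is always canvassed in the first visit.
    """
    order = CONSUMPTION_SCHEDULES[:]
    rng.shuffle(order)
    return [["HCQ", order[0]], [order[1]], [order[2]]]

rng = random.Random(2022)  # fixed seed so the sketch is repeatable
plans = [visit_plan(rng) for _ in range(6000)]

# How often each consumption schedule is canvassed in the first visit.
first_visit_counts = {
    s: sum(plan[0][1] == s for plan in plans) for s in CONSUMPTION_SCHEDULES
}
print(first_visit_counts)
```

With a uniform shuffle, each consumption schedule is canvassed first for roughly a third of households, so no single basket is systematically pushed to the end of the process.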

These changes were introduced for a practical reason. The earlier approach was to canvass a single comprehensive Schedule of Enquiry covering all three baskets. Over time, the survey instrument became longer as new items of consumption were added.

This led to complaints of an excessively long interview. An interview could last as long as 180 minutes. Further, the order of items was fixed for each interview. The length of the schedule, along with its fixed order, gave rise to several kinds of error. Interviewer and respondent fatigue contributed to deterioration in data quality. Later items often got poor quality responses, leading to possible under-coverage.

The excessive length was also a factor in households refusing to participate in the survey. Therefore, numerous committees of the National Statistical Commission (NSC) had recommended that we rationalize and simplify the National Sample Survey (NSS) Consumer Expenditure Schedule.

These changes in the method of enquiry have implications for estimates of household consumer expenditure. Since each consumption basket is now similarly canvassed, the bias on account of positioning has been eliminated. Under-reporting of non-food items, particularly durable goods and services, gets reduced. This bias reduction is partly the reason behind the decline in the share of Food in total consumption.

Changes in stratification: National sample surveys, in order to ensure better coverage of a heterogeneous population, adopt a multi-stage stratification approach. In this survey, some changes were made in how this is done. In the rural sample, a special stratum of zero-population villages was created, and the remainder were broken into two strata, based on distance from urban centres.

This has ensured that the under-sampling of small villages, far from urban areas, is reduced. Secondly, in selected villages or urban blocks, a second-stage sub-stratification was done on the basis of land ownership in rural areas and ownership of four-wheelers in urban areas. This was a change from the earlier approach based on ‘affluence.’ The reason why we adopt second-stage stratification is to ensure that we cover all types of households adequately.

The concern has been the under-coverage of affluent households. This under-coverage happens because affluent households are fewer in number and also have a much higher non-response rate. The affluence criteria adopted earlier had two types of problems: one, the method of detecting affluence was highly subjective, and two, there was non-comparability across different villages/ urban blocks.

So an ‘affluent’ household in one village/ urban block could end up being ‘non-affluent’ in another. This low (and possibly biased) coverage of better-off households was also a factor in the under-estimation of expenditure on durable goods and services. The changed approach reduces (and possibly eliminates) subjectivity and will allow us to capture heterogeneity better.

In this context, a common misconception is that the sample size should be related to population share. In sample design, we actually do the reverse. The effort is to over-sample small or rare entities, as they are more likely to be missed. The estimates are not affected by such design changes, as they are generated by combining the sample estimate with a multiplier which reflects the importance of the observation in the population.
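A small worked example may make the role of the multiplier concrete. The numbers below are invented for illustration, and the multiplier is taken to be the simple textbook ratio of stratum population to stratum sample size; the actual HCES multipliers come from a more elaborate multi-stage design.

```python
# Hypothetical population of 1,000 households in two strata.
# The affluent stratum is 10% of the population but half the sample,
# i.e. it is deliberately over-sampled.
strata = {
    "ordinary": {"population": 900, "sample": [90, 100, 110, 95, 105]},
    "affluent": {"population": 100, "sample": [450, 500, 550, 480, 520]},
}

# Naive estimate: treat every sampled household as equally important.
all_values = [x for s in strata.values() for x in s["sample"]]
unweighted_mean = sum(all_values) / len(all_values)

# Weighted estimate: each sampled household stands in for
# (stratum population / stratum sample size) households.
weighted_total = 0.0
weight_sum = 0.0
for s in strata.values():
    multiplier = s["population"] / len(s["sample"])  # assumed simple form
    weighted_total += multiplier * sum(s["sample"])
    weight_sum += multiplier * len(s["sample"])
weighted_mean = weighted_total / weight_sum

print(unweighted_mean)  # 300.0, inflated by the over-sampled stratum
print(weighted_mean)    # 140.0, matches (900*100 + 100*500) / 1000
```

The over-sampling gives more affluent households in the sample and hence a more reliable picture of that stratum, while the multiplier restores each stratum to its true share when the overall estimate is computed.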

The new HCES and Periodic Labour Force Survey offer better estimates of aggregate characteristics because they make a better effort to capture these small but important components of the population.

Implications of improved estimates: Since new HCES estimates of consumer expenditure are a more accurate representation of consumption patterns than those of past surveys, some people argue that we therefore need a new poverty measure. As I had argued in this column earlier, while there is a case for revisiting poverty measurement, it is not because of changes in measurement protocols, but because of changes in behaviour.

A second query often raised is whether we should have canvassed, for purposes of comparability, a separate sample based on the old Schedule and design. This was done in 2009 and 2011, when a new Schedule based on the Modified Mixed Reference Period (MMRP) was introduced. There are two problems with this suggestion. First, the current HCES already has a very large sample of 18 households in each village/ block.

A sub-sample using the old design would mean adding at least eight more households in each first-stage sampling unit (FSU). This would add to cost and possibly reduce data quality in smaller villages and blocks. Overlaying two separate sub-stratifications in the same FSU would add to complexity and increase non-sampling error. A second problem is that the old design was already stretching response error to breaking point.

It is possible that the failure of the 2017-18 survey was on account of such design-related complexities. This could have been analyzed if the detailed data for that survey were available. It would have been better if Mospi were to release those data-sets while highlighting their problems. After all, flawed data-sets from 1999-2000 are still available and they hold valuable lessons.

To sum up, improvements in data quality are welcome even if methodology changes raise issues of comparability with the past. The requirement of comparable data is no reason to stop changes that serve a valid purpose. Analysts will simply need to take this into account.