The response rate of a survey is the proportion of sampled individuals (or
households) that complete the survey. When the response rate is low, there is
potential for nonresponse bias in survey results. That is, the people who choose
to respond to the survey may answer questions differently from people who did
not participate. In such a case a survey may provide very accurate and precise
estimates for the population of people who respond to surveys, but perhaps not
for the entire population. Many surveys are conducted from lists of known
individuals, but in order to study larger populations the method of choice is
random-digit dialing (RDD)—which also presents the greatest challenges to
achieving high response rates.
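The link between response rate and bias can be made concrete with the
standard approximation for the bias of a respondent mean: the nonresponse rate
multiplied by the difference between respondents and nonrespondents. The
sketch below is illustrative only. The function name and the example
percentages are hypothetical and are not taken from any survey discussed here.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Approximate bias of the respondent mean relative to the full sample.

    bias = (1 - response_rate) * (mean_respondents - mean_nonrespondents)
    """
    if not 0.0 <= response_rate <= 1.0:
        raise ValueError("response_rate must be between 0 and 1")
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Hypothetical example: 30% of respondents and 25% of nonrespondents
# report some behavior, with a 40% response rate.
bias_low_rr = nonresponse_bias(0.40, 30.0, 25.0)   # about 3 percentage points

# Same two groups, but with a 70% response rate.
bias_high_rr = nonresponse_bias(0.70, 30.0, 25.0)  # about 1.5 percentage points

# If respondents and nonrespondents do not differ, there is no bias,
# even at a very low response rate.
bias_none = nonresponse_bias(0.10, 30.0, 30.0)

print(bias_low_rr, bias_high_rr, bias_none)
```

The last case is the important one for what follows: a low response rate
creates the potential for bias, but bias appears only when respondents and
nonrespondents actually differ on the measure in question.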
Many researchers believe that only very high response rates are acceptable for
scientific or serious policy purposes. For example, the National Center for
Education Statistics specifies a minimum response rate of 70% for RDD surveys
[1]. Although the Federal Office of Management and Budget is often cited as
mandating response rates of 80%, the most recently released standards [2]
simply state “Plan for a nonresponse bias analysis if the expected unit response
rate is below 80 percent.” Many different figures can be found by searching the
internet, but it is rare to find empirical justification for any given
recommendation.
In recent years response rate calculations have become much more
standardized, thanks to the efforts of organizations like the American
Association for Public Opinion Research (AAPOR,
http://www.aapor.org) and the Council of American Survey Research
Associations (CASRO, http://www.casro.org). Each of these organizations has
widely used formulas for response rate calculations on their web sites, with
ample supporting documentation. With random telephone surveys there will
always be a lot of uncertainty in response rate calculations because so many
telephone numbers are never answered but cannot be ruled out as potential
households. This requires estimation of the proportion of these numbers that
represent eligible households. Despite the increased standardization, it is
important to remember that many survey response rates are still calculated and
reported using different methods and are not always comparable.
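To see why the treatment of unanswered numbers matters, consider a minimal
sketch in the spirit of the published formulas, which discount
unknown-eligibility numbers by an estimated eligibility proportion (here called
e). All counts below are hypothetical, the function names are ours, and the rule
used to estimate e (the eligible share among resolved numbers) is an assumption;
the AAPOR and CASRO documentation should be consulted for exact definitions.

```python
def estimate_eligibility(known_eligible, known_ineligible):
    """Estimate e, the share of unresolved numbers assumed to be eligible,
    from the numbers whose status was resolved (an assumed rule)."""
    resolved = known_eligible + known_ineligible
    if resolved == 0:
        raise ValueError("no resolved numbers to estimate eligibility from")
    return known_eligible / resolved


def response_rate(completes, other_eligible, unknown, e):
    """Completed interviews divided by the estimated eligible households."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must be between 0 and 1")
    return completes / (completes + other_eligible + e * unknown)


# Hypothetical final dispositions for an RDD sample.
completes = 400        # completed interviews
other_eligible = 600   # refusals, noncontacts, other eligible non-interviews
ineligible = 1000      # businesses, disconnected numbers, and so on
unknown = 1000         # never answered; eligibility unresolved

e = estimate_eligibility(completes + other_eligible, ineligible)
estimated_rate = response_rate(completes, other_eligible, unknown, e)
# Bounds: no unresolved number is eligible (e = 0) or all are (e = 1).
upper_bound = response_rate(completes, other_eligible, unknown, 0.0)
lower_bound = response_rate(completes, other_eligible, unknown, 1.0)

print(f"e = {e:.2f}, response rate = {estimated_rate:.1%} "
      f"(bounds {lower_bound:.1%} to {upper_bound:.1%})")
```

With these made-up counts the same fieldwork could be reported as anything
between the lower and upper bound depending on the assumption about e, which is
one reason rates calculated by different methods are not always comparable.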
One of the few things nearly everyone agrees about in survey research is that
response rates are falling, and have been falling for many years. One widely
cited report describes response rates for the University of Michigan's national
Survey of Consumer Attitudes as falling on average one percentage point per year over
the past twenty-five years, with the decline accelerating in more recent years
[3]. This high-standard survey achieved a response rate of 72% in 1979, but had
declined to 48% in 2003 despite significant efforts (at great expense) to slow
or reverse the trend. In some states and in urban areas, the problem is more
severe. The California Health Interview Survey, one of the most extensive
single-state surveys conducted anywhere, achieved response rates of 38% in 2001
and 34% in 2003 [4]. Faced with the impossible task of meeting the standards
of the previous generation, practitioners of survey research must grapple with
difficult questions in order to provide accurate, reliable information to the
public, policy makers, and the scientific community.
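The size of the decline can be checked with simple arithmetic on the figures
quoted above. This is only an endpoint-to-endpoint average (the function name is
ours) and it does not capture the acceleration in later years that the cited
report describes.

```python
def average_annual_change(rate_start, year_start, rate_end, year_end):
    """Average change in response rate, in percentage points per year."""
    return (rate_end - rate_start) / (year_end - year_start)


# Survey of Consumer Attitudes: 72% in 1979, 48% in 2003.
sca_change = average_annual_change(72.0, 1979, 48.0, 2003)

# California Health Interview Survey: 38% in 2001, 34% in 2003.
chis_change = average_annual_change(38.0, 2001, 34.0, 2003)

print(f"SCA:  {sca_change:+.1f} points per year")
print(f"CHIS: {chis_change:+.1f} points per year")
```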
In some cases, it doesn’t make much difference
One of the most influential recent studies of response rate issues found that
national opinion survey results are remarkably robust to response rate
differences. In a study conducted in 1997 by the Pew Research Center for the
People and the Press [5], a “rigorous” survey designed to maximize response
rate (61% response rate, 8 weeks of data collection) was compared to a
short-turnaround “standard” survey more characteristic of media polling (36%
response rate, 5 days of data collection). Only 14 of 91 comparisons differed
significantly, with an average difference of about 2 percentage points (the
largest difference was 9 percentage points). Given that a 5-day poll can
produce results so similar
to the more expensive and time-consuming “rigorous” survey, how can we justify
the effort and expense required to maximize response rates?
In the first place, we must recognize that survey results are used for many
different purposes. If the goal is to achieve a general understanding of public
opinion on broad issues, there may indeed be little difference. But when the
results are being used to estimate significant public spending or health
outcomes, a difference of a few percentage points could translate to large
amounts of money or substantial numbers of individuals experiencing injury,
disease, or death. It is not clear that the “small differences” in the Pew
survey are equivalent to “no differences”. Second, even in a national survey it is
hardly “standard” for a 5-day poll to achieve a 36% response rate. In California,
even the 2001 California Health Interview Survey, which far exceeded the efforts
included in Pew's “rigorous” survey, only reached 38%. There is good reason to
believe that 36% is not the norm for the commonly reported political and media
polls.
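A rough calculation shows how “small differences” scale. The sketch below uses
the Pew figures quoted above (14 of 91 comparisons; differences of 2 and 9
percentage points) together with a purely hypothetical population of one
million adults; the population size and the function name are assumptions made
for illustration.

```python
def people_affected(population, difference_points):
    """Number of people represented by a difference in percentage points."""
    return population * difference_points / 100.0


# Share of the 91 comparisons that differed significantly.
share_differing = 14 / 91

# Hypothetical population, chosen only to make the scale easy to read.
HYPOTHETICAL_POPULATION = 1_000_000

average_case = people_affected(HYPOTHETICAL_POPULATION, 2)  # average difference
largest_case = people_affected(HYPOTHETICAL_POPULATION, 9)  # largest difference

print(f"{share_differing:.1%} of comparisons differed significantly")
print(f"2 points = {average_case:,.0f} people; 9 points = {largest_case:,.0f}")
```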
It is difficult to identify realistic figures for response rates obtained in
modern fast-turnaround media polling. A recent study conducted by Allyson Holbrook,
Jon Krosnick, and Alison Pfent involved compiling results from 114 RDD surveys
by 14 survey organizations, using common methods to calculate response rates
[6]. The response rates ranged from 4% to 70%, with field period length ranging
from 2 to 399 days. The results do not clearly show the relationship between
field period and response rate, but they do show that the short-period polls
sometimes have response rates that are quite low. The Council for Marketing
and Opinion Research (CMOR, http://www.cmor.org) reported combined
response rates of 11.7% in 2002, representing the average result from a large
number of polls conducted by member organizations [7, 8]. Of course, if the
average was 11.7%, many of the polls must have had considerably lower rates. Perhaps the
question should be: If there is little difference in the results from surveys
with response rates of 61% vs. 36%, can we expect the same comparability if we
compare surveys with response rates of 36% vs. those that are lower than 12%?
And would either compare usefully to the 80% specified in the OMB standards?
Although there is as yet no definitive research available to answer this question,
there are a few hints. Most encouraging are the conclusions of Holbrook,
Krosnick, and Pfent: “Response rates continue to decrease over time, but lower
response rates seem not to substantially decrease demographic
representativeness within the range we examined. This evidence challenges the
assumptions that response rates are a key indicator of survey data quality and
that efforts to increase response rates will necessarily be worth the effort and
expense.” Given that the range of surveys in this study included rates as low as
4%, these results should allay many concerns. However, other recent research
shows that the question is far from settled.
In some cases, it makes a big difference
The surprising resilience of RDD survey samples with regard to demographic
representation and opinion measurements may lead us to overlook instances
where respondents differ from nonrespondents on important dimensions. In an
innovative analysis from a statewide survey in Illinois, Timothy Johnson and
colleagues used multilevel models to combine census data on the ZIP code level
with response rate and substantive data from a survey on substance abuse
issues [9]. There were no correlates of nonresponse on the ZIP code level for
questions concerning driving under the influence of alcohol. However, reports of
partner violence were found to be significantly related to nonresponse depending
on small-area measures of household income and size of housing units. In
some areas, these differences could lead to overestimation of intimate partner
violence (reports of forced sex in high income areas), but in other cases
underestimation is likely (partner isolation and partner abuse in areas with
smaller housing units). The possibility that these errors may “cancel each other
out” does not diminish their potential importance.
A second example comes from the Behavioral Risk Factor Surveillance System
(BRFSS) conducted by the Centers for Disease Control and Prevention (CDC). An
analysis combining county-level census data with the 2003 national BRFSS data
found that counties with greater populations of African American residents and
residents who did not speak English had significantly lower response rates than
other counties [10].
Given that the BRFSS is offered in many languages, these results suggest that
less comprehensive survey efforts may under-represent these populations to an
even greater degree. The presence and seriousness of racial and ethnic health
disparities represent an important element of the Healthy People 2010 agenda
for improving the nation’s health [11].
At PRI, we found a similar pattern in a recent survey of public trust and
confidence in the California Courts conducted for the Judicial Council of
California, Administrative Office of the Courts [12]. The number of call attempts
required to complete interviews with residents who did not speak English was
more than 1.5 times the number required for English-speaking respondents
(mean attempts of 6.4 vs. 4.0, respectively). Respondents interviewed in
English required 7 or more attempts only 19% of the time, whereas 38% of
those interviewed in Spanish or Chinese required 7 or more call attempts before
being interviewed. Had we not conducted extensive follow-up activities over a
long study period of 3 months, we would have under-represented immigrants
(27% of California’s population) and residents who are not comfortable speaking
English (40% of California’s population older than 5 years speak English less than
“very well” according to the 2005 American Community Survey;
http://factfinder.census.gov). We also would have overestimated the percentage
of California residents who have personal experience with the California courts,
while underestimating the importance of language and child care as potential
barriers to court access.
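The effect of stopping early can be sketched from the figures above: 19% of
English interviews and 38% of Spanish or Chinese interviews required 7 or more
attempts, so a protocol capped at 6 attempts would have retained 81% and 62% of
those interviews, respectively. The share of non-English interviews in the
full sample used below (20%) is a hypothetical value chosen for illustration,
not a result from the study, and the function name is ours.

```python
def share_after_truncation(full_share, retained_group, retained_others):
    """Share of a group among completed interviews if calling stops early.

    full_share:      the group's share of interviews under full follow-up
    retained_group:  fraction of the group's interviews kept after the cap
    retained_others: fraction of everyone else's interviews kept
    """
    kept_group = full_share * retained_group
    kept_others = (1.0 - full_share) * retained_others
    return kept_group / (kept_group + kept_others)


# Ratio of mean call attempts, non-English vs. English interviews.
attempt_ratio = 6.4 / 4.0

# Fractions of interviews completed within 6 attempts (from the text).
retained_english = 1.0 - 0.19
retained_non_english = 1.0 - 0.38

# Hypothetical share of non-English interviews in the full sample.
HYPOTHETICAL_FULL_SHARE = 0.20

truncated_share = share_after_truncation(
    HYPOTHETICAL_FULL_SHARE, retained_non_english, retained_english
)

print(f"attempt ratio: {attempt_ratio:.2f}")
print(f"non-English share: {HYPOTHETICAL_FULL_SHARE:.1%} with full follow-up, "
      f"{truncated_share:.1%} if capped at 6 attempts")
```

Under this assumed starting share, capping the number of attempts shrinks the
non-English share of the sample by roughly a fifth before any weighting is
applied.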
A practical response to nonresponse
General population surveys conducted at PRI usually involve compromises
between scientific standards, time pressures, and budget limitations. Most
often we do not use incentive payments, but we do as much follow-up calling as
possible, attempt refusal conversions on selected cases, and conduct interviews
in Spanish and Chinese (Mandarin and Cantonese) whenever possible. Recent
RDD surveys for the entire state of California have achieved response rates of
30 – 38%, and those conducted in the City of San Francisco have ranged from 11%
to 33%. Higher rates are most likely achievable through the use of incentive
payments, extended calling periods, and list-assisted sampling methods. There
are many methods known to work, but each can add incrementally to the cost
and time required to conduct a survey.
The continued development of research on nonresponse bias provides comforting
news in that RDD surveys can still provide surprisingly accurate and reliable
estimates even in an era of declining response rates. But this same research
also carries a warning that in some situations our estimates can be biased in
important ways by nonresponse. At PRI, we believe that efforts should always
be made to maximize response rates, even in surveys of modest scope. At the
same time, it is more important than ever to conduct new research to better
understand the relationship between nonresponse and the results and policy
implications of our surveys.
References
1. http://nces.ed.gov/statprog/2002/std2_2.asp
2. http://www.whitehouse.gov/omb/inforeg/statpolicy.html
3. Curtin, R., Presser, S., & Singer, E. (2005). Changes in telephone
survey nonresponse over the past quarter century. Public Opinion Quarterly,
69 (1), 87-98.
4. California Health Interview Survey. CHIS 2003 Methodology Series:
Report 4 – Response Rates. Los Angeles, CA: UCLA Center for Health Policy
Research, 2005.
5. Keeter, S., Miller, C., Kohut, A., Groves, R.M., & Presser, S. (2000).
Consequences of Reducing Nonresponse in a National Telephone Survey.
Public Opinion Quarterly, 64 (2), 125-148.
6. Holbrook, A.L., Krosnick, J.A., & Pfent, A.M. (in press). Response
rates in surveys by the news media and government contractor survey research
firms. In J. Lepkowski, B. Harris-Kojetin, P.J. Lavrakas, C. Tucker, E. de Leeuw,
M. Link, M. Brick, L. Japec, & R. Sangster (Eds.), Telephone Survey
Methodology. New York: Wiley.
7. As cited by Davis, H. (2003). Are you talking to the right people?
Quirk’s Marketing Research Review, 1134.
8. Reported by SurveyUSA as 10.9% based on 528 telephone surveys
accessed from the CMOR database,
http://www.surveyusa.com/2002vs2001ResponseRates030529.pdf.
9. Johnson, T.B., Holbrook, A.L., Young, I.C., & Bossarte, R.M. (2006).
Nonresponse error in injury-risk surveys. American Journal of Preventive
Medicine, 31 (5), 427-436.
10. Link, M.W., Mokdad, A.H., Stackhouse, F., & Flowers, N.T. (2006).
Race, ethnicity, and linguistic isolation as determinants of participation in public
health surveillance surveys. Preventing Chronic Disease [serial online].
Available from: http://www.cdc.gov/pcd/issues/2006/jan/05_0055.htm.
11. U.S. Department of Health and Human Services. Healthy People 2010.
2nd ed. With Understanding and Improving Health and Objectives for Improving
Health. 2 vols. Washington, DC: U.S. Government Printing Office, November
2000.
12. Rogers, J.D. & Godard, D. (2005). Persistent Callbacks and Linguistic
Representation: Examples from a Survey of Trust and Confidence in the
California Courts. Paper presented at the annual meeting of the Pacific
Association for Public Opinion Research, December 15 – 16, San Francisco,
CA.