October - December 2007
Editor's note: Say "census data" and most people think of the decennial census conducted by the United States Census Bureau in those years ending in zero. However, a new rolling survey is being phased in over the decade between the 2000 and 2010 censuses that will replace the long questionnaire previously used. Using data from this survey requires greater sophistication by GIS users. Esri's chief demographer Lynn Wombold explains the differences in how the data is collected and how it can be used.
The most significant feature of the next census will be the omission of the long form. The short census questionnaire provides only the complete counts of population, households, and housing units plus the characteristics of sex, age, race, ethnicity, household relationship, occupancy, and tenure. Every other variable was previously collected from the long form.
Now data about income, education, employment, language, migration, citizenship, marital status, and housing characteristics, such as value and rent, will be obtained from the American Community Survey (ACS) instead of the census sample.
The topics covered by the ACS are similar to those of the census sample. However, the method of collecting the data is very different, which introduces marked differences in the results. The Census 2000 sample represented approximately 1 in 6 households and one point in time, April 1, 2000. ACS represents approximately 1 in 40 households and continuous measurement of population and housing characteristics through monthly surveys. It was adopted to spread the cost and effort of conducting the decennial sample survey across the decade and to provide current information about changes in demographic and housing characteristics.
The conceptual difference between a single sample survey taken during the decennial census and a series of monthly surveys taken throughout the decade requires operational differences at every stage, from the definition of variables through data collection and reporting.
The ACS definition of population differs from the census definition. To provide counts of the population for reapportionment and redistricting, the census enumerates the population at the "usual" place of residence. Because the counts are used for reapportionment, each person can have only one usual place of residence. In the ACS, a person can have only one place of residence at a time. The ACS definition of current residence is applied using a two-month rule: a person who lives in a unit for two months or more is the current resident at that address. Consequently, according to the ACS definition, a person can have more than one current residence in a year. Defining current residence this way will enable the ACS to profile seasonal populations for the first time. Although the information will reflect only those travelers who stay in one place for two months or more, it describes the seasonal populations themselves, whereas the census data reported only seasonally vacant housing units.
The definition of the population is not the only difference caused by the change in time frames used by the surveys. The ACS questionnaire reflects adjustments necessary to accommodate the continuous collection of data on a monthly basis, instead of once every 10 years. Monthly interviewing requires different reference periods. For example, previous residence refers to one year ago, not five years ago. Income refers to the previous 12 months, not the previous calendar year. To report income as an annual average, dollar amounts are not reported as is but are adjusted using the Consumer Price Index (CPI). Income must be adjusted to reflect the year reported because the ACS is compiled from monthly surveys taken from independent samples but is reported as annual or annual average estimates.
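The CPI adjustment described above can be illustrated with a minimal sketch. The index values are made-up placeholders, not published CPI figures, and the function name is hypothetical; the Census Bureau's actual procedure may differ in detail. The idea is simply that each reported dollar amount is scaled by the ratio of the reporting year's index to the index for the period the response covers.

```python
def adjust_to_reporting_year(amount, cpi_reference_period, cpi_reporting_year):
    """Scale a dollar amount to reporting-year dollars using a CPI ratio."""
    if cpi_reference_period <= 0:
        raise ValueError("CPI for the reference period must be positive")
    return amount * cpi_reporting_year / cpi_reference_period


# Hypothetical index values, for illustration only.
CPI_REPORTING_YEAR = 200.0

# (reported income, hypothetical CPI for the 12 months the response covers)
responses = [
    (40_000, 190.0),  # response covering an earlier, lower-priced period
    (40_000, 200.0),  # response already at reporting-year price levels
]

adjusted = [
    adjust_to_reporting_year(amount, cpi, CPI_REPORTING_YEAR)
    for amount, cpi in responses
]

for (amount, cpi), value in zip(responses, adjusted):
    print(f"reported {amount:,} at CPI {cpi} -> {value:,.0f} in reporting-year dollars")
```

Two households reporting the same nominal income in different months therefore contribute different adjusted amounts, which is why ACS income figures cannot be read as raw responses.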
Sample size is a major difference between the census sample and the ACS. Understanding the gap between 1 in 6 (or 17 percent) and 1 in 40 (2.5 percent) does not require advanced knowledge of statistics. The ACS sample size is smaller. Smaller sample sizes impact small area estimates. Annual data is reported only for counties and places with a population of 65,000 or more. Data for areas with smaller populations, including tracts and block groups, must be reported as a three- or five-year average, depending on the population size. The smaller sample sizes in the ACS also increase the variability of estimates based on these samples. Even with a five-year accumulation of data, ACS-based estimates have a larger margin of error than the census sample in 2000.
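A rough calculation shows why the margin of error stays larger even after five years of accumulation. The sketch below assumes simple random sampling, treats the 1-in-40 rate as an annual rate whose samples are independent and can be pooled, and ignores weighting, nonresponse, and finite population corrections. Under those assumptions the standard error is proportional to one over the square root of the sample size, so only the ratio of sampling fractions matters. The function name is hypothetical.

```python
import math


def relative_standard_error(sampling_fraction, periods=1):
    """Return a quantity proportional to the standard error under simple
    random sampling: 1 / sqrt(pooled sampling fraction)."""
    return 1.0 / math.sqrt(sampling_fraction * periods)


census_2000 = relative_standard_error(1 / 6)
acs_one_year = relative_standard_error(1 / 40)
acs_five_year = relative_standard_error(1 / 40, periods=5)  # assumed pooling

print(round(acs_one_year / census_2000, 2))   # 2.58
print(round(acs_five_year / census_2000, 2))  # 1.15
```

Under these simplifying assumptions, a single year of ACS data has roughly two and a half times the standard error of the Census 2000 sample, and a five-year accumulation still has about 15 percent more. Actual ACS margins of error depend on the full sample design and weighting, so these ratios are indicative only.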
Data collection is also different with the ACS. Approximately 80 percent of the census questionnaires in 2000 were delivered and returned by mail. In rural areas, enumerators either dropped off the questionnaires or collected the information on each unit. Nonresponse follow-up included all housing units but permitted proxy information to be collected from neighbors. The ACS is designed to be a mail survey. Incomplete addresses or post office boxes are transferred for computer-assisted personal interviewing (CAPI), where they are sampled at a 2-in-3 rate for follow-up. Nonresponses or incomplete responses are eligible for follow-up by computer-assisted telephone interviews or personal interviews. Collection and follow-up operations are conducted by trained professional staff, not by temporary workers.
The final step in the operation converts the survey data collected to estimates of population and housing by weighting the sample data. Census sample data was weighted to reflect the probability of sample selection and then adjusted by ratio estimation to be consistent with complete count tabulations of the population and occupied housing units by select characteristics. The product of the ratio estimation is a reduction in sampling error and possible bias that could have resulted by simply applying the inverse of the sampling rate. Another advantage of this procedure is that sample estimates are consistent with complete census counts.
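The two weighting steps described above, an inverse-probability base weight followed by ratio estimation to complete counts, can be sketched as follows. The group names, sample counts, and control totals are invented for illustration, and the function is a hypothetical simplification: the actual census procedure adjusted across several characteristics within weighting areas.

```python
def ratio_adjusted_weights(sampling_rate, sample_counts, control_totals):
    """Compute per-group weights: base weight (inverse of the sampling rate)
    times a ratio factor that forces weighted totals to match control totals."""
    base_weight = 1.0 / sampling_rate
    weights = {}
    for group, n_sampled in sample_counts.items():
        # Estimate of the group total using the base weight alone.
        weighted_total = base_weight * n_sampled
        # Ratio factor: complete count divided by the base-weight estimate.
        ratio_factor = control_totals[group] / weighted_total
        weights[group] = base_weight * ratio_factor
    return weights


# Hypothetical data: a 1-in-6 sample of occupied housing units by tenure.
sample_counts = {"owner": 150, "renter": 90}
control_totals = {"owner": 960, "renter": 510}  # complete-count tabulations

weights = ratio_adjusted_weights(1 / 6, sample_counts, control_totals)

for group, weight in weights.items():
    estimate = weight * sample_counts[group]
    print(f"{group}: weight {weight:.3f}, weighted total {estimate:.0f}")
```

In this example the base weight alone would estimate 900 owners and 540 renters. The ratio factors pull those estimates to the complete counts of 960 and 510, which is the consistency between sample estimates and census counts that the text describes.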
ACS data must be weighted to postcensal population and housing unit estimates. These estimates are themselves subject to estimation error, which increases the sampling error, and they are developed only for counties, places, and higher levels of geography. The symbiotic relationship between complete counts and sample census data is gone. ACS weights are unrelated to census counts.
At this point, it may seem as though a degree in statistics is needed to work with ACS data. The change from a census sample to the ACS is a challenge for all data users. Sample data can no longer be taken at face value. An understanding of the change in variable definitions, methods of collecting and reporting the data, sample weights, and standard errors will be necessary in order to use the data. The accompanying table summarizes the main differences between the Census 2000 sample and ACS data (as of 2005, the latest year). Questions from readers are welcome and can be sent to Lynn Wombold at firstname.lastname@example.org.