The Magazine for
Esri Software Users
Nationwide Household Survey
Counts on GIS
By William D. Wheaton
Research Triangle Institute
"Youth Drug Use Decreases; Overall Rates Remain Level"
--Press release by the U.S. Department of Health and Human Services
August 18, 1999
When the nightly news reports on trends in drug use in the United States, where does that information come from and how is it collected? The Research Triangle Institute (RTI), headquartered in North Carolina, conducts research in health and pharmaceuticals, environmental protection, advanced technology, survey research, and public policy. RTI administers the National Household Survey on Drug Abuse (NHSDA), which collects national data on both illicit and licit drug use. GIS technology plays a vital role in managing this enormous project. The survey, sponsored by the Substance Abuse and Mental Health Services Administration (SAMHSA) within the U.S. Department of Health and Human Services (HHS), is the primary source of statistical information on the use of legal and illegal drugs by the U.S. population.
Since 1988, RTI has conducted the yearly NHSDA. Based on a representative sample of the U.S. population age 12 and older, the survey includes residents of households, noninstitutional group quarters (e.g., shelters, rooming houses, dormitories), and civilians living on military bases. The 1998 NHSDA, conducted from January through December 1998, employed a multistage area probability sample that required interviewing 25,500 persons.
In 1999, the number of interviews was increased to approximately 70,000 and the survey's geographic scope and complexity were expanded. More than 1,100 individual field interviewers, located throughout the United States, conducted these interviews in 7,200 discrete geographic areas. Within each area individuals were chosen for survey interviews.
"Because geography is an inherently important component of the survey collection effort, GIS is an integral part of RTI's implementation and management of the survey," explains Ross Curry, a GIS analyst for RTI. "If field interviewers collect data in the wrong place or from the wrong household, results of the survey may not meet the accuracy standards demanded by the survey design." Field interviewers need detailed street-level maps to ensure that the correct areas are canvassed for interviews. RTI's GIS program developed the Survey Automated Mapping System (SAMS), a custom ArcInfo application that generates the detailed maps used in conducting the survey.
Sample Design and Data Collection Process
Selecting representative samples of the population for field interviews is a complex task. A three-step process determines the specific individuals who will be interviewed. This process includes choosing the areas to sample, choosing specific households from within those sample areas, and selecting individuals from those households to interview.
The survey design mandates that interviews be carried out in every state. To ensure that enough diverse household interviews are conducted in each state, sets of census blocks that include households representative of the state's population as a whole are assembled. These areas, called segments, must have a minimum of 150 dwelling units.
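The 150-dwelling-unit minimum can be illustrated with a short sketch. The article does not describe how RTI actually assembles blocks into segments, so the greedy grouping below is only an assumption, as are the function name `build_segments` and the made-up block numbers and counts. It also assumes the blocks arrive in an order in which neighbors are geographically adjacent.

```python
from typing import List, Tuple

MIN_DWELLING_UNITS = 150  # minimum per segment, as stated in the article


def build_segments(blocks: List[Tuple[str, int]],
                   minimum: int = MIN_DWELLING_UNITS) -> List[List[str]]:
    """Group (block_id, dwelling_units) pairs, in order, into segments.

    Hypothetical greedy rule: keep adding blocks to the current segment
    until it reaches the minimum number of dwelling units.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    total = 0
    for block_id, units in blocks:
        current.append(block_id)
        total += units
        if total >= minimum:
            segments.append(current)
            current, total = [], 0
    # Leftover blocks that fall short are merged into the last segment.
    if current:
        if segments:
            segments[-1].extend(current)
        else:
            # Fewer than `minimum` units overall; the caller must handle this.
            segments.append(current)
    return segments


# Illustrative, made-up block numbers and dwelling unit counts.
example_blocks = [("101", 60), ("102", 100), ("103", 200), ("104", 40)]
example_segments = build_segments(example_blocks)
```

With these made-up counts, blocks 101 and 102 together reach the minimum and form one segment, block 103 qualifies on its own, and the undersized block 104 is folded into the segment before it.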
Once the segments have been determined, SAMS reads input files consisting of the census block numbers of each segment and generates detailed maps illustrating the location of each segment. A segment can vary from one to several hundred individual census blocks. A set of maps showing the location of the segment within county and census tract boundaries is generated. Individual streets bounding and within the segment are shown on additional maps. A segment may require as few as five maps or as many as 50. For the 1999 survey year, over 75,000 unique maps were generated by SAMS in a 10-week period.
These maps, gathered into segment kits, are distributed to the field staff. Specially trained field staff, called counters and listers or just listers, drive to the location of each segment and count and list the actual dwelling units within that segment. The listers mark the segment block listing map with the location of each dwelling unit, note the address, and write a brief description of each dwelling unit on a coding sheet. Field listers drive along the segment streets in a predetermined direction so that when the actual field interviews take place, dwelling units are encountered in the same order. Coding forms containing the lists of dwelling units are returned to RTI for entry into the database used in the next step: selecting the household sample.
The counting and listing data determines which individual dwelling units (households) will be contacted for interviews. This list of households to be contacted within each segment is the household sample. The households selected for the sample must be distributed throughout each segment. Once the household sample is drawn, field interviewers return to the segments and begin knocking on doors and compiling a roster of the residents in each household. From these rosters, one, two, or no residents, aged 12 or older, are selected at random for interviewing.
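The last two sampling steps can be sketched in a few lines. The article does not say how RTI draws the household sample or how it decides whether zero, one, or two residents are chosen, so both functions below are assumptions. `systematic_sample` uses a fixed-interval draw with a random start as one common way to spread a sample across a listing, and `select_respondents` takes the number of respondents as a parameter. Residents are identified only by roster position, since respondent names are not collected.

```python
import random
from typing import List, Optional


def systematic_sample(units: List[str], n: int,
                      rng: Optional[random.Random] = None) -> List[str]:
    """Pick n dwelling units at a fixed interval from the listing order.

    Because listers record units in travel order, a fixed interval with a
    random start spreads the sample across the whole segment.
    """
    rng = rng or random.Random()
    if n <= 0 or not units:
        return []
    n = min(n, len(units))
    step = len(units) // n          # integer interval, at least 1
    start = rng.randrange(step)     # random start within the first interval
    return [units[start + i * step] for i in range(n)]


def select_respondents(ages: List[int], how_many: int,
                       rng: Optional[random.Random] = None) -> List[int]:
    """Return roster positions of randomly chosen residents aged 12 or older.

    `how_many` (0, 1, or 2) is supplied by the caller; the rule that sets
    it is not described in the article.
    """
    rng = rng or random.Random()
    eligible = [i for i, age in enumerate(ages) if age >= 12]
    return rng.sample(eligible, min(how_many, len(eligible)))


# Illustrative listing of 20 dwelling units and one household roster.
listing = [f"DU{i:03d}" for i in range(20)]
household_sample = systematic_sample(listing, 5, random.Random(0))
chosen = select_respondents([45, 10, 16, 8], 2, random.Random(0))
```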
Interviewers take about an hour to administer the survey questionnaire using a laptop computer. During some portions of the interview, the field worker reads questions and records the respondent's answers directly into the computer. For more sensitive questions, respondents listen to the questions through a headset or read the questions and key their answers in the laptop. Interviewers make every effort to ensure that no one other than the respondent can see or hear the questions. The names of respondents are not collected or associated with the interview data. Identification information, such as the household address, is stored separately from the data so that the respondent's answers remain completely confidential. Interviewers transmit the completed interview data daily directly to RTI for subsequent processing.
Using GIS to Conduct the Survey
The SAMS application, based on ArcInfo 7.2.1, runs on a Sun SPARCengine Ultra1/170. The 1995 TIGER/Line files are the source for geographic data. SAMS creates maps with very little user input. The program reads an input file containing the number of dwelling units within the segment according to the 1990 Census as well as state, county, census tract, and block numbers for each segment to be mapped. According to Meg Wilkinson, GIS analyst, "Each map in the segment kits must conform to a standard format and contain consistent information. Segment identifiers, lists of block numbers, legend information, scale, and other details must be accurate on every map; otherwise the listers can't do their jobs in an efficient manner." Two kinds of maps are produced: Locator Maps and Block Listing Maps.
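To make the input step concrete, here is a minimal sketch of reading such a file. It is written in Python for illustration only; SAMS itself is an ArcInfo application, and the article does not give the file layout. The comma-separated format, the column names, the one-row-per-block arrangement, and the sample values are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: one row per census block, with made-up values.
SAMPLE_INPUT = """segment_id,state,county,tract,block,dwelling_units
NC001,37,063,001502,101,90
NC001,37,063,001502,102,75
NC002,37,183,052301,204,160
"""


def read_segments(text: str) -> dict:
    """Collect block identifiers and dwelling unit totals for each segment."""
    segments = defaultdict(lambda: {"dwelling_units": 0, "blocks": []})
    for row in csv.DictReader(io.StringIO(text)):
        segment = segments[row["segment_id"]]
        segment["dwelling_units"] += int(row["dwelling_units"])
        # Concatenate state, county, tract, and block codes into one key
        # that could be matched against block records in the map data.
        block_key = row["state"] + row["county"] + row["tract"] + row["block"]
        segment["blocks"].append(block_key)
    return dict(segments)


segments = read_segments(SAMPLE_INPUT)
```

Each segment ends up with the list of blocks to draw and a dwelling unit total that could be printed on the map alongside the segment identifier.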
The Locator Maps are actually a set of three related maps that help field interviewers find each segment. They show the location of each census tract containing a segment in the county, the location of each segment within each census tract, and the location of individual blocks within each segment.
Field staff mark the location of each dwelling unit on the Block Listing Maps during the counting and listing process. Individual census blocks must be shown at a scale that allows enough white space to write the location of each dwelling unit. Sometimes only a single Block Listing Map is needed. At other times as many as 50 Block Listing Maps are required. When the street network of a segment is too dense to allow sufficient space for writing, a series of interactive zoom maps is created to break the segment into a number of individual maps. Each map may contain one or more census blocks. An Avenue-based application called Setting and Zooming (SAZ) generates these zoom maps. Each zoom map conforms to very stringent content rules.
Often field interviewers discover that the locations and even the existence of some streets in a segment differ from what is shown on the Block Listing Maps. These differences are most often attributed to the fact that the TIGER '95 data is now four years old and to some fundamental inaccuracies in the TIGER '95 data. Interviewers draw new streets or other features directly onto the Block Listing Maps to show the location of all dwelling units in the segment. "Without GIS and the SAMS and SAZ software and the maps produced by them, it would have been impossible for us to complete the survey data collection under the time frames mandated by SAMHSA," says Tom Virag, operations manager for NHSDA.
Internet Map Server
GIS is also used by RTI's survey project managers and field supervisors. With 7,200 individual area segments and more than 1,100 field staff distributed all over the country, access to timely, accurate mapping information has become a critical component in managing the logistics of this enormous project. Esri's Internet Map Server (IMS) software is used to display the locations of field interviewers, field supervisors, and regional supervisors. Personnel turnover presents a special problem that can really only be managed with databases and maps. Each week, a new database of the currently active field personnel is generated. IMS technology is used to display the current number and location of staff members.
The IMS application also extracts status information from a database and creates maps showing various statistics that are critical to management. For example, one type of map displays management regions across the country indicating which are behind schedule, on schedule, or ahead of schedule. These status maps let managers quickly see problems and shift resources to remedy the situations.
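The schedule classification behind such a status map might look like the sketch below. The article does not say how RTI measures progress, so the comparison of completed to expected interviews, the 5 percent tolerance, and the function name `schedule_status` are all assumptions made for illustration.

```python
def schedule_status(completed: int, expected: int,
                    tolerance: float = 0.05) -> str:
    """Classify a management region by interviews completed to date.

    `expected` is the number of interviews planned by this date; the
    tolerance band around it counts as "on schedule" (hypothetical rule).
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    ratio = completed / expected
    if ratio < 1 - tolerance:
        return "behind schedule"
    if ratio > 1 + tolerance:
        return "ahead of schedule"
    return "on schedule"


# Made-up progress figures for three regions.
region_progress = {"Region A": (80, 100), "Region B": (100, 100),
                   "Region C": (120, 100)}
region_status = {name: schedule_status(done, planned)
                 for name, (done, planned) in region_progress.items()}
```

Each resulting label could then drive the color used to shade the corresponding region on the map.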
The counting and listing process is one of the most time-consuming and expensive parts of the survey. Visually locating and noting each and every dwelling unit on a map drives up labor costs. Research will begin on methods of seeding area segments with the locations of known dwelling units based on available mailing address information. By including the locations and addresses of known dwelling units in the Block Listing Maps and coding sheets, field interviewers need only supply information on omitted dwelling units and the number of households in apartment buildings and group quarters. Another option that is being explored is to allow field interviewers to generate their own Locator Maps and Block Listing Maps in the field. This would reduce the amount of overhead necessary to generate and distribute paper copies of the maps.
Collecting data for a nationwide survey such as the NHSDA is an enormous undertaking. Data must be accurate, dependable, and stand up to rigorous statistical analysis. The use of GIS helps ensure that data is collected in exactly the right locations throughout the country. Conducting a survey of this size is also an enormous logistical challenge. Access to timely and accurate map-based information through simple Web browser interfaces provides critical information to survey management. With this information, operational problems can be identified quickly and appropriate actions can be taken to resolve them.
For further information, contact
GIS Program Manager
Research Triangle Institute
Photos courtesy of Research Triangle Institute staff photographers.