ArcNews Online

Fall 2006

Online Only Article

Using GIS Technology to Track Virus Prevalence

Mapping Papilloma Virus Disease Data Contributes to Cancer Risk Assessments

Map of California human papilloma virus (HPV) cases by outpatient diagnosis in each county.

Genital human papilloma virus (HPV) infection is estimated to be one of the most common sexually transmitted diseases in the United States. HPV infection is associated with cancer of the cervix, penis, anus, and vulva. There are two common presentations of genital HPV infection: exophytic (or raised) warts (condyloma acuminatum) and subclinical (or flat) warts. Exophytic warts cause visible growths on the genitalia of both men and women. They are usually caused by HPV types 6 and 11. Though unsightly and problematic to treat, they carry a low risk of malignancy. Subclinical warts, on the other hand, tend to be caused by HPV types such as 16 and 18, which are linked to genital cancers, including cervical cancer.

HPV infection is responsible for 99.7 percent of all cases of cervical cancer. Cervical cancer is the first malignancy that owes nearly 100 percent of its attributable risk to a commonly acquired infection. According to the American Cancer Society, cervical cancer strikes approximately 10,000 women annually and kills about 3,700.

Because HPV infection precedes the development of cervical cancer and precancers, determining the regional prevalence of HPV infection is important for assessing cervical cancer risk in the female population.

California was chosen for HPV prevalence assessment due to the state's large population and availability of state hospital data to validate results. Esri Business Analyst Online; Medical Management Associates; and Planning 2.0, a health care consulting services company, provided the data.

To understand the full impact of HPV in California, Planning 2.0 analysts studied the western regional rates of the National Hospital Discharge Survey; the National Ambulatory Medical Care Survey; the Centers for Disease Control and Prevention's Behavioral Risk Factor Surveillance System; and the age, sex, and race population cohorts for California. From these data, they overlaid expected caseloads on the following geographies: state, counties, ZIP Codes, Senate districts, House districts, and block groups. The analysis considered both the expected caseloads of HPV and the percentage of cases relative to the female population for all California counties.
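The expected-caseload step described above can be sketched as applying survey rates to population cohorts and summing the results. This is a minimal sketch, not the analysts' actual method: the article does not publish its rates or formula, so the `Cohort` type, the `expected_caseload` function, the rate unit (cases per 100,000), and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """One age/sex/race population cell for a geography (hypothetical layout)."""
    age_group: str
    sex: str
    race: str
    population: int


def expected_caseload(cohorts, rates_per_100k):
    """Sum rate x population over all cohorts.

    A cohort with no matching rate contributes zero cases.
    """
    total = 0.0
    for c in cohorts:
        key = (c.age_group, c.sex, c.race)
        total += rates_per_100k.get(key, 0.0) * c.population / 100_000
    return total


# Illustrative values only; these are not figures from the study.
county_cohorts = [
    Cohort("15-24", "F", "all", 50_000),
    Cohort("25-44", "F", "all", 120_000),
]
rates = {
    ("15-24", "F", "all"): 200.0,  # assumed cases per 100,000
    ("25-44", "F", "all"): 100.0,
}

print(expected_caseload(county_cohorts, rates))
```

The same function could be run once per geography (county, ZIP Code, district, or block group) by passing that geography's cohorts.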

Findings indicated that San Francisco County had the highest estimated rates per female population for HPV and cervical cancer, while Yolo County had the highest rate of genital warts. For total caseloads, the counties with the largest populations also had the highest estimated caseloads, with Los Angeles County leading in all three categories with an estimated 5,227 cases of HPV, 102,143 cases of genital warts, and 6,627 cases of cervical cancer. However, incidence rates vary throughout the state when cases are compared as a percentage of the female population rather than as total caseloads.

Statewide expected caseloads for 2005 were as follows:

2005 California Expected Caseloads
Disease Classification (ICD-9 Codes)                        Outpatient  Inpatient    Total
HPV (79.4)                                                      17,400        495   17,895
Genital Warts (78.10, 78.11, 78.19)                            363,369     17,582  364,801
Cervical Cancer (180.0, 180.1, 180.8, 180.9, 233.1, 236.3)      17,582      5,540   23,122

Another way to view expected HPV in California is to compare expected cases with the female population to determine which counties have the highest expected proportion of HPV cases. On this map, the counties with the highest percentages of cases relative to female population are spread across the state, which may indicate the need for better preventive education. While the overall yearly rate of diagnosis may be low, the cumulative effects over decades may be significant. Further research is needed to compare cervical cancer rates with HPV rates in the areas with the highest rates of infection.
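The comparison of cases to female population can be sketched as a simple percentage followed by a ranking. This is an illustrative sketch only: the function name `rank_by_rate` and the county names and figures are hypothetical, chosen to show how a small county can outrank a large one on rate while having far fewer total cases.

```python
def rank_by_rate(expected_cases, female_population):
    """Return (county, percent of female population) pairs, highest first.

    Counties with a missing or zero female population are skipped.
    """
    rates = {
        county: 100.0 * cases / female_population[county]
        for county, cases in expected_cases.items()
        if female_population.get(county, 0) > 0
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs, not study data.
expected_cases = {"County A": 500, "County B": 60}
female_population = {"County A": 1_000_000, "County B": 50_000}

ranking = rank_by_rate(expected_cases, female_population)
for county, percent in ranking:
    print(f"{county}: {percent:.2f}% of female population")
```

In this made-up example County A has the larger caseload, but County B ranks first once cases are divided by female population.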

As with any projection, the data have limitations, especially when projected onto the very small population of a block group. Caution should be used when analyzing disease rate data in this manner. This study's core findings are based on disease, procedure, and "reason for visit" rates by three-, four-, and five-digit ICD-9 codes. More than 150 different codes were analyzed for appropriateness, data richness, availability, consistency, and authentication.
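One way to see why small-area projections call for caution is to look at how relative uncertainty grows as the expected count shrinks. The sketch below assumes case counts behave like a Poisson variable, for which the relative standard error is 1 / sqrt(expected count). That assumption, the function names, and the 30 percent cutoff are illustrative choices, not part of the study.

```python
import math


def relative_std_error(expected_cases):
    """Relative standard error of a Poisson count with the given mean."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return 1.0 / math.sqrt(expected_cases)


def is_unstable(expected_cases, max_rse=0.30):
    """Flag an estimate whose relative standard error exceeds max_rse.

    The 0.30 default is an assumed, illustrative cutoff.
    """
    return relative_std_error(expected_cases) > max_rse


# 17,895 is the statewide HPV total from the table; 2.0 is a hypothetical
# expected count for a single block group.
for label, cases in [("statewide HPV", 17_895), ("one block group", 2.0)]:
    rse = relative_std_error(cases)
    print(f"{label}: RSE = {rse:.1%}, unstable = {is_unstable(cases)}")
```

Under these assumptions the statewide figure is stable, while an expected count of only a few cases in a block group is dominated by random variation.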

More Information

For more information, contact Tod Fetherling, Planning 2.0 (tel.: 615-309-8376).
