Decoding Addresses From Point Layers
Final analyses sought to determine the accuracy of decoded addresses given known offset and squeeze measurements. The 277,817 potential addresses were geocoded, and then each point was assigned an address based on the closest geocoded point. Only coincident points and those too close to be differentiated by the map precision should produce decoding errors. Scripts performed this test for parameter combinations of offset at increments of 10 meters up to 40 meters and squeeze at increments of 10 percent up to 100 percent. As anticipated, tests with either parameter value at zero reduced accuracy. All other tests correctly identified more than 96 percent of the addresses.
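The nearest-point assignment described above can be illustrated with a minimal sketch. It assumes planar coordinates in meters and uses a brute-force search. The function name `decode_addresses` and the sample addresses are hypothetical and are not taken from the scripts used in the study.

```python
import math

def decode_addresses(points, candidates):
    """Assign each point the address of its nearest geocoded candidate.

    points: list of (x, y) tuples in projected meters (assumed).
    candidates: list of (address, x, y) tuples for the geocoded
        potential addresses.
    Returns a list of (address, distance) pairs, one per input point.
    """
    results = []
    for px, py in points:
        # Brute-force scan: pick the candidate with the smallest distance.
        best = min(candidates, key=lambda c: math.hypot(c[1] - px, c[2] - py))
        results.append((best[0], math.hypot(best[1] - px, best[2] - py)))
    return results

# Hypothetical candidates on both sides of one street segment.
candidates = [
    ("101 Example St", 0.0, 10.0),
    ("103 Example St", 20.0, 10.0),
    ("102 Example St", 0.0, -10.0),
]
points = [(19.5, 10.2), (0.3, -9.8)]
decoded = decode_addresses(points, candidates)
```

A scan of every candidate for every point is adequate for a small test set. With several hundred thousand potential addresses, a spatial index would likely be needed to keep the search practical.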
The future holds great changes for address matching procedures. Geographic databases will continue developing and will more accurately depict reality. Both positional accuracy and currency will improve. Perhaps large distributed datasets will approach real-time representation. Interpolation using a street segment reference layer crudely places points according to theoretical rather than actual distribution along blocks. Advanced address matching techniques will incorporate point or polygon reference layers for buildings and property. Some organizations may integrate three-dimensional building diagrams to locate apartments, condos, and offices. Current trends suggest the evolution of fewer, more complete reference datasets. Assuming geocoding purposes remain relatively unchanged, the procedures will still be repeatable.
Efforts to prevent map hacking will explore methods of perturbing points. These strategies must balance two competing goals: communicating a distribution and concealing exact addresses. Adding a small random locational adjustment might prevent discovery of exact addresses while keeping each point associated with the correct street segment. Depending on the particular data, existing database techniques could filter out the incorrect addresses along a segment. Larger coordinate modifications would prevent discovery of street segments at the expense of either distribution patterns or correlation between distributions and other GIS data.
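One possible form of a small random locational adjustment is sketched below. It moves each point to a uniformly random position within a disc of radius `max_radius` around its original location. The function name `perturb_points`, the uniform-disc choice, and the 5-meter radius are assumptions made for illustration; the article does not specify a perturbation method.

```python
import math
import random

def perturb_points(points, max_radius, seed=None):
    """Return a copy of points, each displaced by at most max_radius.

    points: list of (x, y) tuples in projected meters (assumed).
    max_radius: maximum displacement in the same units.
    seed: optional seed so a run can be reproduced.
    """
    rng = random.Random(seed)
    moved = []
    for x, y in points:
        # The square root makes displacements uniform over the disc's area
        # rather than clustered near the original location.
        r = max_radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        moved.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return moved

original = [(100.0, 200.0), (150.0, 260.0), (90.0, 310.0)]
jittered = perturb_points(original, max_radius=5.0, seed=1)
```

A radius small relative to the spacing between streets keeps points near their segments. A larger radius hides the segment but distorts the pattern, which is the trade-off described above.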
For sets of 10 addresses within a suburban ZIP Code, offset was easily calculated and iterative tests perfectly discovered integer squeeze values. Using a known reference layer and the two deciphered parameter values, this reverse address matching strategy produced a highly accurate list of addresses.
This article serves as a warning to treat encoded coordinates with the same level of confidentiality as the original list of addresses. For those who wish to communicate distributions without sacrificing privacy, dot density and choropleth maps may suffice. These methods maintain the transparency and intuitiveness of known generalization levels. Generalizing to larger regions or withholding values for areas with small numbers enhances anonymity.
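Withholding values for areas with small numbers can be sketched as a simple filter applied before mapping. The function name `suppress_small_counts`, the region labels, and the threshold of 5 are hypothetical; an appropriate threshold depends on the data and on the applicable disclosure policy.

```python
def suppress_small_counts(counts, threshold=5):
    """Replace counts below threshold with None so they can be withheld.

    counts: dict mapping a region identifier to the number of records.
    threshold: smallest count that may be published (assumed value).
    """
    return {
        region: (n if n >= threshold else None)
        for region, n in counts.items()
    }

# Hypothetical counts per region for a choropleth map.
counts = {"Region A": 12, "Region B": 3, "Region C": 5}
published = suppress_small_counts(counts)
```

Regions mapped to `None` would be drawn with a "withheld" symbol rather than a value.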
For additional information, contact
Valerie Raybold Yakich
Department of Geography
University at Buffalo
105 Wilkeson Quad
Buffalo, New York 14261
About the Author
Valerie Raybold Yakich holds a bachelor's degree in computer science and environmental studies from Baylor University. After managing the GIS Department for the City of Waco, Texas, she served as the primary GIS consultant for Ambit Technology to the State of South Dakota. Her multidisciplinary doctoral studies in GIS at the University at Buffalo are funded by an IGERT fellowship.
This project is supported by an IGERT award (DGE-9870668) from the National Science Foundation (NSF) and the University at Buffalo. Support from the NSF is gratefully acknowledged.