Powerful Spatial Statistics Tools in ArcGIS 9
GIS users in many application areas (epidemiology, crime analysis, wildlife biology) require a higher level of information than can be obtained through map inspection or traditional spatial analysis.
To learn more about any spatial statistics tool, view its Python script in the PythonWin Interactive Window.
In contrast to spatial analysis that involves familiar operations, such as spatial querying, buffering, and layering, spatial data analysis applies statistical techniques for measuring spatial autocorrelation, analyzing spatial patterns (i.e., clustering or dispersion), and assessing feature spatial distributions. Although less widely used, spatial data analysis (also known as spatial statistics) has just become much more accessible to GIS users.
The Spatial Statistics toolbox in ArcGIS 9 contains an arsenal of tools for analyzing the distribution of geographic features. Because spatial statistics are special (they are not like other types of statistical analyses), some traditional statistical tools are not suitable for spatial data analysis. Spatial statistics differ from traditional statistics in that space and spatial relationships are an integral and implicit component of analysis.
Consequently, for the purposes of spatial data analysis, it was preferable to develop spatially appropriate statistical tools in ArcGIS rather than integrate a traditional nonspatial statistics package with ArcGIS. These tools give GIS users access to advanced methods for solving spatial data analysis problems. Figure 1 lists and briefly describes each tool set.
Figure 1: An overview of the Spatial Statistics tool sets
|Tool set|Description|
|Analyzing Patterns|Evaluate whether features or attribute values form a clustered, uniform, or random pattern across a region.|
|Mapping Clusters|Identify statistically significant hot spots, cold spots, or spatial outliers.|
|Measuring Geographic Distributions|Determine the location of the center of the data, the shape and orientation of the data, and the degree to which features are dispersed.|
|Utilities|Reformat data or render analysis results.|
Added Core Functionality
Many of these tools were originally provided as developer samples in ArcGIS 8, but with version 9 they are included as part of core functionality. Like all the geoprocessing tools in ArcGIS 9, these tools can be run from dialog boxes, from the command line, or in models. Incorporating spatial statistics tools within the geoprocessing framework also makes it easier to extend these tools, create custom tools, and incorporate third-party products. The framework uses a wizard that walks the user through the process of creating a custom tool. The appearance of dialog boxes used by a custom tool can be completely controlled using XML stylesheets.
Each tool set in the Spatial Statistics toolbox groups tools by function. The following sections describe the tools and types of analysis provided by each tool set.
Analyzing Patterns
Measuring patterns using statistics provides a more rigorous way of determining whether a pattern exists and of comparing patterns for different distributions. Correctly applied, these tools supply a higher level of confidence in data that will be used to make important decisions or that may require further research into the relationships between features that may be causing the pattern. The Analyzing Patterns tool set includes the Average Nearest Neighbor Distance, High/Low Concentration, and Spatial Autocorrelation tools.
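To make the idea of a summary pattern statistic concrete, the following is a minimal sketch of a nearest neighbor ratio in plain Python. It is not the script that ships with ArcGIS; the function name, the brute-force distance search, and the requirement that the caller supply the study area size are all assumptions made for illustration. The expected distance uses the standard formula for a random pattern of the same density, 0.5 / sqrt(n / area).

```python
import math


def average_nearest_neighbor_ratio(points, area):
    """Return (observed mean distance, expected mean distance, ratio).

    points: list of (x, y) tuples in projected units.
    area:   study-area size in the same units squared. This is supplied by
            the caller (an assumption of this sketch); the ratio is
            sensitive to the value chosen.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Observed: mean distance from each point to its closest other point.
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        total += nearest
    observed = total / n

    # Expected mean distance for a random pattern with the same density.
    expected = 0.5 / math.sqrt(n / area)
    return observed, expected, observed / expected


# Four points packed into one corner of a 10 x 10 study area.
observed, expected, ratio = average_nearest_neighbor_ratio(
    [(0, 0), (0, 1), (1, 0), (1, 1)], area=100.0
)
```

A ratio below 1 suggests clustering and a ratio above 1 suggests dispersion. For the four points above the ratio is 0.4, because the observed mean distance (1.0) is well under the 2.5 expected for a random pattern. A full tool would also report a significance test, which this sketch omits.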
Mapping Clusters
For applications such as crime analysis or epidemiology, cluster mapping identifies areas where action must be taken. Depending on the phenomenon being studied, clusters could indicate a crime spree or a disease outbreak. The two tools in this tool set, the Hot Spot Analysis tool and the Cluster and Outlier Analysis tool, differ from the tools in the Analyzing Patterns tool set because they allow the user to see the location and extent of clusters rather than producing a single summary statistic.
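The difference between a summary statistic and a per-feature result can be sketched with a local statistic. The example below computes a Getis-Ord Gi* z-score for every feature using binary fixed-distance weights. The article does not describe the formula, so the choice of Gi*, the weighting scheme, the function name, and the handling of undefined cases are assumptions for illustration rather than a description of the shipped tool.

```python
import math


def gi_star(points, values, band):
    """Return one Gi* z-score per feature.

    points: list of (x, y) tuples.
    values: one attribute value per point.
    band:   features within this distance count as neighbors
            (each feature is its own neighbor, as Gi* requires).
    """
    n = len(points)
    mean = sum(values) / n
    # Guard against tiny negative values caused by floating-point rounding.
    variance = max(sum(v * v for v in values) / n - mean * mean, 0.0)
    s = math.sqrt(variance)

    scores = []
    for xi, yi in points:
        # Binary weights: 1 inside the distance band, 0 outside.
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # The statistic is undefined when all values are equal or every
        # feature is a neighbor; returning 0.0 is a simplification here.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Hypothetical data: three high-value features near the origin and
# three low-value features far away.
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
vals = [10, 10, 10, 1, 1, 1]
z = gi_star(pts, vals, band=2.0)
```

In this toy data set the first three features receive positive z-scores (a hot spot) and the last three receive negative z-scores of the same size (a cold spot). Because each feature gets its own score, the results can be mapped, which is what distinguishes this family of tools from a single global measure.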
Measuring Geographic Distributions
These five tools let the user calculate a value that represents a characteristic of the distribution such as its center, the shape and orientation of the data, or the dispersion of features. This information can be very useful for applications such as siting facilities in a central location or studying the spread of a disease or contaminant. This tool set contains the Central Feature, Directional Distribution, Linear Directional Mean, Mean Center, and Standard Distance tools.
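Two of these measures are simple enough to sketch in a few lines. The example below computes an unweighted mean center and standard distance for a set of points. It is a minimal illustration and not the ArcGIS script itself; the function names are hypothetical, and weighting and case fields are left out.

```python
import math


def mean_center(points):
    """Return the average x and average y of a list of (x, y) tuples."""
    n = len(points)
    return (sum(x for x, _ in points) / n,
            sum(y for _, y in points) / n)


def standard_distance(points):
    """Return the root mean squared distance of points from their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(
        sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    )


# Four corners of a 2 x 2 square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
center = mean_center(square)          # (1.0, 1.0)
spread = standard_distance(square)    # square root of 2
```

The mean center gives a candidate central location, and the standard distance is a single number describing how tightly the features are grouped around it. Comparing standard distances for two time periods is one way to gauge whether a phenomenon is spreading.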
Utilities
Designed to complement tools in the other three tool sets, these five tools perform data conversion and rendering tasks. This tool set contains the Calculate Areas, Collect Events, Count Rendering, Export Feature Attributes to ASCII, and Z Score Rendering tools.
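As an illustration of the kind of data conversion involved, the sketch below collapses coincident points into single records that carry a count, which is the general idea behind preparing event data for weighted analysis. The article does not describe how Collect Events is implemented, so the function name, the output layout, and the exact-match test for coincidence are assumptions.

```python
from collections import Counter


def collect_events(points):
    """Combine coincident points into sorted (x, y, count) records.

    Points are treated as coincident only when their coordinates match
    exactly; a real workflow might snap coordinates to a tolerance first.
    """
    counts = Counter(points)
    return [(x, y, count) for (x, y), count in sorted(counts.items())]


# Hypothetical incident locations, three of them at the same address.
incidents = [(1, 1), (2, 2), (1, 1), (1, 1)]
weighted = collect_events(incidents)   # [(1, 1, 3), (2, 2, 1)]
```

The count column can then serve as the attribute value that pattern and cluster statistics analyze.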
Be sure to open the ArcToolbox help files for the tools in the Spatial Statistics toolbox that are listed under the Geoprocessing section. These help files contain extensive information on each of the tools. In addition to descriptions of the methodology underlying each tool, there are sections on using each tool that include illustrations, usage tips, command line syntax, and scripting syntax.
Looking Under the Hood
The spatial statistics tools in ArcGIS were developed in Python. This free, open source scripting language has many advantages for users: readable code; support for object-oriented programming; and simple integration with C++, FORTRAN, and Java. Python ships with ArcGIS 9. The ArcGIS help topic "Writing Geoprocessing Scripts," listed in the Geoprocessing help section, provides a place to start learning about scripting in Python. Although numerous helpful Python Web sites exist, the official Web site for the Python language (www.python.org) is recommended.
The straightforward nature of Python makes it easy to examine how these scripts work. Writing these tools in Python also helps make spatial statistics more accessible to regular GIS users. It encourages the intelligent use of these tools by discouraging the "black box" approach to spatial statistics (i.e., inputting data and indiscriminately using the results with no understanding of the processes or any underlying constraints with regard to appropriateness or accuracy).
One of the goals the development team had for the spatial statistics tools was to start a conversation about spatial statistics so that, together, Esri and the user community could develop better tools and approaches and ask new questions about geographic data. The source code was provided to encourage users to learn from, modify, extend, and share these and other analysis tools. This approach has the added benefit of empowering users to work on issues in spatial statistics that have not been satisfactorily resolved with respect to messy, real-world data.
As with all GIS tools, obtaining valid results requires that users understand not only how the tools work and what alterations they make to the data but also the limitations of the data in terms of accuracy, scale, and other aspects. Without attention to these factors, errors may propagate and cause unreliable or incorrect results.
Modifying the Tools
An easy way to extend the functionality of any spatial statistics tool is to modify the script associated with it. The utilities tools are good candidates for this kind of experimentation. To view the Python script for any tool, right-click the tool in the ArcToolbox window and choose Edit from the context menu. This will invoke the PythonWin Interactive Window. PythonWin provides a Windows-based environment that allows the user to edit and reload files. Before modifying a script, copy it and work on the copy rather than the original tool script. Modifications can be as simple as changing the parameters of the script. The scripts are well commented, so opening and exploring them will help familiarize the user with Python and with how each tool works.
Visit the Esri Support Web site (support.esri.com) for spatial statistics tool updates and patches and to view related discussion forum responses. In addition to the extensive online help files, a new Esri Press book, Guide to GIS Analysis Volume II, provides a detailed discussion of the statistical analysis of geographic data and furnishes additional information on the processes used by the spatial statistics tools. Several excellent articles discussing spatial statistics can also be downloaded from the ArcGIS Geostatistical Analyst extension product page at the Esri Web site at www.esri.com/geostatisticalanalyst.