Leveraging Geoprocessing Functionality to Manage Enterprise Data
By Cynde Porter and Bob Earle, Sacramento County, California
While many of the tools used by GIS professionals appear flashy and futuristic, the day-to-day reality is that most of us are asked to perform a set of repetitive, time-consuming, and sometimes hefty tasks. While temporal three-dimensional modeling, real-time Web integration, and other techniques are practical and beneficial in many cases, the major portion of our job duties takes place in an enterprise setting.
Business practices at Sacramento County mandate that we publish our spatial as well as tabular data in a variety of formats. We maintain and support hundreds of spatial layers and are responsible for the publication of many that are essential to the business operations of various county departments. The most critical data layers include parcels, legal lots, subdivisions, proposed lots and subdivisions, regional street network, and districts (tax, fee, special).
Our primary land base is created and maintained using a gridded tile structure composed of more than 1,200 individual AutoCAD drawing files. After several needs analyses, we have determined that at least for the immediate future, it will be necessary to retain AutoCAD representations of our base data to support the business needs of our customers. From these source drawing files, we build and populate GIS data layers in shapefile and personal geodatabase format for dissemination to internal county agencies, regional partners, intranet and Internet Web applications, and external customers.
The original CAD data is converted using legacy ARC Macro Language (AML) scripts for the creation of the initial spatial features. To these spatial features, we append tabular data extracted from our Oracle Property Shared Database and departmental attribution maintained in Microsoft SQL Server. We process high volumes of data from many sources that are frequently updated or refreshed, then propagated into multiple configurations for distribution to the data consumers.
These processes are repetitive and time consuming but are necessary to satisfy customers who expect standardized products on a predictable and timely schedule. Beyond our own internal county departmental needs for data, a regional data sharing environment such as our Sacramento County GIS Cooperative makes dynamic conversion capability, highly reliable output, and documented processing history even more imperative. Other daily tasks, such as incorporation of early entry subdivisions into the parcel land base and street network, overlay analysis, and relational database population, have driven us to reexamine our internal GIS workflows to support more timely data dissemination.
With the advent of geoprocessing, Esri has provided a methodology for answering this challenge. These new toolsets offer a framework for automating many of the day-to-day tasks necessary to support a fully integrated enterprise-wide business operation. The resulting products are standardized and consistent, something that was previously quite a challenge for us to achieve in a multistep process. The functionality has allowed us to decrease the amount of time spent performing many of our scheduled tasks. One of our larger geoprocessing scripts has cut staff time from a six-day ordeal to a few hours. In addition, we can now schedule these jobs to run during evening hours to minimize impact on other computer applications and server resources.
Typically, we employ ModelBuilder to design initial process chains. ModelBuilder is an interactive environment in ArcGIS 9 that provides a graphical modeling framework for designing and implementing geoprocessing models that can include tools, scripts, and data. The largest process chain model handles the basic monthly build tasks [PDF-1.03 MB, 1 page]. After establishing these basic workflows, we export the model to Python to leverage its robust scripting environment and utilities.
Python scripting allows for batch processing, looping and conditional branching, messaging (error, informative, output), and string manipulation. We have found Python to be a user-friendly language with a minimal learning curve, useful for programmers and nonprogrammers alike. As part of a small GIS shop, we do not have the luxury of employing a full-time application developer, but we have found the geoprocessing object model, used in tandem with Python scripting, to be a relatively simple and powerful solution for accomplishing our daily tasks.
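As a rough illustration of those scripting features, the sketch below loops over a batch of layers, builds standardized output names with string manipulation, branches on success or failure, and reports each outcome as a message. It is not our production script: the names `output_name`, `run_batch`, and `fake_convert` are hypothetical, and `fake_convert` merely stands in for whatever geoprocessing tool call a real script would make.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
log = logging.getLogger("monthly_build")


def output_name(layer, fmt):
    """Build a standardized output name, e.g. 'Legal Lots' -> 'legal_lots.shp'."""
    base = "_".join(layer.strip().lower().split())
    extension = {"shapefile": ".shp", "geodatabase": ""}[fmt]
    return base + extension


def run_batch(layers, fmt, convert):
    """Call convert(layer, target) for each layer and report every outcome."""
    succeeded, failed = [], []
    for layer in layers:
        target = output_name(layer, fmt)
        try:
            convert(layer, target)
        except Exception as exc:
            # Error message: record the failure and keep processing the batch.
            log.error("%r failed: %s", layer, exc)
            failed.append(layer)
        else:
            # Informative message: record what was produced.
            log.info("%s -> %s", layer, target)
            succeeded.append(target)
    if failed:
        log.warning("%d of %d layers need attention", len(failed), len(layers))
    return succeeded, failed


def fake_convert(layer, target):
    """Hypothetical stand-in for a geoprocessing call; rejects blank names."""
    if not layer.strip():
        raise ValueError("empty layer name")


ok, bad = run_batch(["Parcels", "Legal Lots", " "], "shapefile", fake_convert)
```

Because a failure is caught per layer, one bad input does not stop the rest of the batch, which matters for jobs left to run unattended overnight.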
While the bulk of the tabular data is maintained in fully normalized relational databases, it is often necessary to apply some intelligent translation to denormalize this information so that datasets can be integrated and packaged to satisfy diverse business-driven requirements. Normalized data is customarily not the fastest performer for Web queries and feature labeling. In addition, it is often necessary to concatenate labeling fields and join attribution from business and lookup tables.
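A minimal sketch of that kind of denormalization follows, using plain Python dictionaries in place of the actual database tables. All field names, codes, and values here are invented for illustration; the function resolves a district code against a lookup table and concatenates address parts into a single labeling field.

```python
def denormalize(parcels, districts):
    """Flatten parcel rows: resolve the district code and build a label field."""
    flat = []
    for row in parcels:
        # Join against the lookup table; flag codes that have no match.
        district = districts.get(row["district_code"], "UNKNOWN")
        # Concatenate the non-empty address parts into one labeling field.
        parts = (row["street_no"], row["street"])
        label = " ".join(part for part in parts if part)
        flat.append({"apn": row["apn"], "district": district, "label": label})
    return flat


# Hypothetical lookup table and normalized parcel rows.
DISTRICTS = {"F01": "Fire District 1", "W07": "Water District 7"}
PARCELS = [
    {"apn": "001", "street_no": "700", "street": "H St", "district_code": "F01"},
    {"apn": "002", "street_no": "", "street": "Elm Ave", "district_code": "X99"},
]

flat = denormalize(PARCELS, DISTRICTS)
for record in flat:
    print(record["apn"], record["label"], record["district"])
```

The flattened rows carry everything a Web query or label expression needs in one record, at the cost of repeating the district name on every parcel.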