Geoprocessing can be used to automate many aspects of data compilation, including implementing a geodatabase topology. In my previous post, I created a model that imports data into a file geodatabase and performs initial data cleanup processing. Now, I will use geoprocessing with Python scripting to build a topology on the imported data. Although many spatial integrity issues were resolved by running the model, a geodatabase topology can help check for and repair any remaining errors.
The dataset I am using is a feature class of parcel lot line boundaries that began as a CAD file full of topological inconsistencies, such as lines that overlapped or did not connect to other lines. Once I create the topology and am sure the lines are correct, I will build parcel polygons from the lines and introduce all the features into my production enterprise geodatabase.
Creating a topology in a script
To build topology in an automated manner, I am going to write a simple Python script, add it to a toolbox, and run it as a regular geoprocessing tool. Essentially, the script just performs the functions of the New Topology wizard, but without requiring my intervention. In fact, this script could be developed without any programming by creating a model to perform these tasks and exporting it from ModelBuilder as a Python script.
When scripting, the ArcPy site package allows Python to access and run any of the geoprocessing tools in the ArcGIS system toolboxes, including the topology tools. The Topology toolset in the Data Management toolbox contains all the tools I need to add a geodatabase topology to the line feature class. The import arcpy statement adds ArcPy to a script and is at the top of every ArcGIS Python script.
Before I start adding the tools to the script, I define variables for the paths to the feature dataset, feature class, and topology. Because I created folders to store my tools and data following the recommendations in A structure for sharing tools, I can make the script more portable by setting these paths in relation to the folder containing the script. Since the folder locations are not hard-coded to an absolute path on my C: drive, the script should run on someone else’s machine regardless of that machine’s directory structure.
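As a rough sketch of the relative-path idea, the function below derives the three paths from the folder that holds the script. The folder names (Scripts, ToolData) and the geodatabase, feature dataset, feature class, and topology names are placeholders I chose for illustration; they are not the names used in the sample data.

```python
import os


def build_paths(script_dir, gdb_name="Parcels.gdb", dataset_name="LotLines",
                feature_class_name="ParcelLines",
                topology_name="ParcelLines_Topology"):
    """Return data paths resolved relative to the folder that holds the script.

    Assumed (hypothetical) layout:
        <toolbox folder>/Scripts/<this script>
        <toolbox folder>/ToolData/<geodatabase>
    """
    # Step up from the Scripts folder, then down into ToolData.
    geodatabase = os.path.normpath(
        os.path.join(script_dir, os.pardir, "ToolData", gdb_name))
    feature_dataset = os.path.join(geodatabase, dataset_name)
    return {
        "feature_dataset": feature_dataset,
        "feature_class": os.path.join(feature_dataset, feature_class_name),
        # A topology is stored inside the feature dataset it belongs to.
        "topology": os.path.join(feature_dataset, topology_name),
    }
```

Inside the script tool, script_dir could be obtained with os.path.dirname(os.path.abspath(__file__)), so nothing depends on where the toolbox folder was copied.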
With the paths defined, I first use the Create Topology tool to add a topology to the feature dataset. The syntax for the tool in my script is arcpy.CreateTopology_management(featureDataset, topologyName, ""), where CreateTopology is the name of the Create Topology tool, _management is the alias of the toolbox in which it resides, featureDataset is a variable I defined earlier representing the path to the feature dataset, and topologyName is a variable for the name of the topology. When working with topology, it is recommended to use the default cluster tolerance, which is the distance within which vertices are determined to be coincident. Since I want ArcGIS to calculate the cluster tolerance, I left the value blank as "" instead of supplying one.
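The call described above could be wrapped as follows. This is only a sketch: the function name and the gp parameter are my own. The arcpy module is passed in as gp so the function can be dry-run with a stand-in object on a machine without ArcGIS.

```python
import os


def create_empty_topology(gp, feature_dataset, topology_name,
                          cluster_tolerance=""):
    """Create an empty topology and return its path.

    gp is expected to be the arcpy module. An empty cluster_tolerance
    leaves the value for ArcGIS to calculate.
    """
    gp.CreateTopology_management(feature_dataset, topology_name,
                                 cluster_tolerance)
    # The topology is stored inside the feature dataset, so its path is
    # the dataset path plus the topology name.
    return os.path.join(feature_dataset, topology_name)
```

In an ArcGIS Python session this would be called as create_empty_topology(arcpy, featureDataset, topologyName) after import arcpy; the returned path is what the later tools take as their topology argument.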
Although that function creates a topology, the topology is still empty and has no feature classes or rules in it. I can use the Add Feature Class To Topology tool to add my line feature class with the statement: arcpy.AddFeatureClassToTopology_management(topology, featureClass, "1", "1"). If the topology contained multiple feature classes, I could set ranks so the feature class with the highest accuracy is not adjusted to match vertices in a feature class that is known to be less accurate. However, since I am using only one feature class, I leave the rank parameters as "1".
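To illustrate how ranks would come into play with more than one feature class, here is a hedged sketch. The function name is mine, gp stands for the arcpy module, and using the same value for the XY and Z ranks is a simplifying assumption.

```python
def add_classes_with_ranks(gp, topology, ranked_classes):
    """Add feature classes to a topology with an accuracy rank.

    ranked_classes is an iterable of (feature_class_path, rank) pairs,
    where rank 1 marks the most accurate data. gp is expected to be the
    arcpy module.
    """
    for feature_class, rank in ranked_classes:
        # Same rank for XY and Z (an assumption made for simplicity).
        gp.AddFeatureClassToTopology_management(
            topology, feature_class, rank, rank)
```

For the single lot line feature class in this post, the call would be add_classes_with_ranks(arcpy, topology, [(featureClass, 1)]). With a second, less accurate feature class (hypothetical here), that class would be given rank 2 so its vertices move to match the rank 1 features rather than the other way around.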
Next, I set which topology rules to include by calling the Add Rule To Topology tool. There are a few rules that many editors add to every topology, such as the line rule Must Not Have Dangles. In the script, this rule is coded as arcpy.AddRuleToTopology_management(topology, "Must Not Have Dangles (Line)", featureClass, "", "", ""). Because the rule applies to a single feature class that does not have subtypes, and is not a rule between two feature classes or subtypes, the other parameters are left blank. I also want to make sure that the lines do not overlap or intersect themselves, so I include the Must Not Overlap and Must Not Self-Intersect rules as well. If I need to add more rules later, I can run the Add Rule To Topology tool or use the topology’s Properties dialog box in the Catalog window.
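Since all three line rules take the same arguments, they can be added in a loop. The sketch below assumes gp is the arcpy module; the function and constant names are my own.

```python
# Rule names as the Add Rule To Topology tool expects them.
LINE_RULES = (
    "Must Not Have Dangles (Line)",
    "Must Not Overlap (Line)",
    "Must Not Self-Intersect (Line)",
)


def add_line_rules(gp, topology, feature_class, rules=LINE_RULES):
    """Add single-feature-class line rules to a topology.

    gp is expected to be the arcpy module.
    """
    for rule in rules:
        # The subtype, second feature class, and second subtype are left
        # blank because each rule involves one feature class, no subtypes.
        gp.AddRuleToTopology_management(
            topology, rule, feature_class, "", "", "")
```

Keeping the rule names in one tuple makes it easy to add or drop a rule without touching the loop.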
Once all the feature classes and rules have been added, my script validates the topology. The Validate Topology tool identifies features that share geometry, inserts common vertices into those features, then performs integrity checks to identify any violations of the rules that I defined for the topology. Since the topology has never been validated before, I am going to validate the entire extent of the data with the syntax arcpy.ValidateTopology_management(topology, "Full_Extent"). However, when working in ArcMap, validating the visible extent of the map instead of the full extent limits the area to be validated and can be useful for very large datasets that take a long time to validate.
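A small wrapper for the validation step might look like this. It is a sketch: gp stands for the arcpy module, and the check on the extent keyword is my own addition to catch typos before the tool runs.

```python
VALID_EXTENTS = ("Full_Extent", "Visible_Extent")


def validate_topology(gp, topology, extent="Full_Extent"):
    """Validate a topology over the full or the visible extent.

    gp is expected to be the arcpy module. Visible_Extent is only
    meaningful when the tool runs inside ArcMap.
    """
    if extent not in VALID_EXTENTS:
        raise ValueError("extent must be one of %s" % (VALID_EXTENTS,))
    gp.ValidateTopology_management(topology, extent)
```

A script that runs outside ArcMap, for example as a scheduled task, would simply keep the default full extent.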
Finding and fixing topology errors
After the script runs, I add the resulting topology to ArcMap so I can inspect the results and fix any errors using the ArcMap editing tools. Thanks to the automated QA work already done by my Import and Clean Lines model, the remaining manual edits are minimal compared to what they would have been without running it first.
The errors identified with topology are indicated by the orange-colored squares. Almost all of these are dangle errors that could not be fixed by the original model since they exceeded the tolerance value for the Extend, Trim, or Snap tools. The topology found only two Must Not Overlap errors, which I can fix by deleting one of the overlapping features. There are no violations of the Must Not Self-Intersect rule, indicating that the lines were split properly in the model. Using the editing tools on the Topology toolbar, such as the Error Inspector and Fix Topology Error tool, I can review each error to determine if the built-in topology fixes can be used or if the lines should be edited manually to resolve the topology error.
In some cases, the topology error may need to be marked as an exception, which is a valid violation of a topology rule. One of the most common examples of an exception to the Must Not Have Dangles rule is a cul-de-sac, a dead-end road that does not connect to another road at one end. However, when working with parcel lot lines, there are fewer scenarios that are valid exceptions. I do have some lines at the edges of the dataset that do not connect to other lines. I can either mark these as exceptions or choose to delete the features, depending on whether these features are supposed to connect to existing features in my enterprise database.
If an edit is made to correct a topology error, I have to validate the topology again to make sure the error no longer exists. After I perform a visual inspection and fix all the remaining topology errors, I can create new polygons representing landownership parcels using the geometry of the lines. If I attempted to create polygons from lines that do not connect to each other properly, either no polygons would be created or one large polygon would result where there should actually be two polygons. After creating the new polygons, I add the polygon feature class to the topology and check for any gaps or overlaps and make sure the parcel polygon boundaries are always coincident with the lot lines.
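If the polygon step were scripted as well, it might look roughly like the sketch below. I am assuming the Feature To Polygon tool is used to build the parcels and that the three area rules listed match the checks described above; the post does not say which tool or rules were actually used. gp stands for the arcpy module, and the function and constant names are mine.

```python
# Area rules assumed to match "no gaps, no overlaps, boundaries on lot lines".
POLYGON_RULES = ("Must Not Have Gaps (Area)", "Must Not Overlap (Area)")
COVERAGE_RULE = "Boundary Must Be Covered By (Area-Line)"


def build_and_check_parcels(gp, topology, line_fc, polygon_fc):
    """Build parcel polygons from lines and check them in the topology.

    gp is expected to be the arcpy module. polygon_fc must be created in
    the same feature dataset as the topology.
    """
    # Assumption: Feature To Polygon creates the parcels from the lines.
    gp.FeatureToPolygon_management(line_fc, polygon_fc)
    gp.AddFeatureClassToTopology_management(topology, polygon_fc, 1, 1)
    for rule in POLYGON_RULES:
        gp.AddRuleToTopology_management(
            topology, rule, polygon_fc, "", "", "")
    # Two-class rule: polygon boundaries must lie on the lot lines.
    gp.AddRuleToTopology_management(
        topology, COVERAGE_RULE, polygon_fc, "", line_fc, "")
    # New classes and rules make the topology dirty, so validate again.
    gp.ValidateTopology_management(topology, "Full_Extent")
```

Note that Must Not Have Gaps also reports the outer edge of the parcel fabric as an error, which would normally be marked as an exception.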
After implementing the topology and making edits in ArcMap, the lines and the polygons created from them meet standards for our spatial data. I can now introduce the features into the production enterprise geodatabase.
While a script tool or model may take time to set up initially, in the long run, it is quicker to automate data compilation tasks through geoprocessing whenever possible. I can run a tool as needed and re-run it later with different parameters and tolerances or apply it to other datasets. Scripts are particularly useful because they can be run at specified times as a scheduled task in Windows. For example, I could combine the tools presented in these blog entries into a Python script that imports a dataset into a geodatabase, processes it, and implements topology. If I set the script to run automatically in the evening after working hours, I am ready to start editing on clean data when I come into the office the next morning.
For more information:
The sample tools and data can be downloaded from the Editing Labs group on ArcGIS.com. An ArcInfo license is required to run the tools.
Content provided by Rhonda (Editing Team)