Dicing Godzillas (features with too many vertices)

Vertices are the x,y coordinate pairs that define the shape of a feature, and the size of an individual feature (polygon, polyline, or multipoint) is defined by its number of vertices. When a single feature has a million or so vertices, it can cause out-of-memory errors and, in some cases, a system crash – never a good thing. We call such gargantuan features ‘Godzillas’ because they wreak havoc on your computer’s resources. Godzillas are usually long, crenulated coastlines or street casings digitized at a high degree of accuracy.

Operations that are particularly vulnerable to Godzillas are:

Godzillas typically raise the geoprocessing error codes 000426 (Out Of Memory), 010005 (Unable to allocate memory), 999998 (Something unexpected caused the tool to fail), and 999999 (Something unexpected caused the tool to fail).

How Godzillas are created

How many vertices define a Godzilla?

Unfortunately, there is no simple answer, since it depends entirely on how much memory your machine has available: more memory means more vertices per feature can be processed. You can increase available memory by closing all applications other than ArcGIS, turning off background processing as described in the Desktop help topic Foreground and background processing, and re-running your operation. But this is a one-time fix; what you really need to do is get rid of the Godzilla.

Finding Godzillas

If you think you’ve got a Godzilla, your first task is to count the number of vertices for every feature in your feature class. The recipe for this is:

  1. Use the Add Field tool to add a new field named VERTEXCOUNT. The field type is LONG.
  2. Next, use the Calculate Field tool with this expression: !shape!.pointcount, as illustrated below.
  3. After Calculate Field runs, open the feature class attribute table and sort on the VERTEXCOUNT column, or use the Summary Statistics tool to find the MAX of VERTEXCOUNT.

The VERTEXCOUNT field you added and calculated is not automatically recalculated or maintained. Any time the geometry of a feature changes, you’ll have to run Calculate Field again. Of course, if you no longer need the VERTEXCOUNT field, use Delete Field to remove it.
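
The recipe above can also be scripted. What follows is a minimal sketch, assuming an ArcGIS Pro Python environment where arcpy is installed. The function names add_vertex_counts and largest_features are hypothetical helpers, not part of arcpy, and the "PYTHON3" expression type is the ArcGIS Pro keyword (ArcMap uses "PYTHON_9.3" instead).

```python
def add_vertex_counts(feature_class, field_name="VERTEXCOUNT"):
    """Add and populate a vertex-count field, then return (object ID, count) pairs.

    Assumes an ArcGIS Python environment; the field must not already exist.
    """
    import arcpy  # imported lazily: only available where ArcGIS is installed

    # Step 1: add a LONG field to hold the count.
    arcpy.management.AddField(feature_class, field_name, "LONG")
    # Step 2: fill it with each feature's vertex count.
    # "PYTHON3" is the ArcGIS Pro keyword; ArcMap expects "PYTHON_9.3".
    arcpy.management.CalculateField(
        feature_class, field_name, "!shape!.pointcount", "PYTHON3"
    )
    # Step 3: read the counts back so they can be sorted and inspected.
    with arcpy.da.SearchCursor(feature_class, ["OID@", field_name]) as cursor:
        return [(oid, count) for oid, count in cursor]


def largest_features(counts, top_n=5):
    """Return the top_n (object ID, vertex count) pairs, largest count first."""
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling largest_features(add_vertex_counts(r"C:\data\demo.gdb\coastlines")) would list the five biggest features; the path is a placeholder. Because the field is not maintained automatically, the Calculate Field step has to be repeated after any geometry edit.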

Simplifying your data

The first question you should ask is whether you really need all those vertices to describe the shape of your Godzilla. If you don’t, use the Simplify Polygon or Simplify Line tool. These tools weed out unnecessary vertices. After running Simplify Polygon or Simplify Line, recalculate VERTEXCOUNT by running Calculate Field again. If the vertex count does not drop dramatically (or the tool fails to run because of the Godzilla), you’ll need to dice your Godzilla as described next.

Using the Dice tool

The Dice tool takes input features and a vertex limit and outputs a new feature class with diced features, as illustrated below. The Dice tool works with multipoints, lines, and polygons.
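
In a script, the tool can be called through arcpy. The sketch below assumes an ArcGIS Python environment; dice_godzillas is a hypothetical wrapper, and its max_vertex_count argument is assumed to be the largest vertex count you measured for the input. The guard is there because a limit at or above that count would leave even the largest feature undivided.

```python
def dice_godzillas(in_features, out_feature_class, vertex_limit, max_vertex_count):
    """Run the Dice tool after checking that the limit will actually split something.

    max_vertex_count is the largest per-feature vertex count in in_features
    (an input you supply; this wrapper does not compute it).
    """
    if not 0 < vertex_limit < max_vertex_count:
        raise ValueError(
            "vertex_limit must be positive and below the largest vertex count "
            f"({max_vertex_count}); got {vertex_limit}"
        )

    import arcpy  # imported lazily: only available where ArcGIS is installed

    # Features with more vertices than vertex_limit are subdivided;
    # the result is written to a new feature class.
    arcpy.management.Dice(in_features, out_feature_class, vertex_limit)
    return out_feature_class
```

For example, dice_godzillas("coastlines", "coastlines_diced", 10000, 1500000) would write the diced features to a new feature class; the names and numbers here are placeholders, not recommendations.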

Choosing a vertex limit for the Dice tool

Obviously, the vertex limit value for the Dice tool needs to be less than the maximum of VERTEXCOUNT. Smaller vertex limits create more features, but this is hardly ever a concern, since adding more features is rarely a computational issue; it’s the Godzilla that’s the problem. Here are some suggestions:

Apportioning attributes

When using Dice (or any overlay tool found in the Overlay toolset), all attribute values from the input feature class are carried across to the output feature class. If any of the input attributes contain values that are apportioned by area (such as a population count), you’ll want these attribute values to be apportioned among the new features created by Dice. To apportion attributes, use the Make Feature Layer tool and check “Use Ratio Policy” for any attribute that needs to be apportioned by area, and use the output of Make Feature Layer as the input to Dice. The ratio is based on the ratio in which the original geometry is divided. If the geometry is divided in half, each new feature’s attribute gets one-half of the value of the original object’s attribute.
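
To make the ratio arithmetic concrete, here is a small worked example in plain Python. It only illustrates the proportional split described above and is not the tool's implementation; the apportion function and the sample numbers are made up for illustration.

```python
def apportion(value, part_areas):
    """Split an attribute value among pieces in proportion to their areas."""
    total_area = sum(part_areas)
    if total_area <= 0:
        raise ValueError("part_areas must sum to a positive area")
    # Each piece receives the share of the value matching its share of the area.
    return [value * area / total_area for area in part_areas]


# A polygon with a population of 1000, cut into two equal halves:
halves = apportion(1000, [50, 50])    # [500.0, 500.0]
# The same polygon cut into a quarter and three quarters:
uneven = apportion(1000, [1, 3])      # [250.0, 750.0]
```

Whatever the split, the apportioned values add back up to the original value, which is the property you want for counts such as population.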

Maintaining parentage

The Dice tool does not maintain the parent Object ID of the original feature, so unless you have a unique ID field, you’ll have no way of knowing the original feature from which the new feature was created. Therefore, you should add a unique ID field before Dice is run. If you don’t already have a unique ID field, do this:

Geometry errors

The Check Geometry tool identifies possible geometry errors such as null coordinates, empty rings, and self-intersections. Godzillas, because of their size, are prime suspects for such errors. The Dice tool cannot check the geometry of the input features, since the check itself may fail on the Godzilla, so any errors in the input will be written to the output. If you haven’t run Check Geometry prior to running Dice, you should run Check Geometry on the output of Dice.

Geometry changes after running Dice

Multipart Points

This post was contributed by Ken Hartling, a product engineer on the geoprocessing team
