ArcGIS Blog

Performance of the new spatial_filter parameter for Search Cursors

By Kimberly Saballett

In ArcGIS Pro 3.2, three new parameters were added to arcpy.da.SearchCursor that let you search a feature class by passing a geometry to this Python class. The new spatial_filter parameter is the geometry object by which features are filtered. The spatial_relationship parameter is the spatial relationship between the input features and the query geometry passed as spatial_filter. The default is INTERSECTS, which is what we will use in these tests. The third parameter is search_order, which can be either ATTRIBUTEFIRST (default) or SPATIALFIRST. This parameter is only applicable for data stored in an enterprise geodatabase.
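As a rough illustration of how these parameters might be passed, here is a minimal sketch. The helper name `oids_matching` and its `cursor_factory` argument are hypothetical and are not part of arcpy; the factory exists only so the sketch can be exercised without an ArcGIS install, and it falls back to arcpy.da.SearchCursor when omitted.

```python
def oids_matching(table, filter_geometry, search_order=None, cursor_factory=None):
    """Return the object IDs of features in `table` that intersect `filter_geometry`."""
    if cursor_factory is None:
        # Imported lazily so the helper can be tried without ArcGIS Pro installed.
        import arcpy
        cursor_factory = arcpy.da.SearchCursor

    kwargs = {
        "spatial_filter": filter_geometry,      # geometry used to filter features
        "spatial_relationship": "INTERSECTS",   # the default, stated explicitly
    }
    if search_order is not None:
        # Only meaningful for enterprise geodatabase data.
        kwargs["search_order"] = search_order

    with cursor_factory(table, ["OID@"], **kwargs) as cursor:
        return [row[0] for row in cursor]
```

A call might look like `oids_matching(counties_fc, river_geometry)`, where both names are placeholders for a feature class path and an arcpy geometry object.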

We will analyze how well the new spatial filter of arcpy.da.SearchCursor performs at limiting records by testing point, line, and polygon input geometries across multiple data sources: a file geodatabase, an enterprise geodatabase, and an ArcGIS Online feature service. We compare three approaches: the new spatial filter on the Search Cursor, a client-side relational operation after running the Search Cursor, and Select By Location before the Search Cursor.

Summary of the data

We will use three different test cases to demonstrate that performance is improved across varying geometries for the input and features to be filtered.

 

A map with a blue polyline representing the Mississippi River against gray polygons representing the US Counties.

Case 1: US counties that intersect the Mississippi River

Short name: US counties by Mississippi River

Target: US counties, 3,140 polygon features (gray polygons)

Selecting geometry: Mississippi River (blue line)

Number of records selected: 114

Purpose: The selecting feature is quite large in contrast to the entire extent of the data.

 

A map containing one hundred thousand gray points with only a blue polygon of Italy visible over them.

Case 2: Select random points that intersect with Italy

Short name: 100k pts by Italy

Target: Random points, 100k points (gray points)

Selecting geometry: Italy (blue polygon)

Number of records selected: 947

Purpose: A large volume of small geometry objects (points) against a single polygon where only about 1 percent of the input features get through the filter.

 

A map of the world countries in gray polygons and a single blue point over Ukraine.

Case 3: Select world countries that intersect a point

Short name: World countries by 1 point

Target: World countries, 248 polygons (gray polygons)

Selecting geometry: A single point (blue point)

Number of records selected: 1

Purpose: The simplest query of searching for what is at a specific location.

 

Spatial filtering approaches

There are three approaches to filtering and retrieving data that we will contrast in this analysis. The goal for each of them is to retrieve the object ID field (OID) of the spatially filtered subset of the input records.

All three approaches produce the same set of features for each individual case.

Approach name: Cursor with spatial filter

Description: Use the spatial_filter parameter added to arcpy.da.SearchCursor at ArcGIS Pro 3.2, so that filtering is done in the database.

Approach name: Cursor client-side relational operator

Description: Use the regular arcpy.da.SearchCursor, fetching all records and then using the arcpy.Geometry disjoint() relational operator to select the intersecting records (a feature intersects the selecting geometry when the two are not disjoint).
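The client-side approach could be sketched roughly as follows. The helper name `oids_client_side` and the `cursor_factory` argument are hypothetical, and the sketch assumes the selecting geometry and the feature geometries share a spatial reference.

```python
def oids_client_side(table, filter_geometry, cursor_factory=None):
    """Fetch every record, then keep those whose shape is not disjoint from the filter."""
    if cursor_factory is None:
        # Imported lazily so the helper can be tried without ArcGIS Pro installed.
        import arcpy
        cursor_factory = arcpy.da.SearchCursor

    matching = []
    # SHAPE@ returns a full geometry object for every row, which is the costly part.
    with cursor_factory(table, ["OID@", "SHAPE@"]) as cursor:
        for oid, shape in cursor:
            # Rows with null geometry cannot intersect anything, so skip them.
            if shape is not None and not shape.disjoint(filter_geometry):
                matching.append(oid)
    return matching
```

Every geometry crosses the wire and is instantiated on the client before the test runs, which is why this approach is expected to be sensitive to feature size and to remote data sources.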

Approach name: Select by location

Description: Perform SelectLayerByLocation on a layer of the dataset with the Intersect relationship (overlap_type="INTERSECT"), and use arcpy.da.SearchCursor on the layer to retrieve only the selected features.

Note that the layer is created once and that cost is not included in the performance timing.
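A minimal sketch of the select by location approach is shown below. The helper name `oids_select_by_location` and the `select_tool` and `cursor_factory` arguments are hypothetical; `layer` is assumed to be a feature layer created beforehand (for example with MakeFeatureLayer), matching the note above that layer creation is not timed.

```python
def oids_select_by_location(layer, filter_geometry, select_tool=None, cursor_factory=None):
    """Select features on `layer` that intersect the geometry, then read their OIDs."""
    if select_tool is None or cursor_factory is None:
        # Imported lazily so the helper can be tried without ArcGIS Pro installed.
        import arcpy
        select_tool = select_tool or arcpy.management.SelectLayerByLocation
        cursor_factory = cursor_factory or arcpy.da.SearchCursor

    # Replace any existing selection with the features that intersect the geometry.
    select_tool(
        in_layer=layer,
        overlap_type="INTERSECT",
        select_features=filter_geometry,
        selection_type="NEW_SELECTION",
    )
    # A cursor opened on a layer with a selection returns only the selected rows.
    with cursor_factory(layer, ["OID@"]) as cursor:
        return [row[0] for row in cursor]
```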

Results

The following results demonstrate that the new approach with the spatial_filter parameter on arcpy.da.SearchCursor does well in comparison to the other two approaches.

Each chart for the individual test cases contains a bar for each spatial filtering approach that represents the range of processing times in seconds across the three data sources. The bottom of the bar represents the fastest processing time, while the top represents the slowest. The approaches are color-coded: gray for the search cursor with spatial filter, blue for the search cursor with the client-side relational operator, and yellow for select layer by location before the search cursor.
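To make the meaning of each bar concrete, the range for an approach can be derived from per-data-source timings along these lines. The function name and the input layout (a dictionary keyed by approach and data source) are assumptions made for illustration, not the structure used to produce the charts.

```python
def time_range_by_approach(timings):
    """Collapse {(approach, data_source): seconds} into {approach: (fastest, slowest)}."""
    ranges = {}
    for (approach, _data_source), seconds in timings.items():
        # Start the range at the first value seen, then widen it as needed.
        low, high = ranges.get(approach, (seconds, seconds))
        ranges[approach] = (min(low, seconds), max(high, seconds))
    return ranges
```

Each resulting tuple corresponds to one bar: the first value is the bottom of the bar and the second is the top.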

The caveat to these results is that performance will vary with different datasets. A variety of other factors can also affect performance: physical hardware, system configuration, network performance and network proximity, and database tuning and load. Many of these factors can be particularly impactful when processing remote data sources. The performance times used in the analysis below provide some observed numbers, but these comparisons should not be taken as absolute or definitive.
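The article does not describe how its timings were collected. If you want to take comparable measurements on your own data, one common pattern is to repeat each call a few times and record the spread, as in this sketch (the helper name `time_call` and the default of three repeats are arbitrary choices for illustration):

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Call `func` several times and return (fastest, slowest) wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return min(durations), max(durations)
```

Repeating the call helps expose run-to-run variation from caching, network latency, and database load, which are the factors noted above.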

Interpretation of the US counties by Mississippi River test case

This case best shows the performance improvement from using the new parameter to keep larger polygons from being fetched from the data source and then converted into arcpy.Polygon objects for comparison. As you can see in the graph below, the search cursor with the spatial filter had a faster range of processing times across all three data sources than either of the other two approaches. The lower end of the range for the client-side relational operator was faster than one of the tests for the spatial filter. However, for that same data source it was still slower than the cursor with spatial filter.

A bar chart representing the range of processing times for the three approaches for each data source for the US Counties by Mississippi data set.

Interpretation of the 100k points by Italy test case

The new approach for the cursor with the spatial_filter parameter was the fastest across all tests in this set as well. Even the fastest time for the client-side relational operator was slower than the slowest processing time when using spatial_filter, as evidenced by the non-overlapping ranges. Select by location has a similar range in the chart below, but if you compare the data sources individually, the cursor with spatial filter was faster for all three and has an overall narrower range.

A bar chart representing the range of processing times for the three approaches for each data source for the 100k Points by Italy data set.

Interpretation of the World countries by 1 point test case

In this test case, you can see that the range of processing times for the cursor with spatial filter is still narrower than that of the client-side relational operator and faster overall. The ranges for select by location and cursor with spatial filter do not even overlap: the slowest processing time for the spatial filter was faster than the quickest processing time for select by location.
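The reasoning about non-overlapping ranges can be written down as a small check. Both function names are hypothetical, and each range is assumed to be a (fastest, slowest) pair in seconds.

```python
def ranges_overlap(a, b):
    """True when two (fastest, slowest) ranges share at least one value."""
    (a_low, a_high), (b_low, b_high) = a, b
    return a_low <= b_high and b_low <= a_high


def always_faster(a, b):
    """True when the slowest time in `a` still beats the fastest time in `b`."""
    return a[1] < b[0]
```

When `always_faster(a, b)` holds, the two ranges cannot overlap, which is the situation described for the spatial filter and select by location in this test case.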

A bar chart representing the range of processing times for the three approaches for each data source for the World Countries by one point data set.

Conclusion

As demonstrated through this series of tests (27 individual tests overall), the new arcpy.da.SearchCursor spatial_filter parameter performs better than the other approaches for all three types of selecting geometry. Across all three test cases you can see that its range of processing times is generally much smaller than that of the other approaches, which can vary quite a lot depending on the data source.
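The count of 27 follows from crossing the three test cases with the three approaches and the three data sources. A short sketch of that test matrix, using the short names from this article as labels (the constant names are made up for illustration):

```python
from itertools import product

CASES = (
    "US counties by Mississippi River",
    "100k pts by Italy",
    "World countries by 1 point",
)
APPROACHES = (
    "Cursor with spatial filter",
    "Cursor client-side relational operator",
    "Select by location",
)
DATA_SOURCES = (
    "File geodatabase",
    "Enterprise geodatabase",
    "ArcGIS Online feature service",
)

# One entry per individual test: 3 cases x 3 approaches x 3 data sources.
TEST_MATRIX = list(product(CASES, APPROACHES, DATA_SOURCES))
```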

Because the spatial_filter approach relies on the database to perform the spatial query and avoids pulling every geometry from the database, it has a distinct performance advantage, particularly with remote data sources, but as shown above it performs extremely well in all cases. So for scripts or workflows where performance is a consideration, the new spatial_filter parameter may be a good option to evaluate.
