{"id":2933391,"date":"2025-08-25T09:11:01","date_gmt":"2025-08-25T16:11:01","guid":{"rendered":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=2933391"},"modified":"2025-08-25T09:11:01","modified_gmt":"2025-08-25T16:11:01","slug":"three-ways-to-improve-your-geocoding-performance","status":"publish","type":"blog","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","title":{"rendered":"Three ways to improve geocoding performance"},"author":342532,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","format":"standard","meta":{"_acf_changed":false,"_searchwp_excluded":""},"categories":[23341],"tags":[586931,765912,25091],"industry":[],"product":[765842],"class_list":["post-2933391","blog","type-blog","status-publish","format-standard","hentry","category-analytics","tag-apache-spark","tag-arcgis-geoanalytics-engine","tag-geocoding","product-geoanalytics-engine"],"acf":{"authors":[{"ID":303712,"user_firstname":"Xirui","user_lastname":"Xu","nickname":"Xirui Xu","user_nicename":"xiruixu","display_name":"Xirui Xu","user_email":"xiruixu@esri.com","user_url":"","user_registered":"2022-03-08 15:27:53","user_description":"Product Engineer for ArcGIS GeoAnalytics","user_avatar":"<img data-del=\"avatar\" src='https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/Xirui_Xu-213x200.jpg' class='avatar pp-user-avatar avatar-96 photo ' height='96' width='96'\/>"},{"ID":394531,"user_firstname":"Ben","user_lastname":"Burnett","nickname":"Ben Burnett","user_nicename":"bburnett","display_name":"Ben Burnett","user_email":"bburnett@esri.com","user_url":"","user_registered":"2025-08-19 15:46:42","user_description":"","user_avatar":"<img alt='' src='https:\/\/secure.gravatar.com\/avatar\/4ecc779d278bc912c7cdfc86ca7bf7ea2c0e19c001060d480f0e875f800cfd05?s=96&#038;d=blank&#038;r=g' 
srcset='https:\/\/secure.gravatar.com\/avatar\/4ecc779d278bc912c7cdfc86ca7bf7ea2c0e19c001060d480f0e875f800cfd05?s=192&#038;d=blank&#038;r=g 2x' class='avatar avatar-96 photo' height='96' width='96' loading='lazy' decoding='async'\/>"}],"short_description":"Tips and tricks to improve geocoding performance by optimizing your Apache Spark cluster.","flexible_content":[{"acf_fc_layout":"content","content":"<p>Geocoding is the process of converting addresses into geographic coordinates and is a common task in many data processing pipelines. ArcGIS GeoAnalytics Engine includes geocoding tools to take advantage of the scalability of Apache Spark to process large volumes of addresses and locations quickly. In this post, we\u2019ll introduce a few tips and tricks on what to look for in your Spark cluster to improve geocoding performance using <a href=\"https:\/\/www.esri.com\/en-us\/arcgis\/products\/arcgis-geoanalytics-engine\/overview?aduc=PublicRelations&amp;sf_id=7015x000001RnGtAAK&amp;adut=8-2025&amp;aduco=three-ways-improve-geocoding&amp;aduca=CRAArcGISGeoAnalyticsEngine&amp;utm_id=7015x000001RnGtAAK&amp;adum=Blog&amp;utm_campaign=CRAArcGISGeoAnalyticsEngine&amp;utm_source=PublicRelations&amp;utm_medium=Blog\" target=\"_blank\" rel=\"noopener\">GeoAnalytics Engine<\/a>.<\/p>\n"},{"acf_fc_layout":"content","content":"<h2><strong>#1 Partitioning: The Key to Improving Geocoding Performance<\/strong><\/h2>\n<p>In Spark, data partitioning plays a crucial role in geocode performance. 
To understand why, it\u2019s helpful to have some background on how Spark processes a job.<\/p>\n<h3>Spark Fundamentals<\/h3>\n"},{"acf_fc_layout":"image","image":{"ID":2933401,"id":2933401,"title":"how_geocoding_works_in_spark","filename":"how_geocoding_works_in_spark.png","filesize":42852,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\/how_geocoding_works_in_spark","alt":"Diagram of how Geocoding operation works in Spark","author":"342532","description":"","caption":"","name":"how_geocoding_works_in_spark","status":"inherit","uploaded_to":2933391,"date":"2025-08-19 15:05:04","modified":"2025-08-19 15:05:18","menu_order":0,"mime_type":"image\/png","type":"image","subtype":"png","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":624,"height":362,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark-213x200.png","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","medium-width":450,"medium-height":261,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","medium_large-width":624,"medium_large-height":362,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","large-width":624,"large-height":362,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","1536x1536-width":624,"1536x1536-height":362,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","2048x2048-width":624,"2048x2048-height":362,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocodin
g_works_in_spark.png","card_image-width":624,"card_image-height":362,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/how_geocoding_works_in_spark.png","wide_image-width":624,"wide_image-height":362}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>When Spark processes a job, it breaks the work into smaller units called <strong>tasks<\/strong>. A task is the execution of an operation on a single chunk of data\u2014for example, geocoding a set of addresses. Each chunk of data is called a <strong>partition<\/strong>, which represents a subset of the input data\u2014such as a portion of a CSV or Parquet file.<\/p>\n<p>Spark distributes tasks across the worker nodes in a cluster. Each <strong>worker node<\/strong> runs processes called <strong>executors<\/strong>, which process tasks in parallel\u2014typically one task per core. This means that partitioning determines the number of tasks and, in turn, the level of parallelism your cluster can achieve.<\/p>\n<p>For more information on Spark architecture, see the Spark documentation about <a href=\"https:\/\/spark.apache.org\/docs\/latest\/cluster-overview.html\" target=\"_blank\" rel=\"noopener\">cluster architecture<\/a>.<\/p>\n"},{"acf_fc_layout":"content","content":"<h3>What this means for geocoding<\/h3>\n<p>Spark will try to optimally partition the input data, but we recommend checking and enforcing partitioning to ensure that there are <strong>2-4 partitions per core<\/strong> in your cluster for the best geocoding performance.<\/p>\n<h3>How to Check and Change Partitions<\/h3>\n<ul>\n<li><strong>Checking Partitions<\/strong>: You can check the number of partitions in a DataFrame using the following code:<\/li>\n<\/ul>\n"},{"acf_fc_layout":"content","content":"<pre><code>\r\ndf.rdd.getNumPartitions()\r\n\r\n<\/code><\/pre>\n"},{"acf_fc_layout":"content","content":"<ul>\n<li><strong>Repartitioning DataFrames<\/strong>: 
Repartitioning can help balance the workload more efficiently across executors and workers. If you want to optimize partitioning, you can repartition your DataFrame using the <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/pyspark.sql\/api\/pyspark.sql.DataFrame.repartition.html\" target=\"_blank\" rel=\"noopener\">.repartition()<\/a> method. You can use <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/api\/pyspark.SparkContext.defaultParallelism.html\" target=\"_blank\" rel=\"noopener\">sparkContext.defaultParallelism<\/a> to estimate a suitable number of partitions.<\/li>\n<\/ul>\n"},{"acf_fc_layout":"content","content":"<pre><code>\r\n<span style=\"color: #6a737d\"># Get the current number of partitions in the DataFrame<\/span>\r\nnum_partitions = df.rdd.getNumPartitions()\r\n\r\n<span style=\"color: #6a737d\"># Estimate the number of available cores in the Spark cluster<\/span>\r\navailable_cores = spark.sparkContext.defaultParallelism\r\n\r\n<span style=\"color: #6a737d\"># Check if the number of partitions is less than twice the available cores<\/span>\r\n<span style=\"color: #d73a49\">if<\/span> num_partitions &lt; (available_cores * <span style=\"color: #005cc5\">2<\/span>):\r\n    <span style=\"color: #6a737d\"># Set the minimum target number of partitions to 2x the available cores<\/span>\r\n    min_target_partitions = available_cores * <span style=\"color: #005cc5\">2<\/span>\r\n\r\n    <span style=\"color: #6a737d\"># Repartition the DataFrame to improve parallelism<\/span>\r\n    df = df.repartition(min_target_partitions)\r\n\r\n\r\n<\/code><\/pre>\n"},{"acf_fc_layout":"content","content":"<h3>Warning: Available cores in an autoscaling cluster<\/h3>\n<p>Spark has built-in mechanisms for adjusting the number of executors and parallelism during a job. However, keep in mind that <strong>autoscaling<\/strong> may not always result in the most efficient partitioning strategy. 
<strong>Default parallelism<\/strong> is often set to the number of cores available across all executors when the Spark session is initialized.<\/p>\n<p>For example, if an autoscaling cluster starts with 3 nodes and later scales up to 10 nodes, <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/api\/pyspark.SparkContext.defaultParallelism.html\" target=\"_blank\" rel=\"noopener\">spark.sparkContext.defaultParallelism<\/a> will only return the number of cores available on the 3 nodes that were running when the Spark application was started.<\/p>\n<p>To avoid this, we recommend repartitioning your input data to <strong>2\u20134x the number of cores in the cluster when fully scaled<\/strong>. While this might introduce slight overhead when not scaled up, it ensures efficient resource utilization as the cluster reaches maximum capacity.<\/p>\n<p>For instance, in the example above with a maximum capacity of 10 worker nodes and 40 cores in total (each node has 4 cores), you should repartition the input DataFrame to at least 80 partitions (2 * 10 worker nodes * 4 cores\/node).<\/p>\n"},{"acf_fc_layout":"content","content":"<h2><strong>#2 Locator Proximity: Keeping Data Local<\/strong><\/h2>\n<p>When working with <a href=\"https:\/\/developers.arcgis.com\/geoanalytics\/core-concepts\/geocoding\/?aduc=PublicRelations&amp;sf_id=7015x000001RnGtAAK&amp;adut=8-2025&amp;aduco=three-ways-improve-geocoding&amp;aduca=CRAArcGISGeoAnalyticsEngine&amp;utm_id=7015x000001RnGtAAK&amp;adum=Blog&amp;utm_campaign=CRAArcGISGeoAnalyticsEngine&amp;utm_source=PublicRelations&amp;utm_medium=Blog\" target=\"_blank\" rel=\"noopener\">geocoding<\/a> in GeoAnalytics Engine, <strong>keeping locator files co-located with the Spark cluster<\/strong> is another performance factor to keep in mind. 
Ideally, geocoding tools should operate with locators stored locally, either on disk or in memory, to avoid the performance penalty of network calls.<\/p>\n<ul>\n<li><strong>Copy locator files from cloud storage to local disk using init scripts<\/strong>: You can copy the locator files to each worker node in your Spark cluster using an init script. This must be done when the Spark cluster is created.<\/li>\n<li><strong>addFile<\/strong>: You can also use the <a href=\"https:\/\/spark.apache.org\/docs\/latest\/api\/python\/reference\/api\/pyspark.SparkContext.addFile.html\" target=\"_blank\" rel=\"noopener\">sc.addFile<\/a> function to add the locator file to each worker after Spark is initialized.<\/li>\n<\/ul>\n<p>For more detailed examples, see the <a href=\"https:\/\/developers.arcgis.com\/geoanalytics\/install\/locator_network_dataset_setup\/?aduc=PublicRelations&amp;sf_id=7015x000001RnGtAAK&amp;adut=8-2025&amp;aduco=three-ways-improve-geocoding&amp;aduca=CRAArcGISGeoAnalyticsEngine&amp;utm_id=7015x000001RnGtAAK&amp;adum=Blog&amp;utm_campaign=CRAArcGISGeoAnalyticsEngine&amp;utm_source=PublicRelations&amp;utm_medium=Blog\" target=\"_blank\" rel=\"noopener\">GeoAnalytics documentation on Locator and Network Dataset Setup<\/a>.<\/p>\n"},{"acf_fc_layout":"content","content":"<h2><strong>#3 Worker Memory Size<\/strong><\/h2>\n<p>Another key factor in improving geocoding performance is optimizing the <strong>worker memory size<\/strong> relative to the size of the locator file. Since reading from the locator is often the bottleneck for a geocode operation, GeoAnalytics Engine tries to load the locator into memory using a process called <strong>memory mapping<\/strong>. 
This speeds up geocoding because reading from memory is much faster than reading from disk.<\/p>\n<p>To benefit from this optimization, each worker node needs enough available memory to hold the locator.<\/p>\n<h3>Suggested Memory Formula<\/h3>\n<p>For each worker, the recommended minimum memory size for geocoding operations is:<\/p>\n<p>Recommended Minimum Memory = Locator Size + (Locator Size * Number of Cores on each Worker \/ 20)<\/p>\n<p>For example, on Azure Databricks using the <a href=\"https:\/\/www.esri.com\/en-us\/arcgis\/products\/arcgis-streetmap-premium\/overview?aduc=PublicRelations&amp;sf_id=7015x000001RnGtAAK&amp;adut=8-2025&amp;aduco=three-ways-improve-geocoding&amp;aduca=CRAArcGISGeoAnalyticsEngine&amp;utm_id=7015x000001RnGtAAK&amp;adum=Blog&amp;utm_campaign=CRAArcGISGeoAnalyticsEngine&amp;utm_source=PublicRelations&amp;utm_medium=Blog\" target=\"_blank\" rel=\"noopener\">ArcGIS StreetMap Premium<\/a> North America Locator (which is around 20 GB), you should choose a 4-core worker node with at least 24 GB of memory (20 GB + 20 GB * 4 \/ 20 = 24 GB).<\/p>\n"},{"acf_fc_layout":"content","content":"<h2><strong>Other performance considerations<\/strong><\/h2>\n<p>In addition to tuning your Spark environment, also consider the cleanliness of your address data, the extent of your analysis, the level of resolution needed for the geocodes, and the format of your input data.<\/p>\n<p><strong>Address cleanliness.<\/strong> You will likely see a slowdown in performance with address strings that include misspellings, omit information (e.g., no street directionality like \u201cnortheast\u201d, or no street name suffix like \u201croad\u201d, \u201cstreet\u201d, or \u201cblvd\u201d), or use non-standard abbreviations for street names, states, or countries.<\/p>\n<p><strong>Extent of analysis.<\/strong> When selecting a locator, make sure its spatial extent covers that of your input addresses. 
For example, if your dataset primarily contains addresses in Boston, it&#8217;s best to use a locator focused on that region, such as a city- or state-scale locator.<\/p>\n<p>In some cases, a locator with a broader spatial extent will be needed, such as a U.S.-wide locator for a dataset mostly centered in Boston but containing a few outlier addresses elsewhere along the East Coast. However, keep in mind that using a locator with a significantly larger spatial extent than necessary may increase processing time, as it takes longer to search through the larger locator file to find matches.<\/p>\n<p><strong>Level of resolution.<\/strong> Geocoding will generally be faster for more generalized locations, such as returning a geocoded point location at the city or postal code level, than at the address \/ parcel level.<\/p>\n<p><strong>Input Data Formats.<\/strong> Consider storing large volumes of string-type addresses in columnar storage formats such as Parquet and ORC, as they support efficient compression and predicate pushdown, and generally offer significantly faster read performance compared to row-based formats like CSV or JSON. Additionally, large uncompressed text files can consume excessive network and disk resources, leading to increased I\/O overhead and slower processing in Spark.<\/p>\n"},{"acf_fc_layout":"content","content":"<h2><strong>Conclusion<\/strong><\/h2>\n<p>Improving geocoding performance in ArcGIS GeoAnalytics Engine requires a thoughtful approach to Spark configuration, data partitioning, memory management, and locator placement. Testing and iterating based on your data and infrastructure will yield the best results, but by applying the best practices outlined in this post (tuning partition counts, co-locating locator files, and ensuring sufficient memory on each worker), you can significantly enhance the speed and efficiency of your geocoding workflows. 
For instance, we tested a dataset of 125 million US building addresses based on the <a href=\"https:\/\/www.arcgis.com\/home\/item.html?id=bb69f10baf334d4c935a0fb23d758f38\" target=\"_blank\" rel=\"noopener\">Microsoft building footprint dataset<\/a> with various partitioning schemes. Without repartitioning, the Geocode tool took about 4.38 hours to finish on a cluster with 400 cores; with repartitioning, it took only 2.87 hours, about one-third less time. This adjustment within the GeoAnalytics Engine environment makes a significant difference in performance.<\/p>\n<p>We hope these details on geocoding have been helpful for your analytic workflows! We\u2019re excited to hear about what you\u2019re doing with geocoding tools in GeoAnalytics Engine. Please feel free to provide feedback or ask questions in the comment section below.<\/p>\n"}],"related_articles":[{"ID":2522242,"post_author":"342532","post_date":"2024-11-13 08:00:32","post_date_gmt":"2024-11-13 16:00:32","post_content":"","post_title":"How JAKALA leverages ArcGIS GeoAnalytics Engine to identify retail mobility patterns","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"how-jakala-leverages-arcgis-geoanalytics-engine-to-identify-retail-mobility-patterns","to_ping":"","pinged":"","post_modified":"2024-11-04 16:44:42","post_modified_gmt":"2024-11-05 00:44:42","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=2522242","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"},{"ID":1611342,"post_author":"6481","post_date":"2022-06-23 06:00:48","post_date_gmt":"2022-06-23 13:00:48","post_content":"","post_title":"ArcGIS GeoAnalytics Engine:  Big Data Gets a Spatial 
Upgrade","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"arcgis-geoanalytics-engine-big-data-gets-a-spatial-upgrade","to_ping":"","pinged":"","post_modified":"2024-01-03 12:57:01","post_modified_gmt":"2024-01-03 20:57:01","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=1611342","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"},{"ID":1789392,"post_author":"323652","post_date":"2022-12-08 16:05:57","post_date_gmt":"2022-12-09 00:05:57","post_content":"","post_title":"ArcGIS GeoAnalytics Engine in Databricks: Scalable Geospatial Analysis in a Data Science Workflow","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"arcgis-geoanalytics-engine-in-databricks-scalable-geospatial-analysis-in-a-data-science-workflow","to_ping":"","pinged":"","post_modified":"2022-12-15 10:21:36","post_modified_gmt":"2022-12-15 18:21:36","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=1789392","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"},{"ID":1662692,"post_author":"312882","post_date":"2022-07-13 13:27:44","post_date_gmt":"2022-07-13 20:27:44","post_content":"","post_title":"Big Data Analytics with Amazon EMR and Esri\u2019s ArcGIS GeoAnalytics Engine","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"big-data-analytics-with-amazon-emr-and-esris-arcgis-geoanalytics-engine","to_ping":"","pinged":"","post_modified":"2022-07-13 14:08:52","post_modified_gmt":"2022-07-13 
21:08:52","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=1662692","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"},{"ID":2789462,"post_author":"323502","post_date":"2025-05-20 08:03:29","post_date_gmt":"2025-05-20 15:03:29","post_content":"","post_title":"Solving Big Data Geoanalytics Challenges with ArcGIS GeoAnalytics Engine in AWS Glue","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"solving-big-data-geoanalytics-challenges-with-arcgis-geoanalytics-engine-in-aws-glue","to_ping":"","pinged":"","post_modified":"2025-05-16 13:53:45","post_modified_gmt":"2025-05-16 20:53:45","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=2789462","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"},{"ID":2796802,"post_author":"342532","post_date":"2025-06-24 07:12:46","post_date_gmt":"2025-06-24 14:12:46","post_content":"","post_title":"What\u2019s new in ArcGIS GeoAnalytics Engine 1.6","post_excerpt":"","post_status":"publish","comment_status":"open","ping_status":"closed","post_password":"","post_name":"whats-new-in-arcgis-geoanalytics-engine-1-6","to_ping":"","pinged":"","post_modified":"2025-06-24 08:49:06","post_modified_gmt":"2025-06-24 15:49:06","post_content_filtered":"","post_parent":0,"guid":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=2796802","menu_order":0,"post_type":"blog","post_mime_type":"","comment_count":"0","filter":"raw"}],"show_article_image":false,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/geocoding-search-small-banner.png","wide_image":false},"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Three Ways to Improve Geocoding 
Performance<\/title>\n<meta name=\"description\" content=\"Learn how to optimize your Apache Spark cluster to improve geocoding performance.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Three ways to improve geocoding performance\" \/>\n<meta property=\"og:description\" content=\"Learn how to optimize your Apache Spark cluster to improve geocoding performance.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\" \/>\n<meta property=\"og:site_name\" content=\"ArcGIS Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/esrigis\/\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@ESRI\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\"},\"author\":{\"name\":\"Sarah Battersby\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/5a84e904a4a87576be52b7d2c7dd5615\"},\"headline\":\"Three ways to improve geocoding performance\",\"datePublished\":\"2025-08-25T16:11:01+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\"},\"wordCount\":6,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"keywords\":[\"Apache Spark\",\"ArcGIS GeoAnalytics Engine\",\"Geocoding\"],\"articleSection\":[\"Analytics\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\",\"name\":\"Three Ways to Improve Geocoding Performance\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\"},\"datePublished\":\"2025-08-25T16:11:01+00:00\",\"description\":\"Learn how to optimize your Apache Spark cluster to improve 
geocoding performance.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.esri.com\/arcgis-blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Three ways to improve geocoding performance\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"name\":\"ArcGIS Blog\",\"description\":\"Get insider info from Esri product 
teams\",\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\",\"name\":\"Esri\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"width\":400,\"height\":400,\"caption\":\"Esri\"},\"image\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/esrigis\/\",\"https:\/\/x.com\/ESRI\",\"https:\/\/www.linkedin.com\/company\/5311\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/5a84e904a4a87576be52b7d2c7dd5615\",\"name\":\"Sarah Battersby\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/07\/Sarah_Battersby-213x200.png\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/07\/Sarah_Battersby-213x200.png\",\"caption\":\"Sarah Battersby\"},\"description\":\"Sarah is a Product Manager for ArcGIS GeoAnalytics Engine. 
She has a PhD in Geography \/ Cognitive Science from UC Santa Barbara, and enjoys finding ways to make spatial technologies easier to use, understand, and trust.\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/author\/sbattersby\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Three Ways to Improve Geocoding Performance","description":"Learn how to optimize your Apache Spark cluster to improve geocoding performance.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","og_locale":"en_US","og_type":"article","og_title":"Three ways to improve geocoding performance","og_description":"Learn how to optimize your Apache Spark cluster to improve geocoding performance.","og_url":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","og_site_name":"ArcGIS Blog","article_publisher":"https:\/\/www.facebook.com\/esrigis\/","twitter_card":"summary_large_image","twitter_site":"@ESRI","twitter_misc":{"Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#article","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance"},"author":{"name":"Sarah Battersby","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/5a84e904a4a87576be52b7d2c7dd5615"},"headline":"Three ways to improve geocoding performance","datePublished":"2025-08-25T16:11:01+00:00","mainEntityOfPage":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance"},"wordCount":6,"commentCount":0,"publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"keywords":["Apache Spark","ArcGIS GeoAnalytics Engine","Geocoding"],"articleSection":["Analytics"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","url":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","name":"Three Ways to Improve Geocoding Performance","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#website"},"datePublished":"2025-08-25T16:11:01+00:00","description":"Learn how to optimize your Apache Spark cluster to improve geocoding 
performance.","breadcrumb":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.esri.com\/arcgis-blog\/"},{"@type":"ListItem","position":2,"name":"Three ways to improve geocoding performance"}]},{"@type":"WebSite","@id":"https:\/\/www.esri.com\/arcgis-blog\/#website","url":"https:\/\/www.esri.com\/arcgis-blog\/","name":"ArcGIS Blog","description":"Get insider info from Esri product teams","publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization","name":"Esri","url":"https:\/\/www.esri.com\/arcgis-blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","width":400,"height":400,"caption":"Esri"},"image":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/esrigis\/","https:\/\/x.com\/ESRI","https:\/\/www.linkedin.com\/company\/5311\/"]},{"@type":"Person"
,"@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/5a84e904a4a87576be52b7d2c7dd5615","name":"Sarah Battersby","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/07\/Sarah_Battersby-213x200.png","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/07\/Sarah_Battersby-213x200.png","caption":"Sarah Battersby"},"description":"Sarah is a Product Manager for ArcGIS GeoAnalytics Engine. She has a PhD in Geography \/ Cognitive Science from UC Santa Barbara, and enjoys finding ways to make spatial technologies easier to use, understand, and trust.","url":"https:\/\/www.esri.com\/arcgis-blog\/author\/sbattersby"}]}},"text_date":"August 25, 2025","author_name":"Multiple Authors","author_page":"https:\/\/www.esri.com\/arcgis-blog\/products\/geoanalytics-engine\/analytics\/three-ways-to-improve-your-geocoding-performance","custom_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/Newsroom-Keyart-Wide-1920-x-1080.jpg","primary_product":"ArcGIS GeoAnalytics Engine","tag_data":[{"term_id":586931,"name":"Apache Spark","slug":"apache-spark","term_group":0,"term_taxonomy_id":586931,"taxonomy":"post_tag","description":"","parent":0,"count":4,"filter":"raw"},{"term_id":765912,"name":"ArcGIS GeoAnalytics Engine","slug":"arcgis-geoanalytics-engine","term_group":0,"term_taxonomy_id":765912,"taxonomy":"post_tag","description":"","parent":0,"count":8,"filter":"raw"},{"term_id":25091,"name":"Geocoding","slug":"geocoding","term_group":0,"term_taxonomy_id":25091,"taxonomy":"post_tag","description":"","parent":0,"count":30,"filter":"raw"}],"category_data":[{"term_id":23341,"name":"Analytics","slug":"analytics","term_group":0,"term_taxonomy_id":23341,"taxonomy":"category","description":"","parent":0,"count":1370,"filter":"raw"}],"product_data":[{"term_id":765842,"name":"ArcGIS GeoAnalytics 
Engine","slug":"geoanalytics-engine","term_group":0,"term_taxonomy_id":765842,"taxonomy":"product","description":"","parent":36601,"count":24,"filter":"raw"}],"primary_product_link":"https:\/\/www.esri.com\/arcgis-blog\/?s=#&products=geoanalytics-engine","_links":{"self":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/2933391","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/users\/342532"}],"replies":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/comments?post=2933391"}],"version-history":[{"count":0,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/2933391\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/media?parent=2933391"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/categories?post=2933391"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/tags?post=2933391"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/industry?post=2933391"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/product?post=2933391"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}