If your ArcGIS systems deliver critical business capabilities, the speed at which information and capabilities reach end-users isn't merely a luxury; it's a necessity. Organizations must minimize network latency to improve their system's performance and, in turn, the efficiency of end users.
On-premises environments typically benefit from physically co-located infrastructure. However, as organizations continue to shift their systems to the cloud, they must take greater care to reduce network latency. To ensure their ArcGIS systems remain not only functional but also highly responsive when leveraging cloud environments, organizations must employ thoughtful design choices.
Understanding network latency and its impact
Network latency refers to the time it takes information to traverse a network. High network latency tends to produce longer delays for end-users, while low latency networks offer a more responsive experience. Therefore, to support productivity and achieve the desired operational outcomes, organizations must reduce network latency where possible.
Several factors influence network latency, including:
- Distance – how far a packet of information must travel
- Transmission – time to push bits onto the wire, depending on bandwidth and packet size
- Network architecture – number of hops and routing efficiency
- Processing delays – time it takes routers and switches to inspect and forward packets
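To make these factors concrete, here is a rough back-of-envelope model of how they combine. All figures here are illustrative assumptions for the sketch, not measurements from any real network:

```python
# Rough model of one-way network latency from the factors above.
# Every number here is an illustrative assumption, not a measurement.

def one_way_latency_ms(distance_km, packet_bytes, bandwidth_mbps,
                       hops, per_hop_delay_ms=0.05):
    """Estimate one-way latency as propagation + transmission + processing."""
    # Propagation: light in fiber travels at roughly 200,000 km/s,
    # so each kilometer adds about 0.005 ms.
    propagation = distance_km / 200_000 * 1000
    # Transmission: time to push the packet's bits onto the wire.
    transmission = (packet_bytes * 8) / (bandwidth_mbps * 1_000_000) * 1000
    # Processing: a small assumed per-hop delay at each router/switch.
    processing = hops * per_hop_delay_ms
    return propagation + transmission + processing

# Same availability zone: short distance, few hops.
same_az = one_way_latency_ms(distance_km=2, packet_bytes=1500,
                             bandwidth_mbps=10_000, hops=2)
# Cross-region: thousands of kilometers, many more hops.
cross_region = one_way_latency_ms(distance_km=4000, packet_bytes=1500,
                                  bandwidth_mbps=10_000, hops=15)
print(f"same AZ: {same_az:.3f} ms, cross-region: {cross_region:.2f} ms")
```

Even with generous bandwidth, the distance term dominates once components sit in different regions, which is why the design guidance below starts with keeping components close together.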
Network latency can occur at any point in a system where distributed components exchange data. Like other high-volume transactional business systems, ArcGIS systems are significantly impacted by database latency. This is especially true during long transactions, like when editing geospatial data. As a result, organizations must deploy these system components in a way that reduces database latency. In the cloud, this means as close together and with the fewest hops possible.
How to design for low network latency in the cloud
1. Minimize distance network traffic travels:
Though the power of geography is great, it does not defy the fundamental speed limit of the universe 😉. Data cannot be exchanged instantaneously, so keeping system components close together reduces delays.
- Deploying resources in the same region and availability zone(s) means that they will be physically close together, often in the same data center, which reduces latency by reducing the distance data needs to travel.
2. Reduce number of network hops:
No one wants their much-anticipated packages rerouted through countless sorting facilities before reaching their doorstep. Likewise, your data should avoid unnecessary detours through multiple routers or switches.
- Subnetting is a method of grouping network endpoints that frequently communicate with each other. A subnet acts as a network inside a network to minimize unnecessary router hops and reduce network latency.
- VPC Peering is a low-latency, high-bandwidth connection between resources in different virtual networks. This way, resources in either VPC (virtual private cloud) can connect to resources in the peered virtual network.
- When you need to connect to on-premises resources for regulatory or other requirements, consider using cloud services like AWS Direct Connect, Azure ExpressRoute, GCP Interconnect to reduce latency.
➡️ Deploy all ArcGIS Enterprise components, desktops, and enterprise geodatabase(s) in the same region where possible, understanding that cross-region connectivity may introduce latency that will impact workflows.
➡️ Consider using VPC Peering when you can’t deploy resources that need to communicate with each other within the same network.
Quantifying the impact
We can take this a step further, and evaluate how much network latency can actually impact a real ArcGIS system. Network latency can occur at multiple tiers of an ArcGIS system, including:
- From the client to the application tier (ArcGIS Enterprise), especially when a VPN is involved
- Anywhere ArcGIS Enterprise components are separated
However, our testing specifically focused on measuring latency between ArcGIS Server and the enterprise geodatabase. This tier is a critical point of interaction where latency can directly affect editing performance, service responsiveness, and overall user experience. So, let's review test results that quantify how significantly these design decisions affected an ArcGIS system's behavior and user experience, comparing how two different high-level designs behave with the same data, workflows, and load:
1. Recommended, low network latency design:
- Same region, same VPC
- Database latency around 1 – 5 ms
2. High network latency design:
- Deployed across two availability zones
- Two VPCs with VPC Peering
- Measured round-trip time (RTT) using Microsoft's PsPing utility, showing 13+ ms of database latency throughout the test
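If PsPing isn't available in your environment, a similar round-trip check can be sketched in Python by timing TCP connections to the database endpoint. The hostname and port in the commented example are placeholders, not real endpoints from our tests:

```python
# A quick TCP round-trip check, similar in spirit to PsPing's latency mode.
import socket
import statistics
import time

def tcp_rtt_ms(host, port, samples=5):
    """Time TCP connect handshakes to approximate network round-trip time."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection completes once the TCP handshake finishes.
        with socket.create_connection((host, port), timeout=5):
            pass
        times.append((time.perf_counter() - start) * 1000)
    return statistics.median(times)

# Example (hypothetical endpoint): median RTT to a PostgreSQL geodatabase.
# print(f"median RTT: {tcp_rtt_ms('gdb.example.internal', 5432):.1f} ms")
```

Point it at your geodatabase's listener port (for example, 5432 for PostgreSQL or 1433 for SQL Server) and compare the median against the 1–5 ms target for a co-located design.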
Test Results
We can evaluate the impacts of increased network latency across three lenses:
- Overall system resource utilization – how taxed each system component is
- ArcSOC usage – how busy service instances are
- User responsiveness – how long it takes users to perform their workflows
Ok, let’s dive in!
1. System resource utilization
In the graphs below, you can see the resource utilization across all system tiers for two test scenarios – the recommended, low database latency design (left) and the high database latency design (right). The dark orange lines represent CPU utilization, which is low across all servers in both tests. The significant difference between these two tests is the blue line that represents concurrent editing requests:
- When there is low database latency, requests are opening and closing steadily.
- In the high database latency scenario, open editing requests ramp up continually until the load begins to drop.
Normally we see end-user responsiveness slow down as a result of high resource utilization. But in this case, the system’s resources are not taxed heavily because requests aren’t closing. This can be traced to the increased network latency, which caused each request to take longer to complete.
As a result, the test ran 31 minutes beyond the scheduled end, with several editing workflows still in progress. In this case, the additional database latency resulted in a loss of 25 minutes of efficiency as compared to the low-latency design.
2. ArcSOC usage
An ArcSOC is a server process that runs a single service instance to handle requests for a published service from users. The low-latency results (left) show ArcSOC usage to be relatively low, with most busy instances on the hosting servers at 6 or below and most busy instances on the UN servers at 9 or below. However, the high-latency results (right) paint a different picture. Here, the UN server ArcSOCs show maximum utilization throughout most of the test. We know from the results above that server resources are not being taxed. Therefore, we can attribute the busy instances to ArcSOCs waiting for responses from the database.
3. User responsiveness & workflow duration
The low-latency design (top chart) shows workflow times typical for our documented gas network information management system. The high-latency design (bottom) shows greatly increased workflow times, and worse end-user efficiency. Since the only difference is that the database is hosted in a different region, we can attribute the loss of user efficiency in this case to the additional 13+ ms of database latency.
Scaling ArcSOC instances to offset network latency impact
From the test results we just looked at, we can see that ArcSOCs were saturated while the system was generally underutilized. This made us wonder: if our system can support additional service instances, might that improve the perceived latency for end-users?
To test this, we doubled the number of ArcSOCs, leaving the instance resources the same. The chart below shows the resulting workflow times under three scenarios:
- low database latency
- high database latency with a 1:1 ArcSOC to vCPU ratio (our original high-latency configuration we evaluated previously)
- high database latency with a 2:1 ArcSOC to vCPU ratio (doubling the number of ArcSOCs while keeping the instance resources the same)
We've already seen in the previous user responsiveness tests that the high-latency editing workflows took at least twice as long as the low-latency workflows. However, you can see that when we doubled the number of map service instances in the high-latency design, those times improved significantly – by an average of 45%. This is because more ArcSOCs are available to process requests concurrently. However, those requests are still waiting on responses from the database (a consequence of the high network latency), so workflow execution times remain significantly longer than in the low-latency configuration.
So, if you're in a situation where you can't feasibly reduce database latency, you can try scaling ArcSOC instances to help offset some of the negative impacts. This potential workaround requires proper system monitoring to verify there are enough server resources to handle the increase in ArcSOCs. Further, this will only help increase perceived performance and user efficiency if your service instances are saturated (if they are all in use). While not exactly solving the latency problem, this test does highlight the value of system monitoring and map service instance tuning to improve performance.
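A rough queueing-style sketch helps show why this works: when every instance is busy, throughput is bounded by the instance count divided by the per-request service time. The service times below are illustrative assumptions, not figures from our tests:

```python
# Back-of-envelope view of why adding ArcSOCs helps when instances are
# saturated but CPUs are idle. Service times are illustrative assumptions.

def max_throughput_rps(instances, service_time_ms):
    """Each busy instance completes 1000 / service_time requests per second."""
    return instances * 1000 / service_time_ms

# Suppose added database latency stretches service time from 50 ms to 120 ms.
low_latency  = max_throughput_rps(instances=8,  service_time_ms=50)   # 160 rps
high_latency = max_throughput_rps(instances=8,  service_time_ms=120)  # ~67 rps
scaled       = max_throughput_rps(instances=16, service_time_ms=120)  # ~133 rps
```

Doubling the instances doubles the throughput ceiling, but because each request still spends extra time waiting on the database, the scaled configuration never catches up to the low-latency one – consistent with the partial improvement we measured.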
Key takeaways
Now that you understand more about network latency and its impact on ArcGIS systems, you can take steps to reduce it through improved design, observation, and tuning to get better results for your organization.
- In our tests, workflow execution times increased between 100% and 550% with the high database latency design compared to the low-latency system.
- This translates to significant waste of staff time and opportunity costs to the organization.
- Monitor your network performance, including its impact on end-user experience.
- Review your network configuration and minimize components that are introducing latency. Firewalls, routers, and other network features can impact round-trip time. Verify that network components aren’t interfering with your bandwidth.
- Keep your ArcGIS system components as close together as possible – at least in the same region.
- When it’s not feasible to reduce network latency, evaluate whether tuning your services configuration might improve perceived performance.
➡️ For more recommendations on optimizing ArcGIS systems, check out the blogs below!
Tackling Network Latency at the Client Tier of ArcGIS Systems
Seven Ways You Should Be Using Test Studies to Support Your ArcGIS Systems