ArcGIS Blog

Mar 12, 2026

Lessons in Observability from Parcel Management System Testing

By Sarah Scher and Raymond Bunn

Our latest Parcel Management System test study provides a concrete example of how an ArcGIS system’s observability – across virtual machines, services, and workflows – enables teams to tune performance, right‑size infrastructure, and make informed architectural decisions before problems appear in production.

In this blog, you’ll learn:

Why observability must be designed into ArcGIS systems from the start
What we monitor in our test studies to get insightful results
How to evaluate test results and translate them into actionable improvements

And if you’re looking to really dive in, check out the full test study here!

Observability by design

Observability is often treated as something you “add later” to help troubleshoot issues in production. In practice, however, observability should be factored in at the design phase. Ultimately, it determines how well you will be able to tune, scale, and troubleshoot your systems, and how early you can prevent issues.

In this test study, we intentionally built observability into the system from the start, so we could monitor different aspects of the system as workflows were executed under increasing load. We aimed not just to confirm that workflows completed successfully, but to understand how each tier of the system behaved as demand increased. This was especially true of the user experience at the client tier, which can be a challenge to monitor.

Learn more about observability in the ArcGIS Architecture Center!

Cross-system monitoring

To perform successful system validation and deliver meaningful results, we needed to be able to monitor the system’s behavior and capture telemetry.

We used ArcGIS Monitor and enterprise IT monitoring tools such as Windows Performance Monitor to track the system’s performance and capture telemetry on its behavior at three levels:

1. Machine-level metrics

At the infrastructure layer, we monitored system resources on each virtual machine hosting ArcGIS Enterprise components and our enterprise geodatabase, including:

CPU utilization
Memory usage
Disk I/O
Network activity

This level of visibility establishes a baseline understanding of whether our virtual machines are:

right-sized
over-utilized and at risk of contention, or
under-utilized and potentially overprovisioned
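As a rough illustration of how machine-level telemetry can feed this kind of assessment, here is a minimal Python sketch that labels a virtual machine from its CPU samples. The thresholds, machine names, and sample values are all hypothetical and are not taken from the test study; a real assessment would also weigh memory, disk I/O, and network activity.

```python
from statistics import mean

# Hypothetical thresholds; choose values that match your own capacity policy.
UNDER_UTILIZED_BELOW = 20.0  # average CPU % below which a VM may be overprovisioned
OVER_UTILIZED_ABOVE = 80.0   # peak CPU % above which contention becomes a risk


def classify_vm(cpu_samples: list[float]) -> str:
    """Label a VM from CPU utilization samples (percent) collected during a test run."""
    if not cpu_samples:
        raise ValueError("at least one sample is required")
    if max(cpu_samples) > OVER_UTILIZED_ABOVE:
        return "over-utilized"
    if mean(cpu_samples) < UNDER_UTILIZED_BELOW:
        return "under-utilized"
    return "right-sized"


# Made-up samples for illustration only, not results from the test study.
samples_by_vm = {
    "vm-a": [12.0, 15.5, 9.8, 14.1],
    "vm-b": [45.0, 62.3, 58.7, 51.2],
    "vm-c": [78.0, 91.4, 88.9, 83.0],
}
labels = {name: classify_vm(samples) for name, samples in samples_by_vm.items()}
print(labels)
```

Checking the peak before the average is a deliberate choice in this sketch: a machine that is idle most of the time but saturates under bursts is flagged as a contention risk rather than as a candidate for scaling down.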

In the graphic below, you can see these machine-level metrics across each tier of the system:

2. Services and ArcSOC utilization

At the services tier, we monitored:

ArcSOC usage on both the hosting server site and the GIS (parcel) server site using Soccer (ArcSOC Scanner)
service instance patterns under load

This layer of observability is critical for understanding whether:

service instances are well-configured
resources are evenly distributed
bottlenecks are emerging at the service level rather than the infrastructure level

In the graphic below, you can see our service instance utilization metrics, in particular how often ArcSOCs were busy across the test duration for both our hosting server (left) and parcel server (right).

3. Workflow execution and completion times

Although this step is often skipped, you should monitor workflow execution and user experience. The ultimate reason we monitor every other aspect of the system is to verify that our users can complete their work in a timely fashion. Not to mention, we want them to enjoy their experience using GIS capabilities!

In this test study, workflows were executed according to a pacing model that simulates how work is performed in a real parcel management organization, with load measured in operations per hour rather than number of users.
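To make the idea of pacing by operations per hour concrete, the sketch below converts an hourly rate into the interval between consecutive operations at each load multiplier. The design rate of 120 operations per hour is a hypothetical value chosen for easy arithmetic, not a figure from the test study.

```python
def pacing_interval_seconds(operations_per_hour: float, load_multiplier: float = 1.0) -> float:
    """Return the seconds between the start of consecutive operations at a given load."""
    rate = operations_per_hour * load_multiplier
    if rate <= 0:
        raise ValueError("the resulting rate must be positive")
    return 3600.0 / rate


DESIGN_OPS_PER_HOUR = 120  # hypothetical design load, for illustration only

for multiplier in (1, 4, 8):
    interval = pacing_interval_seconds(DESIGN_OPS_PER_HOUR, multiplier)
    print(f"{multiplier}x design load: one operation every {interval:.2f} s")
```

Expressing load this way keeps the test tied to the amount of work the organization performs, regardless of how many people are logged in at once.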

By monitoring workflow completion times alongside system metrics, we were able to correlate:

infrastructure behavior
service utilization
workflow execution times

against a baseline to make an overall assessment of whether users have a responsive experience and can do their work efficiently.

In the chart below, you can see a summary of workflow execution times across all of our tested workflows under different load scenarios (design load, 4x design load, and 8x design load). If workflow times increased as the load increased, this would tell us that the additional requests were likely straining the system and creating longer service wait times than expected. In this case, there is a negligible change in workflow times as the load increases.
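One simple way to quantify this comparison is to express each scenario's workflow time as a percentage change from the design-load baseline. The sketch below uses hypothetical timings and a hypothetical 5% tolerance for what counts as negligible; neither comes from the test study.

```python
def percent_change(baseline: float, observed: float) -> float:
    """Return the change from baseline to observed as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) / baseline * 100.0


# Hypothetical average workflow times in seconds, for illustration only.
workflow_seconds = {"design": 40.0, "4x": 40.8, "8x": 41.0}
NEGLIGIBLE_PERCENT = 5.0  # hypothetical tolerance; set this from your own requirements

baseline = workflow_seconds["design"]
changes = {
    scenario: percent_change(baseline, seconds)
    for scenario, seconds in workflow_seconds.items()
    if scenario != "design"
}
for scenario, change in changes.items():
    verdict = "negligible" if abs(change) <= NEGLIGIBLE_PERCENT else "investigate"
    print(f"{scenario}: {change:+.1f}% vs design load ({verdict})")
```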

Evaluating the results: what our telemetry revealed

For all of the effort spent designing for observability to pay off, you have to be ready to act on what the system tells you. With telemetry in place across all tiers, we focused on three questions while reviewing the telemetry data:

Where did system behavior diverge from our expectations?
Have we distributed resources optimally?
Which system components will start to strain under load first?

Opportunities for cost optimization

Because system performance remained stable while some resources were underutilized, the results indicate opportunities to scale down infrastructure without negatively impacting performance or user experience.

This is a critical point: observability is an investment that not only supports operational activities like troubleshooting, but also enables cost‑conscious architecture decisions. By understanding which components are under pressure and which are not, teams can right‑size their systems with confidence.

Identifying system bottlenecks

While the system’s telemetry data helps us identify where system components might be over-sized, it also helps us identify the system bottleneck. Every system has a bottleneck, even if it isn’t impacting the system’s performance (yet).

From our test results, we can see that the database was our bottleneck when we reached 8x the design load. In the graphic below, you will notice that it is the system component showing the highest utilization, and it would therefore likely be the first to reach its performance threshold.

In a real-world scenario, this is helpful to know because it lets us:

be more proactive in either reallocating system resources, or
know to monitor the database utilization more closely in case the system starts to reach that threshold

Reconfiguring services for better alignment

Monitoring and telemetry capture at the services level helps identify opportunities to reconfigure ArcSOC distribution so that it better aligns with actual workflow demand. Adjusting how service instances are allocated (rather than simply adding more infrastructure) can improve efficiency and resiliency while keeping costs in check.

For example, at 8x our design load, our test results show our read-only services on the hosting server (left) start to reach maximum utilization (16 active ArcSOCs), whereas our parcel server continues to show moderate ArcSOC utilization (roughly 8-10 active ArcSOCs out of the configured 16). Having this visibility gives us the information we need to reconfigure the distribution of service instances and give ourselves more buffer before users would experience longer service wait times.
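The headroom comparison described here can be sketched in a few lines of Python. The instance counts below are the approximate figures quoted above, taking the upper end of the parcel server's 8-10 range and assuming the hosting server's maximum is also 16; the function name and the saturation check are illustrative and are not part of ArcGIS Monitor or any Esri tool.

```python
def arcsoc_headroom(active: int, configured_max: int) -> tuple[int, float]:
    """Return (idle instances, utilization fraction) for one server site."""
    if configured_max <= 0 or not 0 <= active <= configured_max:
        raise ValueError("expected 0 <= active <= configured_max and configured_max > 0")
    return configured_max - active, active / configured_max


# (active ArcSOCs at 8x design load, configured maximum), approximated from the text.
sites = {
    "hosting server": (16, 16),
    "parcel server": (10, 16),
}

for name, (active, configured_max) in sites.items():
    idle, utilization = arcsoc_headroom(active, configured_max)
    status = "saturated" if idle == 0 else "has headroom"
    print(f"{name}: {utilization:.0%} busy, {idle} idle instances ({status})")
```

A site with no idle instances is where additional requests would begin to queue, which is why it is the natural candidate for receiving instances from a site with spare capacity.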

Closing thoughts

In addition to providing a guide for implementing a Parcel Management System, this test study also demonstrates how observability enables intentional decision‑making for ArcGIS systems.

By monitoring:

Machine‑level resources
Service‑level behavior
Workflow‑level outcomes

teams gain the insight needed to:

Right‑size infrastructure
Optimize costs
Reconfigure services
Validate design assumptions
Reduce risk before production deployment

Just as importantly, these practices apply beyond testing environments like ours. In production, ongoing observability supports day‑to‑day operations, safer upgrades, faster troubleshooting, and informed long‑term system evolution.

Rather than reacting to issues after they surface, you can be well-equipped to anticipate constraints, understand system behavior, and make confident decisions as systems grow and change.

Do you have ideas for how we can improve our resources in the future? Please share your thoughts with us!

➡️ You can also find our full catalog of test studies and blogs here

➡️ If you have questions or want to keep the conversation going, consider joining our LinkedIn group

Sarah Scher

Sarah is an Architect at Esri focused on systems innovation. She's passionate about making tech topics easier to understand and apply, helping organizations advance their business goals by strategically leveraging Geography and GIS. In her free time, she likes to play map-based strategy games, go on adventures with her dog Willow, and prepare for her Starfleet entrance exams.

Raymond Bunn

Ray is an Architect with the Technology and Innovation Team. He leads an initiative to design and test ArcGIS systems with a focus on user experience.