In today’s always-on environment, organizations rely on ArcGIS Enterprise on Kubernetes to power critical, public-facing applications that can experience sudden and unpredictable traffic surges. As administrators, our mission is clear: keep the system highly available, scalable, and performant, no matter what the workload throws at us.
In this year’s Developer and Technology Summit plenary, Andrew Sakowicz demonstrates how to design, load-test, and monitor a scalable system. He also gives a sneak peek at a new capability: Intelligent Sizing Advisor.
Scalability
Andrew showcases a fire warning application. Traffic for this application can spike dramatically, even going viral; during these moments, the system must scale rapidly to maintain performance.
The key to doing this effectively lies in two components:
- Pods – the smallest deployable compute units
- Nodes – the actual machines hosting those pods
Andrew opens ArcGIS Manager and accesses the fire warning service’s settings. He enables Auto-scaling, which allows the Horizontal Pod Autoscaler (HPA) to add pod replicas whenever average CPU usage exceeds 50%, scaling from a minimum of 2 replicas (kept as standby capacity) up to a maximum of 50.
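Under the hood, these settings correspond to a standard Kubernetes HorizontalPodAutoscaler resource. The sketch below is illustrative only; the deployment name and namespace are hypothetical, and ArcGIS Manager generates the actual resource for you:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fire-warning-service   # hypothetical service name
  namespace: arcgis            # hypothetical namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fire-warning-service
  minReplicas: 2               # standby capacity
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # scale out above 50% average CPU
```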
An administrator can configure the system to deploy up to 100 replicas of a service, each requesting at least 1 CPU core. In Andrew’s case, this translates to around 100 cores, requiring approximately 13 additional nodes. Therefore, the system must also be able to scale the number of nodes in a timely manner.
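The arithmetic behind that estimate is straightforward; the node size below (8 usable cores per node) is an illustrative assumption, since the actual figure depends on the instance type:

```python
import math

replicas = 100          # maximum configured replicas
cpu_per_replica = 1.0   # CPU cores requested by each replica
cores_per_node = 8.0    # assumed usable cores per node (illustrative)

total_cores = replicas * cpu_per_replica              # 100 cores
nodes_needed = math.ceil(total_cores / cores_per_node)
print(nodes_needed)     # -> 13 nodes at 8 usable cores each
```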
To do this, Andrew uses Karpenter, a high-performance node autoscaler built for Kubernetes. In addition to quick scaling, Karpenter offers intelligent provisioning and supports creating workload-specific, dedicated node pools for services with volatile traffic patterns. By isolating these workloads:
- Resource contention is minimized
- The rest of the cluster remains unaffected
- Performance stays predictable
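A dedicated node pool of this kind can be expressed as a Karpenter NodePool resource. The sketch below is a minimal example under assumed names: the pool name, the `workload` label/taint, the EC2NodeClass reference, and the CPU limit are all hypothetical, not the demo’s actual configuration:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: high-traffic           # hypothetical pool name
spec:
  template:
    metadata:
      labels:
        workload: high-traffic
    spec:
      taints:
        - key: workload
          value: high-traffic
          effect: NoSchedule   # only pods tolerating this taint land here
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: "128"                 # cap the pool's total provisioned CPU
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
```

The taint is what enforces isolation: only pods that explicitly tolerate it are scheduled onto these nodes, keeping the volatile workload away from the rest of the cluster.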
After defining the pool, Andrew assigns the high-traffic service to that pool directly in ArcGIS Manager.
Observability
To confirm that auto-scaling is working as intended and not negatively impacting the system, administrators need to be able to gather the relevant data and observe changes as they happen. For this purpose, Andrew creates an observability dashboard powered by:
- ArcGIS Enterprise metrics, integrated alongside Kubernetes cluster utilization metrics
- Prometheus for metrics storage
- Grafana for visualization
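Grafana panels on a dashboard like this are typically backed by PromQL queries. The examples below are generic sketches built on standard kube-state-metrics and cAdvisor metrics; the `arcgis` namespace and the `workload` node label are assumptions, and ArcGIS Enterprise’s own service metrics will have different names:

```promql
# Pod replicas currently running in the deployment's namespace
count(kube_pod_status_phase{namespace="arcgis", phase="Running"})

# Average CPU utilization relative to requests across those pods
sum(rate(container_cpu_usage_seconds_total{namespace="arcgis"}[5m]))
  / sum(kube_pod_container_resource_requests{namespace="arcgis", resource="cpu"})

# Nodes in the dedicated pool (requires kube-state-metrics label export)
count(kube_node_labels{label_workload="high-traffic"})
```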
In this observability dashboard, Andrew can monitor how the system behaves under increasing demand, gaining clear insight into each stage of scaling and performance stabilization. Key aspects to monitor include:
- Load: As load increases, request throughput rises accordingly.
- Pod Scaling: Pods scale out rapidly as the Horizontal Pod Autoscaler responds to higher CPU usage.
- Node Scaling: Once existing nodes reach capacity, Karpenter provisions additional nodes within the dedicated pool.
- Performance: Performance stabilizes quickly, even when the system is under heavy load.
- Reliability and Errors: No errors appear during stress testing, confirming the system remains both stable and reliable.
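A minimal load-generation sketch illustrates the kind of stress test that produces these curves. To stay self-contained, it targets a throwaway local HTTP server rather than a real service endpoint; in practice you would point a dedicated load-testing tool at the service URL:

```python
import http.server
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Throwaway local server standing in for a service endpoint.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

def hit(_):
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status

# Fire 200 requests with 20 concurrent workers and count errors.
with ThreadPoolExecutor(max_workers=20) as pool:
    statuses = list(pool.map(hit, range(200)))

errors = sum(1 for s in statuses if s != 200)
print(f"requests={len(statuses)} errors={errors}")
server.shutdown()
```

Plotting request throughput and error counts from a run like this against the pod and node counts on the dashboard is what makes the scaling behavior visible end to end.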
Sneak Peek: Intelligent Sizing Advisor
Scaling a single service is one thing, but many organizations operate hundreds or thousands of services with unknown or inconsistent load patterns. Manually sizing each one is time-consuming and, at scale, nearly impossible.
For these organizations, Andrew previews the new Intelligent Sizing Advisor capability that continuously analyzes operational metrics across the deployment and automatically highlights where attention is needed.
The Advisor can identify:
- Severe issues, such as memory exhaustion or CPU throttling
- Warnings, like extended wait times indicating a need to scale up
- Information-level insights, such as unused replicas or minor adjustments
Administrators can choose to apply recommendations manually or allow the Advisor to apply them automatically, dramatically reducing operational burden.
Conclusion
In his demonstration, Andrew shows how ArcGIS Enterprise on Kubernetes can:
- Scale intelligently in response to sudden, viral workloads
- Maintain stable performance under pressure
- Provide full visibility through observability tools
- Reduce administrative overhead with upcoming intelligent sizing capabilities
Through smart application of the scalability and observability tools ArcGIS Enterprise on Kubernetes makes available, you can run a resilient, high-performance system that scales exactly when and how you need it.