The ability to consume data from cloud data warehouses is something many of you have been asking for. And we have heard you! Last November we released ArcGIS Pro 2.9 and ArcGIS Enterprise 10.9.1, both of which include support for cloud data warehouses.
You may have read about this earlier in Melissa and Diana’s Introducing Cloud Data Warehouse Support blog. We will build upon the topics they covered by taking a more in-depth look at how cloud data warehouses are now supported with ArcGIS Enterprise specifically.
What is a cloud data warehouse?
Cloud data warehouses are databases, offered as managed services in public clouds, that are designed to handle large volumes of structured data. They are being used by more and more organizations across many different industries. Advantages of using cloud data warehouses include:
- Low total cost of ownership
- Improved speed and performance
- Improved data access and integration
- Scalability and elasticity
Prior to this release, you could connect to cloud data warehouse data through ArcGIS Insights and ArcGIS Data Interoperability. With the ArcGIS 2021 Q4 releases, there is now support in ArcGIS Pro and ArcGIS Enterprise for three cloud data warehouses: Google BigQuery, Snowflake, and Amazon Redshift.
Cloud data warehouses and ArcGIS Pro
In case you missed what’s new in ArcGIS Pro 2.9, a connection can now be made directly to the cloud data warehouse of your choice. This means you can now take advantage of all the benefits of a cloud data warehouse, with the ability to access and visualize the data on a map and perform analysis. The data can be consumed directly from the cloud data warehouse, as a query layer, or with feature binning enabled.
Cloud data warehouses and ArcGIS Enterprise
Once you have your cloud data warehouse data configured how you want it in ArcGIS Pro, you can then share it as a map image layer to ArcGIS Enterprise. If a query layer was created in ArcGIS Pro, that query layer will be present in the map image layer in ArcGIS Enterprise. The same goes for feature binning; if feature binning was enabled on the data in ArcGIS Pro, feature binning will be present in the map image layer in ArcGIS Enterprise.
Prerequisites for sharing cloud data warehouse data
Cloud data warehouse data can be shared to ArcGIS Enterprise on Windows, Linux, and Kubernetes. Before sharing your data, the cloud data warehouse must be registered with ArcGIS Enterprise’s hosting server as a registered data store. Cloud data warehouses can only be registered with the hosting server; they can’t be registered with any other ArcGIS Server federated with the Enterprise portal.
Registering a cloud data warehouse is no different from registering any other type of database connection. For example, the workflow for registering BigQuery is the same as it would be for SQL Server. You obtain a .sde connection (created through ArcGIS Pro) and register the data store through ArcGIS Pro or the ArcGIS Enterprise portal. Read more about this workflow in the User-managed data stores in ArcGIS Enterprise section of the product documentation.
Publishing your cloud data warehouse data
Once registered, the cloud data warehouse data can be published to the hosting server as a map image layer. When sharing the map image layer from ArcGIS Pro, you will see that the service can be published in one of three ways:
- Accessing data directly
- Snapshot
- Materialized view
Now let’s get into what these three options mean exactly.
Accessing data directly
Sharing cloud data warehouse data by accessing data directly means that the published map image layer reads straight from the cloud data warehouse. This can be costly because every request made to the service includes an associated request to the cloud data warehouse. While this option may work for small datasets, it is not recommended unless you have a specific workflow that requires it.
Snapshot
Sharing cloud data warehouse data via snapshot means that when the map image layer is published, the data is copied into the ArcGIS Data Store’s relational data store. Now, just because this service is using the ArcGIS Data Store doesn’t mean the map image layer is a hosted service. It’s a referenced map service, no different from a map service that is published from a feature class in a folder or enterprise geodatabase. The only difference is that instead of referencing data in a folder or enterprise geodatabase, this specific service is referencing data in the ArcGIS Data Store.
Using a snapshot is more cost-effective than accessing the data directly. Instead of every request made to the service having an associated request to the cloud data warehouse, every request made to the service will have an associated request to the ArcGIS Data Store. Because requests are made to the ArcGIS Data Store, there may be better rendering performance.
Any updates made to the cloud data warehouse data will not be reflected automatically in the map image layer. For updates to be reflected in the map image layer, the snapshot must be manually updated. This can be done on demand through the Enterprise portal.
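To make the snapshot behavior concrete, here is a minimal Python sketch of the idea. It is a toy model only, not ArcGIS code: the `SnapshotLayer` class, its `query` and `refresh` methods, and the sample rows are all hypothetical names invented for illustration.

```python
import copy


class SnapshotLayer:
    """Toy stand-in (hypothetical) for a map image layer published from a snapshot."""

    def __init__(self, warehouse_rows):
        # Keep a handle to the live "warehouse" table (assumption: a plain list).
        self._warehouse_rows = warehouse_rows
        # The copy taken at publish time is what the layer actually serves.
        self._copied_rows = copy.deepcopy(warehouse_rows)

    def query(self):
        # Requests read the copy, never the warehouse itself.
        return list(self._copied_rows)

    def refresh(self):
        # A manual, on-demand update re-copies the current warehouse rows.
        self._copied_rows = copy.deepcopy(self._warehouse_rows)


warehouse = [{"id": 1, "name": "Station A"}]
layer = SnapshotLayer(warehouse)

# The warehouse changes after publishing...
warehouse.append({"id": 2, "name": "Station B"})
stale = layer.query()   # ...but the layer still serves the publish-time copy

layer.refresh()
fresh = layer.query()   # the new row appears only after the manual refresh
```

In this sketch the layer keeps returning the original row until `refresh()` is called, which mirrors the manual update step described above.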
Materialized view
Sharing cloud data warehouse data through a materialized view means the data will be accessed directly in the cloud data warehouse but, unlike accessing the data directly, it will only be updated periodically. The exact update interval will depend on how the query layer was configured in ArcGIS Pro.
Similar to a snapshot, using a materialized view is more cost-effective than accessing the data directly. Instead of every request made to the service having an associated request to the cloud data warehouse, every request made to the service will have an associated request to a cached query in the cloud data warehouse.
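The difference between the three options comes down to where each service request ends up. The Python sketch below tallies that under a deliberately simplified model. The function name, the `RequestTally` fields, and the assumption that publishing and each refresh cost exactly one full warehouse query are all hypothetical; real warehouse billing depends on the provider and on the data scanned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTally:
    full_warehouse_queries: int  # queries run against the source table
    cached_warehouse_reads: int  # reads of a cached (materialized) query result
    data_store_reads: int        # reads from the ArcGIS Data Store copy


def tally_requests(mode, service_requests, refreshes=0):
    """Tally where requests land for each sharing option (simplified model)."""
    if mode == "direct":
        # Every service request has an associated warehouse request.
        return RequestTally(service_requests, 0, 0)
    if mode == "snapshot":
        # Assumption: one query to copy at publish time, one per manual update.
        return RequestTally(1 + refreshes, 0, service_requests)
    if mode == "materialized_view":
        # Assumption: one query to build the view, one per periodic update.
        return RequestTally(1 + refreshes, service_requests, 0)
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("direct", "snapshot", "materialized_view"):
    print(mode, tally_requests(mode, service_requests=1000, refreshes=2))
```

Under these assumptions, 1,000 service requests mean 1,000 full warehouse queries when accessing data directly, but only a handful for a snapshot or a materialized view, with the remaining traffic going to the ArcGIS Data Store or to the cached query.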
Have you started using cloud data warehouses in your ArcGIS Enterprise workflows yet? Let us know if you have any feedback or questions below – we’d love to hear from you!