ArcGIS Knowledge supports synchronization (sync) in ArcGIS Enterprise 12.1 to enable offline workflows, distributed analysis, and efficient data movement across ArcGIS Enterprise deployments. Other ArcGIS Knowledge features released with ArcGIS Enterprise 12.1 are covered in a separate blog post. While many GIS analysts and administrators are already familiar with sync as an ArcGIS platform capability, its performance characteristics differ when applied to knowledge graphs.
Unlike traditional feature or table-based services, knowledge graph sync operates on entities, relationships, and graph metadata, which increases both analytical flexibility and synchronization cost.
Sync for knowledge graph services is designed to support the following:
- Incremental synchronization using change-based (delta) operations rather than full graph reloads
- Offline and low-connectivity workflows for Enterprise deployments
- Graph-centric analytics, in which the knowledge graph serves as a centralized analytical structure synchronized from external systems of record
The best practices described in this article can optimize knowledge graph service sync performance and streamline workflows.
Core performance principle: Small, incremental sync operations
Performance in knowledge graph sync is primarily driven by the amount of change included in each synchronization operation. Rather than invalidating and reloading full graph content, knowledge graph sync is designed to do the following:
- Track changes since the last successful synchronization
- Transfer only modified entities, relationships, and associated metadata
- Reduce network transfer, serialization overhead, and client-side processing
Best practices include the following:
- Perform frequent, incremental synchronization operations.
- Avoid long gaps between syncs that accumulate large change sets.
- Treat change-based deltas, not full graphs, as the operational unit of sync.
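The change-based approach described in this section can be illustrated with a minimal sketch. It is a toy in-memory model, not the ArcGIS Knowledge implementation: the `ChangeLog` class, the version counter, and the element identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    version: int     # hypothetical monotonically increasing change counter
    element_id: str  # identifier of an entity or relationship
    op: str          # "insert", "update", or "delete"


@dataclass
class ChangeLog:
    """Toy server-side change log (illustrative only)."""
    changes: list = field(default_factory=list)
    version: int = 0

    def record(self, element_id, op):
        self.version += 1
        self.changes.append(Change(self.version, element_id, op))

    def delta_since(self, last_synced_version):
        """Return only the changes made after the last successful sync."""
        return [c for c in self.changes if c.version > last_synced_version]


log = ChangeLog()
for i in range(1000):
    log.record(f"entity-{i}", "insert")

# The client syncs after the initial load and remembers the version it reached.
last_synced = log.version

# A few edits occur before the next sync.
log.record("entity-5", "update")
log.record("entity-9", "delete")
log.record("rel-1", "insert")

delta = log.delta_since(last_synced)
print(len(delta), "changes to transfer instead of", len(log.changes))
```

In this model the cost of a sync grows with the number of changes recorded since `last_synced`, which is why frequent syncs keep each delta small.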
Attribute filters, indexing, and delete-heavy workloads
Replica definitions may include attribute filters that define which entities and relationships are included in synchronization. These filters directly affect delta generation performance. The effect is especially pronounced for relationships, because the relationships and their entity endpoints must still be evaluated against the replica filters during synchronization.
Careful selection and indexing of attribute filters are essential for maintaining predictable sync performance in graphs with frequent updates or deletions.
Best practices include the following:
- Index attribute predicates used in replica definitions, where possible, to avoid full scans during delta computation.
- Choose attribute filters with particular care when many deletes have occurred since the last synchronization, because the affected elements must still be evaluated against the replica filters.
This guidance aligns with existing ArcGIS Enterprise sync principles but is more critical for graph workloads due to relationship traversal and metadata coupling.
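The effect of indexing a filter predicate can be sketched with a toy model. Everything here is an assumption made for illustration: the `region` attribute, the table layout, and in particular the rule that a relationship belongs to the replica only when both of its endpoints satisfy the filter. The sketch does not describe how the knowledge graph service evaluates filters internally.

```python
# Toy entity table; "region" is a hypothetical attribute used by the replica filter.
rows = [{"id": f"e{i}", "region": "north" if i % 4 == 0 else "south"}
        for i in range(1000)]

# Relationships changed since the last sync: (relationship id, origin id, destination id).
changed = [(f"r{i}", f"e{i}", f"e{i + 4}") for i in range(0, 200, 2)]


def matches_by_scan(entity_id, region):
    """Unindexed check: walk the table until the entity is found."""
    inspected = 0
    for row in rows:
        inspected += 1
        if row["id"] == entity_id:
            return row["region"] == region, inspected
    return False, inspected


def delta_with_scan(region):
    kept, inspected = [], 0
    for rel_id, origin, destination in changed:
        origin_ok, n1 = matches_by_scan(origin, region)
        destination_ok, n2 = matches_by_scan(destination, region)
        inspected += n1 + n2
        if origin_ok and destination_ok:
            kept.append(rel_id)
    return kept, inspected


def delta_with_index(region):
    # Built once here for brevity; a real index would be maintained as data changes.
    matching_ids = {row["id"] for row in rows if row["region"] == region}
    kept = [rel_id for rel_id, origin, destination in changed
            if origin in matching_ids and destination in matching_ids]
    return kept, 2 * len(changed)  # at most two set lookups per relationship


scan_result, rows_inspected = delta_with_scan("north")
index_result, lookups = delta_with_index("north")
print(len(index_result), "relationships in the delta")
print(rows_inspected, "rows inspected by scanning vs at most", lookups, "index lookups")
```

Both functions return the same relationships; only the amount of work differs. Building the index has its own cost, so the benefit comes from reusing it across many endpoint checks and many syncs.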
Provenance: Contextual metadata and performance trade-offs
In a knowledge graph service, provenance refers to metadata that captures the source of graph data. Provenance may include information about source systems, data loading workflows, or contextual attributes that explain how an entity or relationship was derived.
Provenance is commonly used in investigative and analytical workflows where data trust, traceability, and auditability are required. Enabling provenance introduces additional metadata that can be stored alongside entities and relationships, tracked for change detection, and, optionally, included in replica definitions and synchronization payloads.
The key consideration is whether provenance is included in the replica definition. When provenance options are enabled for a replica, the knowledge graph service must generate and evaluate provenance deltas during the synchronize operation, which incurs additional processing cost.
Best practices include the following:
- Enable provenance only when lineage or source attribution is required.
- Avoid enabling provenance by default for large or performance-sensitive graphs.
- Plan for increased sync duration and payload size when provenance is included in the replica definition.
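The payload trade-off can be approximated with a rough sketch that serializes the same set of changes with and without provenance records. The payload layout and the field names (`source_name`, `source_type`, `comment`) are hypothetical and are not the service's wire format; the point is only that each provenance record adds data that must be generated and transferred.

```python
import json

# One hundred changed entities (illustrative data).
changes = [
    {"id": f"e{i}", "op": "update", "properties": {"name": f"Asset {i}"}}
    for i in range(100)
]

# Hypothetical provenance records, one per changed entity.
provenance = {
    f"e{i}": {
        "source_name": "inspection_db",
        "source_type": "database",
        "comment": "nightly load",
    }
    for i in range(100)
}


def build_payload(changes, include_provenance):
    """Serialize a delta, optionally attaching provenance for each changed element."""
    payload = {"changes": changes}
    if include_provenance:
        payload["provenance"] = [
            {"element_id": c["id"], **provenance[c["id"]]} for c in changes
        ]
    return json.dumps(payload).encode("utf-8")


without = build_payload(changes, include_provenance=False)
with_provenance = build_payload(changes, include_provenance=True)
print(len(without), "bytes without provenance")
print(len(with_provenance), "bytes with provenance")
```

Actual sizes depend on how much provenance is recorded per element, so it is worth measuring with representative data before enabling provenance on a large replica.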
Connected documents: Expanding the graph surface area
Connected documents associate external or unstructured content, such as reports or documents, with entities in a knowledge graph. These associations are represented as graph relationships.
Connected documents provide valuable contextual information for analysts but expand the overall graph structure. From a synchronization perspective, connected documents are modeled as graph elements, not lightweight references, and their relationships must be evaluated and tracked during synchronization. As a result, including document relationships increases replica scope even when document content itself is unchanged.
Best practices include the following:
- Include connected documents only in replicas that explicitly require document context.
- Exclude document-related relationships from replicas used for structural, spatial, or operational analysis.
- Separate document-heavy analytical graphs from performance-focused operational graphs.
Relationship tracking in replica definitions
In knowledge graphs, relationships are first-class objects. Each relationship type included in a replica definition contributes to change detection complexity, replica size, and synchronization run time. As a result, large or unbounded relationship sets can significantly increase synchronization cost, even when entity counts remain stable.
Best practices include the following:
- Limit the number of relationship types included in replica definitions.
- Include only relationships required for the intended analytical workflow.
- Avoid full-graph replicas unless they are necessary.
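The scoping guidance above can be sketched as a comparison between a full-graph replica and a purpose-built one. The `ReplicaScope` class and the type names (`Pipe`, `ConnectsTo`, `HasDocument`, `OwnedBy`) are hypothetical stand-ins chosen for illustration, not an ArcGIS API or a required data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaScope:
    """Hypothetical replica scope; field names are illustrative only."""
    entity_types: frozenset
    relationship_types: frozenset


def changes_in_scope(changes, scope):
    """Keep only changes whose type is tracked by the replica."""
    allowed = scope.entity_types | scope.relationship_types
    return [c for c in changes if c["type"] in allowed]


# Changes accumulated since the last sync (illustrative data).
changes = (
    [{"id": f"p{i}", "type": "Pipe"} for i in range(20)]
    + [{"id": f"c{i}", "type": "ConnectsTo"} for i in range(30)]
    + [{"id": f"d{i}", "type": "HasDocument"} for i in range(200)]
    + [{"id": f"o{i}", "type": "OwnedBy"} for i in range(50)]
)

full_graph = ReplicaScope(
    entity_types=frozenset({"Pipe"}),
    relationship_types=frozenset({"ConnectsTo", "HasDocument", "OwnedBy"}),
)
network_only = ReplicaScope(
    entity_types=frozenset({"Pipe"}),
    relationship_types=frozenset({"ConnectsTo"}),
)

print(len(changes_in_scope(changes, full_graph)))    # 300
print(len(changes_in_scope(changes, network_only)))  # 50
```

The entity count is identical in both replicas; the difference in delta size comes entirely from the relationship types that each replica tracks.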
Knowledge graph sync performance summary
| Do | Don’t |
| --- | --- |
| Use frequent, incremental sync operations. | Accumulate large changes between syncs. |
| Scope replicas to required entities and relationships. | Include entire graphs by default. |
| Enable provenance only when required. | Enable provenance for all graphs. |
| Separate document-heavy graphs from operational graphs. | Always include connected documents. |
| Design replicas for specific analytic purposes. | Use one replica definition for all use cases. |
Knowledge graph sync performance is primarily driven by scope control. Sync performs best with smaller, purpose-built replicas, limited relationship tracking, and intentional use of metadata such as provenance and connected documents.
Enterprise deployments benefit most when knowledge graph services are treated as analytical assets, synchronized intentionally rather than exhaustively.