The geodatabase is the native data format for ArcGIS. It is a data storage container that defines how data is stored, accessed, and managed by ArcGIS. The term geodatabase combines geo (spatial data) with database (specifically a relational database management system or RDBMS). ArcGIS 9.2 has three types of geodatabases: Microsoft Access-based personal geodatabases, file geodatabases, and ArcSDE geodatabases.
Personal and file geodatabases are designed for single users and small projects. ArcSDE geodatabases are scalable and designed for larger-scale use, ranging from medium to enterprise-wide implementations. These geodatabases require ArcSDE technology and are available at three levels (in ascending order of capacity and functionality): personal geodatabase (ArcSDE Personal), workgroup geodatabase (ArcSDE Workgroup), and enterprise geodatabase (ArcSDE Enterprise). This article deals with ArcSDE enterprise geodatabases.
Understanding Enterprise Geodatabase Architecture
At a conceptual level, an enterprise geodatabase consists of a multitier architecture that implements advanced logic and behavior in the application tier (e.g., ArcGIS software) on top of a data storage tier (e.g., RDBMS software). The application tier can be further subdivided into two parts: ArcObjects and ArcSDE technology. The responsibility for managing geographic data in an enterprise geodatabase is shared between ArcGIS and whichever RDBMS is used.
On the data storage tier, RDBMS software provides a simple, formal data model for storing and managing information in tables. The schema of an enterprise geodatabase is persisted in the RDBMS as a collection of tables known as the ArcSDE Repository. Aspects related to data storage and retrieval are implemented as simple tables, and certain aspects of geographic data management, such as disk-based storage, definition of attribute types, query processing, and multiuser transaction processing, are executed by the RDBMS. IBM DB2, IBM Informix, Oracle, and Microsoft SQL Server platforms are currently supported by ArcGIS. At version 9.3, PostgreSQL will be supported.
ArcSDE technology forms the middle tier. Prior to ArcGIS 9.2, ArcSDE was a separate software product. At ArcGIS 9.2, ArcSDE was integrated into both ArcGIS Desktop and ArcGIS Server and is now formally known as ArcSDE technology. As the gateway between GIS clients and an RDBMS, ArcSDE serves spatial data and enables that data to be accessed and managed within an RDBMS. It is implemented as several components: a directory of executables, a set of tables and stored procedures in the database (i.e., the ArcSDE Repository), and an optional service. These components will be discussed in more detail.
ArcSDE technology provides fundamental capabilities that include
The upper level of the application tier, ArcObjects, implements geodatabase application logic. This set of platform-independent software components is written in C++ and provides services to support GIS applications as thick clients on the desktop and thin clients on the server. This technology component is built into GIS clients (e.g., ArcGIS Desktop) and implements more complex object behavior and integrity constraints on simple features, such as points, lines, and polygons, stored in an RDBMS. In other words, ArcObjects implements behavior on the feature geometries. Feature classes, feature datasets, raster catalogs, topologies, networks, and terrains are all examples of geospatial data elements within the geodatabase data model for which ArcObjects provides the application logic that implements GIS behavior on top of simple spatial features stored in an RDBMS.
The three enterprise geodatabase architectural tiers are defined at a conceptual level. To most end users, working with the architectural tiers of the enterprise geodatabase is an easy, transparent process. GIS managers or database administrators most likely work directly with these tiers only during the setup and configuration of an enterprise geodatabase or when performing maintenance.
Enterprise Geodatabase Capabilities
Designed for large-scale systems, the enterprise geodatabase can be scaled to any size, support any number of users, and run on computers of any size and configuration. It takes full advantage of the underlying RDBMS architecture to provide high performance and support for extremely large continuous GIS datasets. RDBMS functionality supports GIS data management for scalability, reliability, security, backup, and data integrity. In addition to supporting many users with concurrent access to the same data, an enterprise geodatabase can be integrated with an organization's existing IT systems.
Some of the aspects of ArcSDE technology that contribute to these capabilities include the following.
Versioning: With versioning, the ArcSDE geodatabase can manage and maintain multiple states while preserving integrity in the database. Versioning is the default ArcSDE geodatabase editing environment that explicitly records states (i.e., versions) of individual features and objects as they are modified, added, and/or retired. It enables multiple users to access and edit the same data simultaneously and provides long transaction support. Simple queries are used to view and work with any desired state for a particular point in time or see an individual user's current edits.
Using nonversioned editing is equivalent to a standard database transaction. The transaction is performed within the scope of an ArcMap edit session and the data source is directly edited. Nonversioned edit sessions do not store changes in other tables as versioned edit sessions do.
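The contrast between versioned and nonversioned editing can be illustrated with a toy model. The sketch below is not ArcSDE code: the class and method names are hypothetical, states are treated as a simple linear sequence, and the only thing taken from the description above is that versioned edits are recorded in separate tables (here, lists of adds and deletes tagged with a state number) while nonversioned edits change the data source directly.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVersionedTable:
    """Hypothetical in-memory stand-in for a table that supports both edit styles."""
    base: dict = field(default_factory=dict)     # object id -> attributes
    adds: list = field(default_factory=list)     # (state, object id, attributes)
    deletes: list = field(default_factory=list)  # (state, object id)
    next_state: int = 1

    def edit_nonversioned(self, oid, attrs):
        # Nonversioned edit: the data source itself is changed.
        self.base[oid] = attrs

    def edit_versioned(self, oid, attrs):
        # Versioned edit: the base table is untouched; the old row is retired
        # and the new row is added under a new state number.
        state = self.next_state
        self.next_state += 1
        self.deletes.append((state, oid))
        self.adds.append((state, oid, attrs))
        return state

    def view(self, as_of_state):
        # Rebuild the table as of a given state by replaying the recorded changes.
        rows = dict(self.base)
        for state in range(1, as_of_state + 1):
            for s, oid in self.deletes:
                if s == state:
                    rows.pop(oid, None)
            for s, oid, attrs in self.adds:
                if s == state:
                    rows[oid] = attrs
        return rows
```

Calling `view(0)` returns the base table as it was before any versioned edit, while `view(n)` shows the data as of state `n`, which mirrors the idea of querying for a desired state.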
With geodatabase replication, data is distributed across two or more geodatabases in a manner that allows them to synchronize any data changes that are made. It is built on top of the versioning environment and supports the full geodatabase data model including topologies and geometric networks. In this asynchronous model, the replication is loosely coupled. This means each replicated geodatabase can work independently and still synchronize changes with other replicated geodatabases.
Because geodatabase replication is implemented at the ArcObjects and ArcSDE technology tiers, the RDBMSs involved can be different. Geodatabase replication can be used in connected and disconnected environments and can also work with local geodatabase connections as well as geodataserver objects (through ArcGIS Server), which enables access to a geodatabase over the Internet.
When enabled for a dataset, historical archiving captures all data changes in the DEFAULT version of the enterprise geodatabase by preserving the transactional history as an additional archive class. ArcGIS applies transaction time when changes are saved or posted to the DEFAULT version to record the moment of change to the database.
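The idea of preserving transactional history with a transaction time can be sketched as follows. This is a simplified illustration, not the ArcGIS implementation: the function names and the `valid_from`/`valid_to` field names are assumptions chosen for the example, and the archive is modeled as a plain list of rows.

```python
from datetime import datetime

OPEN_END = datetime.max  # marks a row that is still the current representation


def record_change(archive, oid, attrs, moment):
    """Close the current row for oid (if any) and append the new representation."""
    for row in archive:
        if row["oid"] == oid and row["valid_to"] == OPEN_END:
            row["valid_to"] = moment
    if attrs is not None:  # attrs=None models a delete
        archive.append({"oid": oid, "attrs": attrs,
                        "valid_from": moment, "valid_to": OPEN_END})


def as_of(archive, moment):
    """Return the attributes of every object that existed at the given moment."""
    return {row["oid"]: row["attrs"] for row in archive
            if row["valid_from"] <= moment < row["valid_to"]}
```

Because old rows are closed rather than overwritten, a query for any past moment can still recover how the data looked at that time.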
Installing an Enterprise Geodatabase
The setup and configuration of a typical enterprise geodatabase has two stages. In the first stage, enterprise ArcSDE software is installed on the server. The second stage (i.e., ArcSDE postinstallation) consists of four steps.
On Windows operating systems, the ArcSDE postinstallation can be performed using a wizard. Alternatively, it can also be performed manually with ArcSDE commands. Additional parameters may also need to be specified during the postinstallation, depending on the RDBMS and operating system used. After the enterprise geodatabase has been created, database management tools can be used to create users, schemas, and indexes to customize the enterprise geodatabase.
Enterprise Geodatabase Components
A typical enterprise geodatabase installation has three main components: the ArcSDE home directory, the ArcSDE Repository, and the ArcSDE service.
The ArcSDE Home Directory
When the ArcSDE component of ArcGIS Server is installed on the server, this directory is created. It is referenced in the server operating system by an environment variable named %SDEHOME%. The directory contains the ArcSDE command line executables, ArcSDE configuration files, geocoding and language support files, log files (for troubleshooting ArcSDE server issues), help documentation, and some sample utilities.
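A script that needs to locate this directory can read the environment variable. The following minimal sketch assumes only that a variable named SDEHOME is set as described above; the function name `find_sde_home` is hypothetical, and nothing is assumed about the directory's internal layout.

```python
import os
from pathlib import Path


def find_sde_home(env=None):
    """Return the directory named by SDEHOME, or None if it is unset or missing."""
    env = os.environ if env is None else env
    value = env.get("SDEHOME")
    if not value:
        return None
    home = Path(value)
    # Only report the directory if it actually exists on this machine.
    return home if home.is_dir() else None
```

Passing a dictionary as `env` makes the function easy to try out without changing the real environment, for example `find_sde_home({"SDEHOME": "C:/arcsde"})` (an illustrative path).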
The ArcSDE command line executables are a collection of binary files that can be run at the command prompt by geodatabase administrators to create, configure, manage, and monitor both the enterprise geodatabase and ArcSDE service. ArcSDE command line executables include a set of commands for data import and export at the ArcSDE technology tier of the enterprise geodatabase.
The ArcSDE Repository
The internal system tables and stored procedures that are installed in the RDBMS during the ArcSDE postinstallation are owned and managed by the geodatabase administrative user created in the first step of the ArcSDE postinstallation. They are self-managed internally by both ArcGIS and the RDBMS via stored procedures and should not be edited manually.
ArcSDE Repository tables can be subdivided into ArcSDE system tables and geodatabase system tables (i.e., system tables prefixed with GDB_). ArcSDE system tables work at the ArcSDE technology tier level and contain basic metadata for ArcSDE, store feature geometry and raster data, and manage the versioning environment. The geodatabase system tables work at the ArcObjects tier level and store information on geodatabase behavior and functionality for topologies, networks, and domains. These two groups form the schema of the enterprise geodatabase.
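The split between the two groups can be expressed as a simple name check. The sketch below assumes its input contains only ArcSDE Repository table names and relies solely on the GDB_ prefix mentioned above; the function name and the sample table names in the usage note are illustrative and should be checked against an actual repository.

```python
def classify_repository_tables(table_names):
    """Split repository table names into geodatabase and ArcSDE system tables."""
    groups = {"geodatabase": [], "arcsde": []}
    for name in table_names:
        # Ignore an optional owner qualifier such as "SDE." and compare case-insensitively.
        bare = name.split(".")[-1].upper()
        key = "geodatabase" if bare.startswith("GDB_") else "arcsde"
        groups[key].append(name)
    return groups
```

For example, `classify_repository_tables(["SDE.LAYERS", "SDE.GDB_OBJECTCLASSES"])` places the first name in the `"arcsde"` group and the second in the `"geodatabase"` group.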
Enterprise geodatabase administrators should be familiar with the key ArcSDE Repository tables listed in Figure 2.
The ArcSDE Service
Also commonly called the giomgr process (an abbreviation for geographic input/output manager), the ArcSDE service is a persistent service on the ArcSDE server that is dependent on the RDBMS instance. The giomgr process supports application server connections to the enterprise geodatabase.
The ArcSDE service listens for incoming client connection requests on a dedicated port and helps enable clients to connect to the geodatabase. A typical enterprise geodatabase installation has one associated ArcSDE service; however, the ArcSDE service is not required if only direct connections are made to the enterprise geodatabase.
Types of Client Connections
Clients typically communicate with an enterprise geodatabase over a network using TCP/IP protocols and can connect to an enterprise geodatabase in two ways: using an application server connection or a direct connection.
Application Server Connection
This traditional client-connect method involves the ArcSDE service, which listens for client connection requests. When a client application, such as ArcGIS Desktop, requests a connection to the enterprise geodatabase, a gsrvr (an abbreviation for geographic server) process is launched by the ArcSDE service and provides a dedicated link between the client and the geodatabase. The ArcSDE service continues to listen for connection requests.
The connection to the geodatabase is based on the user name and password submitted. Dataset access depends on the permissions established for the user by the geodatabase administrator. The gsrvr process remains connected to the geodatabase until the client releases the connection by closing the application. This connection method is commonly called a three-tier connection because it involves the client application, the geodatabase, and the giomgr and gsrvr processes. In this method, most of the work is performed on the server.
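The listener-plus-dedicated-worker pattern described here can be mimicked with Python's standard library. This is only an analogy under stated assumptions: it is not how giomgr or gsrvr are implemented, it uses threads rather than separate processes, it performs no authentication, and the names `DedicatedHandler`, `start_listener`, and `ask` are hypothetical.

```python
import socket
import socketserver
import threading


class DedicatedHandler(socketserver.StreamRequestHandler):
    """Stands in for the dedicated per-client worker (a thread in this sketch)."""

    def handle(self):
        # Serve this one client until it closes the connection.
        for line in self.rfile:
            self.wfile.write(b"echo:" + line)


def start_listener(host="127.0.0.1", port=0):
    """Start a listener that hands each incoming client to its own handler."""
    server = socketserver.ThreadingTCPServer((host, port), DedicatedHandler)
    server.daemon_threads = True
    # The listener keeps accepting new connections in the background.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, text):
    """Connect as a client, send one line, and return the one-line reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(text.encode() + b"\n")
        return conn.makefile("rb").readline().decode().strip()
```

After `server = start_listener()`, a client can call `ask(server.server_address, "hello")`; the listener remains free to accept further clients while each handler serves its own connection. Call `server.shutdown()` and `server.server_close()` when finished.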
Direct Connection
With this method, clients connect directly to the enterprise geodatabase without using the ArcSDE service. Communication between the clients and geodatabase occurs via ArcSDE direct-connect drivers, located on the client side, not through the ArcSDE service. Client machines must be configured for network access.
ArcSDE direct-connect drivers are automatically installed for the whole ArcGIS product suite, the ArcView 3.x Database Access extension, ArcIMS, ArcInfo Workstation, and MapObjects 2. For custom applications built from the ArcSDE C API, the ArcSDE direct-connect drivers need to be enabled with the application to support this functionality.
Direct-connect drivers are built from the same software code used to build the ArcSDE service. The difference is that direct-connect drivers are built as dynamic-link library files and execute in the process space of the client application, whereas the ArcSDE service was built as an executable program that runs on the ArcSDE server.
With this connection method, commonly called a two-tier connection because it only involves the client application and the geodatabase, some of the work that would have occurred on the server with the application server connection is performed on the client.
To have the ArcSDE server handle the majority of the ArcSDE processing load, use application server connections. When the client machines have enough resources to handle some of the ArcSDE processing load, use direct connections. Direct connections may cause more network traffic. Both client connection methods can be supported for the same enterprise geodatabase in any combination and configuration.
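When auditing saved connection settings, it can help to tell the two connection types apart programmatically. The sketch below rests on an assumption not stated in this article: that direct connections are identified by an instance value beginning with `sde:` (for example `sde:sqlserver:myserver`), while application server connections name the ArcSDE service by port number or service name (5151 is a commonly used port). The function name is hypothetical, and the convention should be verified against the documentation for the ArcSDE release in use.

```python
def connection_type(instance):
    """Classify an instance value as a direct or an application server connection."""
    value = instance.strip().lower()
    # Assumed convention: direct-connect instance strings start with "sde:".
    if value.startswith("sde:"):
        return "direct (two-tier)"
    return "application server (three-tier)"
```

For example, `connection_type("5151")` is classified as an application server connection and `connection_type("sde:sqlserver:myserver")` as a direct connection.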
The enterprise geodatabase is the foundation for building a large-scale GIS with ArcGIS Server Enterprise. It uses a combination of ArcObjects, ArcSDE technology, and RDBMS software to define how data is stored, accessed, and managed by ArcGIS. Conceptually, it stores GIS data in a centralized location. However, it can be set up and configured for a variety of implementations.
The enterprise geodatabase can be used for applying sophisticated business rules and relationships to spatial data, defining advanced georelational models such as topologies and networks, and providing a multiuser access and editing environment. With these capabilities, the enterprise geodatabase spatial data can be leveraged to its full potential while maintaining a consistent, accurate GIS database.