The Geodatabase: Modeling and Managing Spatial Data
The geodatabase (GDB) is the common data storage and management framework for ArcGIS. Simply put, it is a container for spatial and attribute data. The geodatabase has been the primary data model for ArcGIS since the 8.0 release. The name combines geo (referring to spatial) with database, specifically a relational database management system (RDBMS). The term promotes the idea of having all GIS data stored uniformly in a central location for easy access and management.
The geodatabase supports all the different types of GIS data that can be used by ArcGIS, such as attribute data, geographic features, satellite and aerial images (raster data), CAD data, surface modeling or 3D data, utility and transportation network systems, GPS coordinates, and survey measurements. ArcGIS has a comprehensive suite of data conversion tools to easily migrate existing data into the geodatabase. By storing GIS data within a geodatabase, users can take advantage of its superior data management capabilities to leverage spatial information. This, in turn, can enhance and expand business and GIS application workflows.
The geodatabase is a more robust and extensible data model than shapefiles and coverages. While shapefiles and coverages are outstanding GIS data storage formats, they do not take advantage of the latest data storage technologies. The geodatabase is designed to make full use of the capabilities of ArcGIS Desktop and ArcGIS Server. It is not just another spatial data format that can be used by ArcGIS; it is an integral part of the ArcGIS system.
GIS Data Storage
Vector data is stored in the geodatabase as thematic layers called feature classes. A feature class is a collection of geographic features with the same geometry type, such as a point, line, or polygon; the same attributes; and the same coordinate system. Feature classes can be grouped together within a feature dataset (a collection of feature classes) to model geospatial relationships between them. Raster data is stored as raster datasets; each raster image is stored as its own thematic layer. Multiple rasters can be grouped into a raster catalog (a collection of raster data), or if the rasters are adjacent to each other, they can be mosaicked into a single raster dataset. Table 1 contains a list of all the different types of GIS data that can be stored in the geodatabase.
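The feature class definition above can be illustrated with a small sketch in plain Python. This is not the geodatabase's actual storage format or any ArcGIS API; the class names, field names, and the coordinate system identifier are hypothetical and chosen only to show how a shared geometry type, attribute schema, and coordinate system might constrain what a feature class accepts.

```python
from dataclasses import dataclass, field

# Geometry types named in the text; a real geodatabase supports more.
GEOMETRY_TYPES = {"point", "line", "polygon"}


@dataclass
class Feature:
    geometry_type: str   # "point", "line", or "polygon"
    attributes: dict     # attribute field name -> value


@dataclass
class FeatureClass:
    name: str
    geometry_type: str
    fields: tuple            # attribute field names shared by every feature
    coordinate_system: str   # stored once, so all features share it
    features: list = field(default_factory=list)

    def __post_init__(self):
        if self.geometry_type not in GEOMETRY_TYPES:
            raise ValueError(f"unknown geometry type: {self.geometry_type}")

    def add(self, feature: Feature) -> None:
        # Reject features whose geometry type or attribute set differs
        # from the one declared for the feature class.
        if feature.geometry_type != self.geometry_type:
            raise ValueError(f"{self.name} only accepts {self.geometry_type} features")
        if set(feature.attributes) != set(self.fields):
            raise ValueError(f"{self.name} requires exactly the fields {self.fields}")
        self.features.append(feature)


@dataclass
class FeatureDataset:
    """A named collection of feature classes."""
    name: str
    feature_classes: dict = field(default_factory=dict)

    def add(self, feature_class: FeatureClass) -> None:
        self.feature_classes[feature_class.name] = feature_class


# Hypothetical example data ("EPSG:4326" is used only as a sample identifier).
streets = FeatureClass("streets", "line", ("name", "lanes"), "EPSG:4326")
streets.add(Feature("line", {"name": "Main St", "lanes": 2}))

transportation = FeatureDataset("transportation")
transportation.add(streets)
```

In this sketch, adding a point feature to the streets feature class raises a `ValueError`, which mirrors the rule that a feature class holds a single geometry type.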
Modeling Geospatial Relationships
Storing GIS data in the geodatabase enables users to take advantage of its advanced data modeling properties. Complex business logic can be applied to GIS data to create more detailed and accurate spatial data models that represent real-world GIS application workflows. Examples include land parcel management; natural resources management; river and stream system modeling; utility network system modeling, such as gas, water, and sewage pipelines; and three-dimensional surface modeling of the landscape.
By storing feature classes within a feature dataset, geospatial relationships can be modeled between the feature classes, enabling more advanced GIS analysis. The more common types of geospatial relationship data structures in the geodatabase are
Additional business logic in the geodatabase, in the form of subtypes and attribute domains, can also be applied to GIS data. Subtypes enable categorization of data in a table or feature class. For example, the streets in a streets feature class could be categorized into three subtypes: local streets, collector streets, and arterial streets. Attribute domains are rules that describe the legal values of a field. Whenever a domain is associated with an attribute field, only the values defined by the domain are valid for the field. In other words, the field will not accept a value that is not in that domain. For example, a domain specifying that values from 10 to 50 (meters) are valid could be applied to a field describing survey length measurements. Any value outside that range is invalid for the field and would not be allowed. Both subtypes and attribute domains can be easily customized to meet the requirements of a user's specific business and GIS application workflows.
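The two examples in this paragraph can be sketched in plain Python. This is only an illustration of the rules, not the geodatabase's own mechanism: the class and function names are hypothetical, the integer subtype codes are an assumption, and the range is assumed to include both endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeDomain:
    """A rule describing the legal numeric values of a field."""
    name: str
    minimum: float
    maximum: float

    def is_valid(self, value: float) -> bool:
        # Assumption: both endpoints of the range are legal values.
        return self.minimum <= value <= self.maximum


# Survey length measurements must fall between 10 and 50 meters.
survey_length = RangeDomain("survey_length_m", 10, 50)

# Subtypes for a streets feature class; the integer codes are assumed.
STREET_SUBTYPES = {
    1: "local street",
    2: "collector street",
    3: "arterial street",
}


def subtype_name(code: int) -> str:
    """Return the subtype description, rejecting codes that are not defined."""
    if code not in STREET_SUBTYPES:
        raise ValueError(f"{code} is not a defined street subtype")
    return STREET_SUBTYPES[code]


def check_survey_length(value: float) -> float:
    """Accept the value only if the domain allows it."""
    if not survey_length.is_valid(value):
        raise ValueError(
            f"{value} is outside the {survey_length.minimum} to "
            f"{survey_length.maximum} meter range"
        )
    return value
```

Under these assumptions, `check_survey_length(25)` returns the value unchanged, while `check_survey_length(75)` raises a `ValueError`, which corresponds to the field refusing a value that is not in its domain.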
Collectively, these examples of business logic in the geodatabase help streamline data entry and ensure the integrity of a user's GIS data. Therefore, the geodatabase enables users to leverage and optimize their GIS data to its full potential and helps maintain a consistent, accurate repository of GIS data.
Types of Geodatabases
The geodatabase is designed to support both the individual GIS user and organizations of various types and sizes. Just like the ArcGIS system, the geodatabase architecture has been engineered to easily scale to meet the changing needs and requirements of diverse organizations. A user can start with a file geodatabase for an individual project and upgrade to a larger workgroup or enterprise geodatabase as the volume of GIS data increases or the project scope expands.
There are two main classes of geodatabase: multiuser and single user. As the names suggest, multiuser geodatabases are meant for medium to large organizations, while single-user geodatabases are intended for individual users.
Multiuser geodatabases use ArcSDE technology and are implemented on an RDBMS platform. [Note: Prior to ArcGIS 9.2, ArcSDE was a stand-alone software product. At the ArcGIS 9.2 release, ArcSDE was integrated into both ArcGIS Desktop and ArcGIS Server. ArcSDE technology manages spatial data in an RDBMS and enables it to be accessed by ArcGIS clients.] Supported RDBMS platforms include DB2, Informix, Oracle, PostgreSQL, and SQL Server. Multiuser geodatabases leverage the underlying RDBMS architecture to provide better data security, such as access permission control for individual datasets, distributed file management, backup/recovery capabilities, and data integrity. ArcSDE technology provides additional geodatabase functionality that is not available in single-user geodatabases. This includes
There are three types of multiuser geodatabase: enterprise, workgroup, and desktop. The storage capacity and number of possible concurrent users vary with each type.
The single-user geodatabase class has two types: the file geodatabase and the Microsoft Access personal geodatabase. Both types of geodatabase are intended for an individual GIS user, and both are available with all license levels of ArcGIS Desktop.
File Geodatabase: This is implemented as a collection of binary files in a file system. It has no size capacity limit. By default, each table can store up to 1 terabyte of data. However, this can be changed so that a table can store up to 256 terabytes, if desired. Vector data stored within the file geodatabase can optionally be compressed into a read-only format, reducing the memory footprint and improving performance. Users can uncompress the vector data to make it editable at any time. It is also possible to have more than one editor in the file geodatabase at the same time, provided they are editing in different tables, feature classes, or feature datasets. The file geodatabase does not support versioning or geodatabase archiving. It can be used as a child geodatabase in both one-way and checkout/check-in geodatabase replication. Esri recommends that users starting new GIS projects for their own local use choose file geodatabases over Microsoft Access personal geodatabases, because they offer more functionality and better performance.
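The concurrent editing rule described above (several editors at once, as long as each works in a different table, feature class, or feature dataset) can be pictured as one exclusive edit lock per dataset. The sketch below is a conceptual illustration only; it does not describe how the file geodatabase actually implements locking, and the class, editor, and dataset names are hypothetical.

```python
class EditLockRegistry:
    """Tracks which editor currently holds the edit lock on each dataset."""

    def __init__(self):
        self._locks = {}  # dataset name -> name of the editor holding it

    def acquire(self, editor: str, dataset: str) -> bool:
        """Return True if the editor may edit the dataset, False otherwise."""
        holder = self._locks.get(dataset)
        if holder is not None and holder != editor:
            # Someone else is already editing this dataset.
            return False
        self._locks[dataset] = editor
        return True

    def release(self, editor: str, dataset: str) -> None:
        """Release the lock, but only if this editor is the one holding it."""
        if self._locks.get(dataset) == editor:
            del self._locks[dataset]


registry = EditLockRegistry()
ann_on_parcels = registry.acquire("ann", "parcels")   # first editor on parcels
bob_on_streets = registry.acquire("bob", "streets")   # different dataset
bob_on_parcels = registry.acquire("bob", "parcels")   # same dataset as ann
```

In this sketch the first two requests succeed because they target different datasets, and the third is refused until the first editor releases the parcels dataset.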
Microsoft Access Personal Geodatabase: This is implemented in a single Microsoft Access file and has a maximum size capacity of 2 gigabytes. It works for small GIS projects but does not support multiuser editing, versioning, or geodatabase archiving. Esri will continue to fully support Microsoft Access personal geodatabases for the foreseeable future.
The GIS data storage model is fully supported by all five geodatabase types. GIS datasets can be transferred between the various geodatabase types using the simple migration tools in ArcGIS Desktop, such as copy/paste and import/export.
The geodatabase is the primary data storage model for ArcGIS. It is a container of spatial and attribute data and enables the user to store many different types of GIS data within its structure. Its structure is implemented in an RDBMS or as a collection of files in a file system. With its comprehensive GIS data model, geospatial modeling capabilities, and scalable architecture, the geodatabase is the foundation that enables users to assemble intelligent geographic information systems that can be adapted for many different GIS businesses and other GIS applications.
For more information on the geodatabase, visit www.esri.com/geodatabase.
Please see the related poster [PDF].