ArcUser Online
 

July - September 2007

The EPA Metadata Editor
By Michelle Torreano, U.S. Environmental Protection Agency, and Jessica L. Zichichi, Innovate!, Inc.

A simple framework for consistent metadata production agencywide

A program called the Three Tab Editor, an extension to the ArcGIS ArcCatalog framework developed by the Coeur d'Alene Tribe, provided the basis for developing the EPA's metadata tool. It simplifies the editing environment by reducing the number of tabs used from seven to three.

The U.S. Environmental Protection Agency (EPA) recently implemented a geospatial metadata portal, the GeoData Gateway (GDG), that allows users to discover and access geospatial resources stored throughout the agency. The effectiveness of a geospatial portal is dependent on the quality of the metadata it contains. Consequently, EPA was keenly interested in simplifying and improving the metadata production process to maximize the utility of this resource. In addition, the agency was spending thousands of dollars each year on developing and managing approximately 2,000 metadata records. It was clear that a streamlined metadata production process would reduce the costs and staff time required. To meet its need, the agency developed an EPA Metadata Editor that allows users to quickly create metadata that meets all agency requirements and still produces high-quality information that is easily added to the GDG catalog.

Geospatial Metadata at EPA

One of the ways EPA fulfills its mission to protect human health and the environment is by developing and enforcing regulations and performing environmental research and assessments. Geographic information describing the locations of natural or man-made features on the earth is vital to these activities. However, sharing data among EPA's two dozen subordinate organizations is often difficult. Because geospatial data has a prominent role in agency decision making, it is critical that the data be properly documented.

Three factors influenced geospatial metadata development at EPA:

  • Agency documentation concerns with positional accuracy, common keywords, and use constraints
  • Esri's publishing requirements and search tool specifications for easy data search and discovery that promote data reuse and sharing
  • Regulatory and legislative mandates for federal agencies that require metadata documentation
The initial, rapid deployment of an EPA version of the Three Tab Editor gave EPA users a simple editing tool that included EPA defaults. It also served as a beta version that could provide feedback for refining requirements for the second, more robust deployment.

Circular A-16 and Executive Order 12906 require EPA to create metadata for geospatial datasets that complies with the minimum Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC CSDGM) and to submit geospatial metadata to national catalogs supporting the National Spatial Data Infrastructure (NSDI). [Circular A-16, revised in 2002 by the Office of Management and Budget, addresses the coordination of geographic information and related spatial data activities. Executive Order 12906 is a presidential mandate to all federal agencies ordering them to use a specified standard to document geospatial data created as of January 1995.] The three factors overlap in some areas and diverge in others. EPA faced the daunting task of making sense of inconsistent requirements while reducing time spent on geospatial metadata development.

Different approaches to geospatial metadata creation pointed to the need for a single EPA-approved standard that met all three metadata production factors simultaneously. In 2005, EPA developed an EPA implementation of the FGDC CSDGM that would serve as a single document addressing the various end uses that geospatial metadata supports, including agency requirements, search tool specifications, and minimum FGDC requirements.

The EPA implementation augments the FGDC CSDGM standard by providing consistent language for certain metadata elements and requiring the use of some elements that are not considered mandatory for minimum FGDC validation. It does not create new elements—rather it provides standard interpretation guidelines for existing elements. Metadata that is compliant with the EPA implementation is also compliant with minimum FGDC requirements and Esri search tool requirements. Metadata published according to the EPA implementation will be publishable to NSDI clearinghouses and can be used for internal search and discovery applications.
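The layered relationship described above, a minimum FGDC set plus additional agency-required elements, can be illustrated with a short Python sketch. It checks a CSDGM-style XML record for empty or missing elements. The element paths use standard CSDGM short tag names, but both lists are assumptions for illustration only: `FGDC_SUBSET` is a small sample of mandatory elements, and `AGENCY_EXTRA` is a hypothetical stand-in, not the actual EPA list.

```python
import xml.etree.ElementTree as ET

# A few elements the CSDGM treats as mandatory (a sample, not the full list).
# Paths are relative to the <metadata> root and use CSDGM short tag names.
FGDC_SUBSET = [
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "metainfo/metstdn",
]

# ASSUMPTION: hypothetical examples of elements an agency profile might
# additionally require. This is not the actual EPA implementation.
AGENCY_EXTRA = [
    "idinfo/keywords/place/placekey",
    "dataqual/posacc/horizpa/horizpar",
]


def missing_elements(xml_text, required_paths):
    """Return the required paths that are absent or empty in the record."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in required_paths:
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            missing.append(path)
    return missing


SAMPLE = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Example Watersheds</title></citeinfo></citation>
    <descript><abstract>Example abstract.</abstract><purpose>Demo.</purpose></descript>
  </idinfo>
  <metainfo><metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn></metainfo>
</metadata>"""

fgdc_gaps = missing_elements(SAMPLE, FGDC_SUBSET)
agency_gaps = missing_elements(SAMPLE, FGDC_SUBSET + AGENCY_EXTRA)
print("Gaps against the minimum subset:", fgdc_gaps)
print("Gaps against the agency profile:", agency_gaps)
```

Because the agency list is a superset of the minimum list, any record that passes the agency check also passes the minimum check, which mirrors the compliance relationship the article describes.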

At version 2.0, the EPA Metadata Editor features user interface components that follow the structure of the EPA implementation of the FGDC CSDGM and incorporates the use of a spell-checker and an EPA-validation service.

After developing the EPA implementation, EPA reviewed the various approaches taken to EPA metadata editing. The results showed that EPA personnel had access to a number of different commercial off-the-shelf (COTS) and custom-developed geospatial metadata editing tools, but none met all the requirements for developing EPA-specific geospatial metadata. For example, the Esri Portal Toolkit metadata creation interface produces metadata that is not FGDC compliant, and ArcCatalog metadata development produces records with a number of inconsistencies with FGDC requirements. Neither tool provided EPA defaults for FGDC elements, which made the process of developing consistent metadata onerous. Custom tools, while providing EPA defaults and streamlined editing capabilities, were still being developed and had yet to be finalized and integrated.

The Approach

The agency set out to create a metadata editor that met EPA geospatial metadata requirements. The EPA Metadata Editor was developed using an approach that included an initial requirements-gathering phase, an overall design phase, a testing phase, and an implementation and training phase. In gathering initial design requirements, three key considerations needed to be addressed. Metadata produced using the application must be compliant with all of EPA's metadata requirements. The application must work within EPA's existing geospatial environment so that it may be easily incorporated into analysts' daily operations. The application should maximize reuse of existing applications and tools while also being reusable to the maximum extent possible within other technologies.

Given these objectives, EPA developed the requirements for the design of the application by first assessing the existing geospatial environment and typical workflow for geospatial analysts. After reviewing the available metadata editing tools to determine applicability for meeting EPA's metadata requirements, the agency interviewed EPA geospatial analysts to develop the formal requirements for the application.

A key objective was ensuring that the EPA Metadata Editor would integrate within the agency's existing geospatial software environment. EPA personnel are most familiar with the ArcGIS environment and its tools for performing their daily operations. In addition, the agency has an Enterprise License Agreement with Esri, so Esri software is prevalent across EPA's geospatial community. However, some analysts use other GIS tools on a daily basis and might not be familiar with the Esri environment.

Given this environment, EPA initially considered both a Web-based application and a desktop application integrated with the Esri environment. Each design approach had advantages. A Web-based tool can be accessible to users across EPA without requiring them to have or install any special software locally. A desktop tool can be integrated within other software and tools and fits more easily into the workflows of geospatial analysts. Interviews with key personnel showed that most users preferred a desktop version of the editor. Furthermore, because of the prevalence of Esri products within EPA, an extension to the ArcCatalog environment was preferred, as it would be easily incorporated into other workflow activities.

Another objective was maximizing the use of freely available tools and/or software while meeting EPA's needs for metadata. EPA performed an initial review of existing metadata editing tools to evaluate leveraging opportunities. During this review, the agency's geospatial users found a program developed by the Coeur d'Alene Tribe of Idaho called the Three Tab Editor. This program, developed as an extension to the ArcGIS ArcCatalog framework, simplifies the default editing environment by reducing the number of tabs used to edit metadata from seven to three. The simple design and availability of source code made the existing Three Tab Editor a good starting place for the development of the EPA Metadata Editor. After EPA personnel favorably reviewed the software, EPA decided to use the Three Tab Editor design as a basis from which to develop the EPA Metadata Editor.

EPA decided to approach this as a multiphase development project that would include two major releases. An initial beta release of the program included a preliminary set of simple updates to the existing Three Tab Editor code base that included EPA defaults within the user interface. Providing an initial, rapid deployment of an EPA version of the Three Tab Editor allowed EPA to present its users with a simple editing tool that included EPA defaults and also allowed them to test the beta version and provide feedback for refining requirements for the second, more robust deployment. The full set of requirements was gathered during testing of the beta version to create version 2.0 of the program that would meet all of EPA's needs.

During testing of the initial beta version, the design team interviewed users to develop the requirements for the second phase of the program. The major design requirements for version 2.0 involved creating a more robust, EPA-specific implementation of the simple Three Tab Editor interface favored by many of the geospatial analysts. Although most EPA geospatial analysts use ArcGIS Desktop on a daily basis, most agreed that a Web-based version would also be useful. It would provide a tool that would be accessible to non-traditional GIS users in the future.

Many of EPA's geospatial analysts needed to be able to edit default values provided within the user interface to meet their specific needs. A database-driven environment that could be readily edited by a number of different users would provide this flexibility. Many users indicated that visual cues within the user interface identifying EPA-required fields would help them understand which fields were critical. Many users also felt that an EPA validation tool, which would allow them to easily determine whether their metadata passed EPA requirements, would be highly valuable. Some EPA geospatial analysts identified the need for a spell-checker.

Photo courtesy of Dave Hansen, United States Environmental Protection Agency Great Lakes National Program Office

As part of its mission to protect human health and safeguard the natural environment, EPA works with other federal agencies, state governments, tribes, and other organizations, as well as Canada, to preserve the Great Lakes. This photo shows the North Shore stream flowing into Lake Superior.

Based on these factors, EPA Metadata Editor version 2.0 was developed as a modified Three Tab Editor design that incorporated a number of changes to the original program to meet EPA's needs for metadata. A simple Microsoft Access database was used to populate fields in the user interface. The database design can be easily modified by EPA users. User interface components follow the structure of the EPA implementation of the FGDC CSDGM, and required elements are easily identifiable using visual cues within the interface. The program also incorporates the use of a spell-checker and an EPA validation service. It was written using VB.NET so that the deployment of a Web-based front end may be an option for a future release. The second release of the EPA Metadata Editor consisted of a complete rewrite of existing code that provided users with the ability to meet all of EPA's metadata requirements within the user interface.
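The database-driven defaults described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the editor's actual code: the real tool was written in VB.NET against a Microsoft Access database, whereas this sketch uses an in-memory SQLite database as a stand-in, and the table name, column names, and default values are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_defaults(conn):
    """Read element-path -> default-value pairs from the defaults table."""
    cursor = conn.execute(
        "SELECT element_path, default_value FROM defaults ORDER BY element_path"
    )
    return dict(cursor.fetchall())


def apply_defaults(root, defaults):
    """Fill missing or empty elements with defaults; never overwrite user text."""
    filled = []
    for path, value in defaults.items():
        node = root
        # Walk the path, creating any intermediate elements that do not exist.
        for tag in path.split("/"):
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        if not (node.text or "").strip():
            node.text = value
            filled.append(path)
    return filled


# ASSUMPTION: hypothetical schema and values standing in for the Access database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE defaults (element_path TEXT PRIMARY KEY, default_value TEXT)"
)
conn.executemany(
    "INSERT INTO defaults VALUES (?, ?)",
    [
        ("idinfo/citation/citeinfo/origin", "Example Agency"),
        ("idinfo/accconst", "None."),
    ],
)

record = ET.fromstring(
    "<metadata><idinfo><accconst>Restricted.</accconst></idinfo></metadata>"
)
filled = apply_defaults(record, load_defaults(conn))
print("Filled from defaults:", filled)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the defaults in a table rather than in code means an analyst can change a default value by editing a row, which is the flexibility users asked for. Values the user has already entered, such as the access constraint in the sample record, are left untouched.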

The EPA Metadata Editor was developed over a period of nine months. The beta program was deployed in August 2006, and version 2.0 of the program was deployed in April 2007. The program is currently being implemented across EPA offices.

Benefits of the EPA Metadata Editor

It has been estimated that the EPA Metadata Editor will save the agency approximately $200,000 a year by reducing the staff time needed for metadata management. The EPA Metadata Editor provides a standardized enterprise-wide, one-stop approach to developing metadata. In addition, it integrates with the Esri environment, which is familiar to many EPA personnel. It simplifies the user interface within ArcCatalog using three tabs instead of seven and contains EPA-specific defaults and drop-down menus. The EPA Metadata Editor provides appropriate guidelines and instructions for entering necessary information in various data fields. The editor uses technology that supports future integration with a Web-based front end. The source code can be reused by other developers.

These features allow EPA staff to create consistent, simplified, high-quality geospatial metadata. High-quality geospatial metadata increases adherence to federal requirements for geospatial metadata (FGDC CSDGM), enhances adherence to and understanding of EPA standards for geospatial metadata (EPA Implementation of the FGDC CSDGM), and improves search and discovery within EPA's geospatial metadata portal (GDG). This leads to better data reuse within the agency and improved submissions to intergovernmental data sharing initiatives, including NSDI and Geospatial One-Stop (GOS).

Next Steps

Outreach and training for the tool, provided through regular workshops, conferences, and a detailed hands-on training program throughout summer and fall 2007, ensure that users have adequate instruction on its use. Feedback received from new users will be retained and addressed in the event that a future release or a Web-based version is deployed. EPA has chosen to make the source code available for users at geodata.epa.gov, and the agency welcomes feedback from the user community about this tool. For more information, contact

Michelle Torreano, MC 2823T
United States Environmental
Protection Agency
1200 Pennsylvania Avenue, NW
Washington, D.C. 20460
Tel.: 202-566-2141
E-mail: torreano.michelle@epa.gov

About the Authors

Michelle Torreano is an environmental protection specialist with the U.S. EPA's Office of Environmental Information in Washington, D.C. During her six years in the Office of Environmental Information, she has worked on a variety of projects to advance the agency's national geospatial program, including system management, geospatial data acquisitions, and policy development. She graduated from Purdue University with a degree in natural resources and environmental science.

Jessica L. Zichichi has been working in GIS for more than 10 years. She holds a master's degree in computer science and bachelor's degrees in computer science and environmental studies. Her GIS experience ranges from basic mapping and cartographic production to desktop and Web-based geospatial application development and programming. Her recent GIS efforts have been focused on enterprise geospatial solutions, geospatial metadata, and policy and planning.
