Can Geography Rescue Text Search?
Using Maps with Text Search
To plot documents on maps, systems must extract text strings that identify locations and compute latitude and longitude coordinates for those references. This process, referred to as geoparsing, uses natural language processing to disambiguate the meaning intended by the author and to assign confidence scores that estimate the probability that the system chose the correct coordinates. Using this geographic reference analysis, a geographic text search (GTS) system can return a relevant, ranked set of results focused by geography and subject.
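The geoparsing steps described above can be illustrated with a minimal Python sketch. It is not MetaCarta's algorithm. The tiny gazetteer, the context-word boost, and the function names `geoparse` and `search` are all hypothetical, chosen only to show the general shape of the idea: find place names, pick the most plausible location, attach a confidence score, and then filter documents by keyword and map extent.

```python
import re
from dataclasses import dataclass

# Hypothetical toy gazetteer: place name -> candidate locations.
# Each candidate has a prior weight and context words that support it.
GAZETTEER = {
    "Paris": [
        {"lat": 48.8566, "lon": 2.3522, "prior": 0.9,
         "context": {"france", "seine"}},
        {"lat": 33.6609, "lon": -95.5555, "prior": 0.1,
         "context": {"texas", "dallas"}},
    ],
    "Cambridge": [
        {"lat": 52.2053, "lon": 0.1218, "prior": 0.5,
         "context": {"england", "uk"}},
        {"lat": 42.3736, "lon": -71.1097, "prior": 0.5,
         "context": {"massachusetts", "mit", "boston"}},
    ],
}


@dataclass
class GeoRef:
    name: str
    lat: float
    lon: float
    confidence: float  # share of the total score won by the chosen candidate


def geoparse(text):
    """Find gazetteer names in text and resolve each to one coordinate pair."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    refs = []
    for name, candidates in GAZETTEER.items():
        if not re.search(r"\b" + re.escape(name) + r"\b", text):
            continue
        # Assumed scoring rule: the prior, boosted for each supporting
        # context word that also appears in the document.
        scores = [c["prior"] * (1 + 5 * len(c["context"] & words))
                  for c in candidates]
        best = max(range(len(candidates)), key=lambda i: scores[i])
        chosen = candidates[best]
        refs.append(GeoRef(name, chosen["lat"], chosen["lon"],
                           scores[best] / sum(scores)))
    return refs


def search(docs, keyword, bbox):
    """Return (confidence, doc_id, place) hits inside bbox, best first.

    bbox is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bbox
    hits = []
    for doc_id, text in docs.items():
        if keyword.lower() not in text.lower():
            continue
        for ref in geoparse(text):
            if south <= ref.lat <= north and west <= ref.lon <= east:
                hits.append((ref.confidence, doc_id, ref.name))
    return sorted(hits, reverse=True)
```

For example, calling `search` with the keyword "pipeline" and a bounding box drawn around Texas would keep a document that mentions "Paris, Texas near Dallas" and drop one that mentions only "Paris", because the second resolves to France. A production system would rely on a far larger gazetteer and richer linguistic evidence than this sketch assumes.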
Through a cooperative effort, ArcGIS provides a powerful presentation layer for seamlessly merging geography and text searches served by the MetaCarta GTS system. For years, analysts have relied on outstanding Esri tools to help sift through geographic materials, but geographic information buried in text documents has been a challenge. Analysts can now efficiently discover and comprehend geographic references within documents.
Analysts leverage MetaCarta's GTS capabilities through Esri's ArcGIS interface, courtesy of a freely available plug-in linking the two technologies. This plug-in maintains a consistent look and feel to minimize the learning curve. Analysts can perform multiple text searches to narrow down the field of results. This iterative process of adjusting input keywords has become commonplace for all forms of information analysis and retrieval. This plug-in adds a new kind of geographic data asset to ArcMap because each text search creates a new data layer that integrates unstructured text and traditional GIS data.
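To make the idea of "one data layer per text search" concrete, the following Python sketch packages a set of search hits as a GeoJSON FeatureCollection, a widely used open format for point layers. This is an assumption made purely for illustration: the article does not say which format the plug-in uses internally, and the function name `hits_to_layer` and the hit fields shown here are hypothetical.

```python
import json


def hits_to_layer(query, hits):
    """Convert search hits into a GeoJSON FeatureCollection (one layer).

    Each hit is a dict with doc_id, lat, lon, and confidence keys.
    """
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point",
                         "coordinates": [h["lon"], h["lat"]]},
            "properties": {"doc_id": h["doc_id"],
                           "confidence": h["confidence"]},
        }
        for h in hits
    ]
    # "name" is an extra member used here to label the layer with the query.
    return {"type": "FeatureCollection", "name": query, "features": features}


if __name__ == "__main__":
    sample_hits = [
        {"doc_id": "a", "lat": 33.6609, "lon": -95.5555, "confidence": 0.55},
    ]
    print(json.dumps(hits_to_layer("pipeline", sample_hits), indent=2))
```

Because each query yields its own collection, repeated searches with adjusted keywords would produce separate layers that can be toggled and compared against other spatial data.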
MetaCarta GTS also integrates with map servers, such as Esri ArcIMS, so analysts can search using a Web browser to display a backdrop map derived from private map data. ArcIMS can hold a central copy of the team's most current data. As this data is modified or updated, the most current version is immediately visible to analysts. By integrating the team's most current geospatial data with text search, analysts can identify documents that are specific to important areas and correlate them with other spatial data layers.
Geography offers a new framework for identifying and analyzing the overwhelming volume of text documents available today. Maps offer a new way to search text documents because the vast majority of these documents contain at least one reference to a specific geographic location. Technology that automatically plots text documents on a map based on geographic references accelerates the ability to create immediate intelligence. Analysts can see geographic trends as they emerge, improving the quality of information known about a given situation.
MetaCarta, Inc., was founded by a team of researchers from the Massachusetts Institute of Technology in 1999 to create powerful new tools to help analysts gain more intelligence from data by synthesizing text and geography. With early support from the Defense Advanced Research Projects Agency and private investors, MetaCarta released MetaCarta GTS as a commercial off-the-shelf (COTS) product in 2002. It remains the only geographic text search solution on the market today. It has users in intelligence, defense, law enforcement, the Department of Homeland Security, and the energy sector.
About the Authors
Randy Ridley, general manager of federal systems for MetaCarta, is a U.S. naval officer in surface warfare and holds a bachelor's degree in history from the University of Iowa and a master's degree in national security studies from George Washington University. His articles have been published in the Op-Ed pages of the Washington Post and Defense News, and he has spoken on IT matters in national and international theaters. Prior to joining MetaCarta, he held a senior project management position at TASC, Inc.
John-Henry Gross, the public sector product manager for MetaCarta, coordinates activities across the company for developing and marketing MetaCarta's products. His career in product marketing, product management, and business development in the high-tech sector includes positions with Convera, Adobe Systems, Sun Microsystems, Thomson Consulting, and Manugistics. Prior to working in the high-tech sector, Gross was a science and computer programming teacher at schools in New Jersey and Massachusetts. He holds a bachelor's degree in zoology from Drew University and a master's degree in science education from Columbia University. He has also completed coursework in computer programming at Harvard University and Stanford University.
John Frank, founder and president of MetaCarta, Inc., conceived of a new geographic way to view collections of text while working on his doctorate in physics as a Hertz Fellow at the Massachusetts Institute of Technology in 1999. Prior to founding MetaCarta, he worked for IDEO in Palo Alto, California. Frank holds a U.S. government secret-level security clearance and spends time in the field with users and decision makers so he can better understand the intelligence community's evolving requirements. He holds a bachelor's degree in physics, with distinction, from Yale University.