Deep Learning + GIS = Opportunity

The field of artificial intelligence (AI) has progressed rapidly in recent years, matching or, in some cases, even surpassing human accuracy at tasks such as image recognition, reading comprehension, and text translation. The intersection of AI and GIS is creating massive opportunities.

In image classification, the computer assigns the label “cat” to an image of a cat (left). The computer classified the image (right) as a dense crowd.

AI, machine learning, and deep learning are helping us make our world better by increasing crop yields through precision agriculture, understanding crime patterns, and predicting when the next big storm will hit so we can be better equipped to handle it.

Broadly speaking, AI is the ability of computers to perform tasks that typically require some level of human intelligence. Machine learning is one type of engine that makes this possible. It uses algorithms that learn from data to give you the answers that you need. One type of machine learning that has emerged recently is deep learning. Deep learning uses computer-generated neural networks, which are inspired by and loosely resemble the human brain, to solve problems and make predictions.

Machine Learning in ArcGIS

Machine learning has long been a core component of spatial analysis in GIS. Its tools and algorithms have been applied to geoprocessing tools to solve problems in three broad categories: classification, clustering, and prediction. With classification, you can use support vector machine algorithms to create land-cover classification layers. Clustering lets you process large quantities of input point data, identify the meaningful clusters within this data, and separate meaningful clusters from the sparse noise. Prediction algorithms, such as geographically weighted regression, give you the ability to model spatially varying relationships. These methods work well in several areas. Their results are interpretable, but they need experts to identify or include those factors (or features) that affect the outcome being predicted.
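The clustering idea described above, keeping dense groups of points and setting aside sparse noise, can be sketched in a few lines of plain Python. This is a simplified DBSCAN-style pass written for illustration, not the ArcGIS implementation; the function name `density_cluster`, the parameters `eps` and `min_pts`, and the sample coordinates are all assumptions.

```python
from math import hypot

NOISE = -1  # label used for points that belong to no cluster


def density_cluster(points, eps, min_pts):
    """Label each (x, y) point with a cluster id, or NOISE if it is sparse.

    A point is a "core" point when at least min_pts points (itself
    included) lie within distance eps of it. Clusters grow outward
    from core points.
    """
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            more = neighbors(j)
            if len(more) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(more)
        cluster_id += 1
    return labels


# Hypothetical sample: two tight groups of four points and one stray point.
sample_points = [(0, 0), (0, 1), (1, 0), (1, 1),
                 (10, 10), (10, 11), (11, 10), (11, 11),
                 (50, 50)]
labels = density_cluster(sample_points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The stray point at (50, 50) has no neighbors within `eps`, so it is reported as noise rather than forced into a cluster.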

The Rise of Deep Learning

Wouldn’t it be great if the machine figured out what those factors/features should be just by looking at the data? That’s where deep learning comes in. In a deep neural network, there are neurons that respond to stimuli and are connected to each other in layers. Neural networks have been around for decades, but it has been a challenge to train them.
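To make the idea of neurons connected in layers concrete, here is a minimal sketch of a forward pass through a tiny fully connected network in plain Python. The layer sizes, the hand-picked weights, and the choice of a ReLU activation are illustrative assumptions; a real network learns its weights from data during training.

```python
def relu(x):
    """Activation: a neuron responds only to positive stimulus."""
    return x if x > 0.0 else 0.0


def dense_layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias."""
    return [relu(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
            for neuron_weights, b in zip(weights, biases)]


def forward(inputs, layers):
    """Pass the inputs through each layer in turn and return the final values."""
    values = inputs
    for weights, biases in layers:
        values = dense_layer(values, weights, biases)
    return values


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # hidden layer
    ([[2.0, 1.0]], [0.5]),                     # output layer
]
output = forward([3.0, 1.0], layers)
print(output)  # [5.5]
```

Training consists of adjusting the numbers in `layers` until the outputs match known examples, which is the part that was historically difficult.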

The advent of deep learning can be attributed to three primary developments in recent years—availability of data, fast computing, and algorithmic improvements:

Data: We now have vast quantities of data, thanks to the Internet, the sensors all around us, and the numerous satellites that are imaging the whole world every day.

Computing: With cloud computing, we have powerful computational resources. Graphics processing units (GPUs) have become more powerful than ever and gone down in price, thanks to the gaming industry.

Algorithmic improvements: Finally, researchers have now cracked some of the most challenging aspects of training deep neural networks through algorithmic improvements and network architectures.

In object detection, the computer finds objects within an image.

Applying Computer Vision to Geospatial Analysis

One area of AI where deep learning has done exceedingly well is computer vision, or the ability of computers to see. This is particularly useful for GIS because satellite, aerial, and drone imagery is being produced at a rate that makes it impossible to analyze and derive insight through traditional means.

Image classification, object detection, semantic segmentation, and instance segmentation are some of the most important computer vision tasks that can be applied to GIS. The simplest task is image classification. For example, the computer assigns the label “cat” to an image of a cat. In GIS, this classification is used to categorize geotagged photos. Another image, classified as “dense crowd,” can be used by GIS for pedestrian and traffic management planning during public events.

In object detection, the computer also finds an object's location.

With object detection, the computer needs to find the objects within an image as well as their location. This is a very important task in GIS because it finds what is in a satellite, aerial, or drone image, locates it, and plots it on a map. This task can be used for infrastructure mapping, anomaly detection, and feature extraction.
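Plotting a detection on a map means converting its pixel bounding box into map coordinates. The sketch below assumes a north-up image described by a simple georeferencing record (the map position of the top-left corner plus a pixel size). The names `GeoTransform`, `pixel_to_map`, and `bbox_to_map`, and the sample numbers, are hypothetical and not part of any ArcGIS API.

```python
from dataclasses import dataclass


@dataclass
class GeoTransform:
    """Georeferencing for a north-up image (an assumption for this sketch)."""
    origin_x: float      # map x of the image's top-left corner
    origin_y: float      # map y of the image's top-left corner
    pixel_width: float   # map units per pixel along x
    pixel_height: float  # map units per pixel along y (positive number)


def pixel_to_map(gt, col, row):
    """Convert a pixel position (col, row) to map coordinates (x, y)."""
    x = gt.origin_x + col * gt.pixel_width
    y = gt.origin_y - row * gt.pixel_height  # rows increase downward
    return x, y


def bbox_to_map(gt, bbox):
    """Convert a detection box (col_min, row_min, col_max, row_max)
    to a map extent (x_min, y_min, x_max, y_max)."""
    col_min, row_min, col_max, row_max = bbox
    x_min, y_max = pixel_to_map(gt, col_min, row_min)
    x_max, y_min = pixel_to_map(gt, col_max, row_max)
    return x_min, y_min, x_max, y_max


# Hypothetical image with 0.5-unit pixels and one detected object.
gt = GeoTransform(origin_x=500000.0, origin_y=4200000.0,
                  pixel_width=0.5, pixel_height=0.5)
extent = bbox_to_map(gt, (100, 40, 140, 80))
print(extent)  # (500050.0, 4199960.0, 500070.0, 4199980.0)
```

Because image rows count downward while map y values count upward, the top row of the box becomes the maximum y of the extent.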

Another important computer vision task is semantic segmentation. Each pixel of an image is classified as belonging to a specific class. In GIS, semantic segmentation can be used for land-cover classification or the extraction of road networks from satellite imagery.
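The final step of semantic segmentation can be sketched as picking, for every pixel, the class with the highest score. The example below uses plain Python lists; the class names and the scores are made-up values standing in for the output of a trained model.

```python
from collections import Counter

CLASSES = ["water", "forest", "road"]  # hypothetical land-cover classes


def classify_pixels(scores):
    """scores[row][col] is a list of per-class scores for that pixel.

    Returns a grid of class names, one per pixel, chosen by highest score.
    """
    return [[CLASSES[max(range(len(pixel)), key=pixel.__getitem__)]
             for pixel in row]
            for row in scores]


def class_counts(label_grid):
    """Count how many pixels were assigned to each class."""
    return Counter(label for row in label_grid for label in row)


# Made-up scores for a 2 x 2 image, three classes per pixel.
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
    [[0.1, 0.2, 0.7], [0.3, 0.6, 0.1]],
]
label_grid = classify_pixels(scores)
print(label_grid)  # [['water', 'forest'], ['road', 'forest']]
print(class_counts(label_grid)["forest"])  # 2
```

Multiplying each pixel count by the ground area of one pixel would turn these counts into land-cover area totals.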

In semantic segmentation, each pixel of an image is classified as belonging to a specific class. Semantic segmentation can be used for extracting road networks from satellite imagery.

An early example of the use of semantic segmentation and its impact is the success the Chesapeake Conservancy has had in combining Esri’s GIS technology with the Microsoft Cognitive Toolkit (CNTK) AI tools and cloud solutions to produce the first high-resolution land-cover map of the Chesapeake watershed. This work is available on GitHub and can be deployed on a Microsoft Data Science Virtual Machine (DSVM) on Azure. [Read “Tackling a Monumental Project” in the Spring 2018 issue of ArcUser to learn more about the Chesapeake Conservancy project.]

Another type of segmentation is instance segmentation. You can think of this as a more precise object detection in which the precise boundary of each object instance is marked out. Instance segmentation can be used for tasks like improving basemaps. This can be done by adding building footprints or reconstructing 3D buildings from lidar data. Esri recently collaborated with NVIDIA to use deep learning to automate the manually intensive process of creating complex 3D building models from aerial lidar data for Miami-Dade County in Florida.
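What separates instance segmentation from semantic segmentation is that each object keeps its own identity. As a minimal sketch, assume a model has produced a grid in which 0 means background and each positive integer marks the pixels of one building. The function name `summarize_instances`, the mask values, and the pixel size are assumptions for illustration.

```python
def summarize_instances(mask, pixel_size):
    """Summarize each object instance in a labeled mask.

    mask[row][col] holds 0 for background or a positive instance id.
    Returns {instance_id: {"bbox": (col_min, row_min, col_max, row_max),
                           "pixels": int, "area": float}}.
    """
    summary = {}
    for row_idx, row in enumerate(mask):
        for col_idx, instance_id in enumerate(row):
            if instance_id == 0:
                continue  # background pixel
            entry = summary.setdefault(
                instance_id,
                {"bbox": (col_idx, row_idx, col_idx, row_idx), "pixels": 0},
            )
            c0, r0, c1, r1 = entry["bbox"]
            entry["bbox"] = (min(c0, col_idx), min(r0, row_idx),
                             max(c1, col_idx), max(r1, row_idx))
            entry["pixels"] += 1
    for entry in summary.values():
        # Ground area covered by the instance, in squared map units.
        entry["area"] = entry["pixels"] * pixel_size ** 2
    return summary


# Hypothetical 3 x 5 mask containing two building instances.
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
]
buildings = summarize_instances(mask, pixel_size=0.5)
print(buildings[1]["bbox"], buildings[1]["area"])  # (1, 0, 2, 1) 1.0
print(buildings[2]["bbox"], buildings[2]["area"])  # (4, 1, 4, 2) 0.5
```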

Instance segmentation is a more precise type of object detection. The precise boundary of each object instance is marked out.

Deep Learning for Mapping

In working with satellite imagery, one important application of deep learning is creating digital maps by automatically extracting road networks and building footprints. Imagine applying a trained deep learning model on a large geographic area and producing a map containing all the roads in the region, then having the ability to create driving directions using this detected road network. This can be particularly useful for developing countries that do not have high-quality digital maps or in areas where newer developments have been built.
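Once roads have been extracted, driving directions reduce to a shortest-path search over the road network. The sketch below runs Dijkstra's algorithm on a small hand-built graph; it assumes the detected roads have already been converted into junctions and segment lengths, and the junction names and lengths are invented for illustration.

```python
import heapq


def shortest_route(roads, start, goal):
    """Find the shortest route between two junctions.

    roads maps each junction to a list of (neighbor, segment_length) pairs.
    Returns (total_length, [junctions along the route]), or
    (float("inf"), []) when the goal cannot be reached.
    """
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter way was already found
        for neighbor, length in roads.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if goal not in best:
        return float("inf"), []
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    route.reverse()
    return best[goal], route


# Hypothetical road network: junctions A-D with segment lengths in kilometers.
roads = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("A", 2.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("D", 1.0)],
    "D": [("B", 4.0), ("C", 1.0)],
}
length, route = shortest_route(roads, "A", "D")
print(length, route)  # 4.0 ['A', 'B', 'C', 'D']
```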

Good maps need more than just roads—they need buildings. Instance segmentation models like Mask R-CNN are particularly useful for building footprint segmentation and can help create building footprints without any need for manual digitizing. However, these models typically result in irregular building footprints that look more like masterpieces by the Spanish architect Antoni Gaudí than regular buildings with straight edges and right angles. Using the Regularize Building Footprint tool in ArcGIS Pro can help restore the straight edges and right angles necessary for an accurate representation of building footprints.

Instance segmentation can be used for improving basemaps by adding building footprints.

Integrating ArcGIS with AI

ArcGIS has tools to help with every step of the data science workflow including data preparation and exploratory data analysis; training the model; performing spatial analysis; and finally, disseminating results using web layers and maps. To add context and depth to your analyses, you can use content from Esri’s ArcGIS Living Atlas of the World. This large collection of Esri-curated and partner-provided imagery can be critical to a deep learning workflow.

ArcGIS Pro includes tools for helping with data preparation for deep learning workflows and has been enhanced for deploying trained models for feature extraction or classification. ArcGIS Image Server in the ArcGIS Enterprise 10.7 release has similar capabilities, providing the ability to deploy deep learning models at scale by leveraging distributed computing.

The arcgis.learn module for ArcGIS API for Python, available on GitHub, enables GIS analysts and data scientists to train deep learning models with a simple, intuitive API. ArcGIS Notebooks provides a ready-to-use environment for training deep learning models.

ArcGIS includes built-in Python raster functions for object detection and classification workflows using CNTK, Keras, PyTorch, and TensorFlow. Additionally, you can write your own Python raster function that uses your deep learning library of choice or a specific deep learning model or architecture. See the handy guide on GitHub to get started.

Deep learning is a rapidly evolving field that allows data scientists to leverage cutting-edge research while taking advantage of an industrial-strength GIS. Python, chosen as the primary programming language of popular libraries such as TensorFlow, PyTorch, and CNTK, has emerged as the lingua franca of the deep learning world. ArcGIS API for Python and ArcPy, a Python site package, are a natural fit for integrating with these deep learning libraries, giving you more capabilities.

While the examples in this article have focused on imagery and computer vision, deep learning can be used equally well for processing large volumes of structured data such as observations from sensors or attributes from a feature layer. Applications of such techniques to structured data include predicting the probability of accidents, sales forecasting, and natural language routing and geocoding.

Esri is investing heavily in these emerging technologies and has started a new R&D center in New Delhi, focused on AI and deep learning on satellite imagery and location data.

About the author

Rohit Singh

Rohit Singh is the managing director of Esri's R&D Center in New Delhi and leads the development of data science, deep learning, and geospatial AI solutions in the ArcGIS platform. He is passionate about deep learning and its intersection with geospatial data and satellite imagery and has been recognized as an Industry Distinguished Lecturer for the IEEE Geoscience and Remote Sensing Society (GRSS). Rohit is a graduate of the Indian Institute of Technology, Kharagpur, and worked at computer vision startups and IBM before joining Esri. He conceptualized, designed, and developed the ArcGIS API for Python, ArcObjects Java, ArcGIS Engine Java API, and ArcGIS Enterprise (Linux) while at Esri.