The amount of imagery that’s collected and disseminated has increased by orders of magnitude over the past couple of years. Deep learning has been instrumental in efficiently extracting and deriving meaningful insights from these massive amounts of imagery. Last October, we released pre-trained geospatial deep learning models, making deep learning more approachable and accessible to a wide spectrum of users.
These models have been pre-trained by Esri on large volumes of data and can be used as-is or further fine-tuned to your local geography, objects of interest, or type of imagery. You no longer need huge volumes of training data and imagery, massive compute resources, or the expertise to train such models yourself. With the pre-trained models, you can bring in the raw data or imagery and extract geographical features at the click of a button. Here’s a quick video tutorial of how this works in ArcGIS Pro:
Based on user requirements, we’ve added several new deep learning models to our arsenal of pre-trained models. With the release of these new models, you now have 20 pre-trained deep learning models that you can use.
Here’s a brief overview of the models we’ve added.
The building footprints extraction model we’ve developed for the United States is the most popular model so far. We are extending support for building detection in different countries and continents. This generic deep learning model is used to extract building footprints in Africa from high-resolution (10–40 cm) imagery.
Building footprint layers are useful in preparing base maps and analysis workflows for urban planning and development, insurance, taxation, change detection, infrastructure planning, and a variety of other applications.
This deep learning model estimates residential parcel boundaries by interpreting high-resolution (30–40 cm) imagery. Traditionally, parcel mapping has been done using highly accurate surveying techniques, but this can be expensive and time-consuming. High-resolution imagery is increasingly being used for plot delineation, and the use of deep learning models can automate and speed up the process.
Note that residential parcels are often associated with visible boundaries, but since legally valid parcels can be defined without a clearly demarcated boundary, this model only deduces plausible approximations of them. This model can be used to create basemaps, which can be further refined by manual editing.
Object tracking plays an important role in highway surveillance, traffic management and urban planning. Manually digitizing the track of an object is a slow process. Tracking objects in motion imagery is a daunting task and managing the annotations can be a bit of a challenge. This model is designed to automate that process and can write out the detections directly into a geodatabase. This model can be natively accessed and used in the Full Motion Video extension in ArcGIS Pro. Here’s a demo from the Esri DevSummit where this model is being used to track a truck as it moves on the freeway:
Swimming pools are an important part of property tax assessment records because they impact the value of the property. Tax assessors at local government agencies often rely on expensive and infrequent surveys, leading to assessment inaccuracies. Finding pools that are not on the assessment roll (such as those recently constructed) is valuable to assessors and will ultimately mean additional revenue for the community. This deep learning model helps automate the task of finding pools in high-resolution satellite imagery.
This model can also benefit swimming pool maintenance companies and help redirect their marketing efforts. Public health and mosquito control agencies can also use this model to detect pools and drive field activity and mitigation efforts.
State and local government agencies benefit from obtaining information about rooftop solar photovoltaic (PV) panels. Information such as the location, capacity, and energy production of solar panels for different counties and neighborhoods is required to understand the uptake of solar energy by communities. On a macro level, government agencies can also use solar panel detection to offer incentives such as tax exemptions and credits to residents who have installed solar panels. Policymakers can use it to gauge adoption and frame schemes to spread awareness and promote solar power in areas where it is underused.
Traditional ways of obtaining information on solar panel installations, such as surveys and on-site visits, are time-consuming and error-prone. Here’s where the value of this model really comes to life: it allows for easy identification of solar panel installations and computation of their size by interpreting high-resolution imagery.
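To make the size computation concrete, here is a minimal sketch of how the area of detected panels could be summed once the detections are available as polygons. It assumes the polygons are simple (non-self-intersecting) and expressed in a projected coordinate system with meters as units; the function name and the sample coordinates are hypothetical and are not part of the model's output format.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices: list of (x, y) tuples in meters, in order around the ring.
    Returns the area in square meters (0.0 for fewer than 3 vertices).
    """
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical detections: two rectangular panel arrays (assumed data).
detections = [
    [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)],       # 4 m x 2 m
    [(10.0, 5.0), (13.0, 5.0), (13.0, 6.5), (10.0, 6.5)],   # 3 m x 1.5 m
]

areas = [polygon_area(p) for p in detections]
total_area = sum(areas)
print(f"Total detected panel area: {total_area:.1f} square meters")
```

In practice, a GIS would compute the same quantity from the geometry of the output feature class; the sketch only illustrates the arithmetic behind it.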
This deep learning model can be used to count the number of people in an image. Crowd counting from an image is a highly challenging task due to occlusion, resolution, low quality, and scale variation of objects. With the development of deep learning techniques, various crowd counting methods have been proposed in response to this challenge. This model uses the Distribution Matching for Crowd Counting methodology by Boyu Wang et al. and achieves state-of-the-art results at this challenging task.
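Density-map approaches such as the one cited above predict a per-pixel density whose values sum to the number of people in the image. The following toy sketch shows only that final summation step. The density values are made up for illustration, and how the packaged model exposes its output is an assumption not covered here.

```python
def count_from_density(density_map):
    """Estimate a crowd count by summing every cell of a 2D density map.

    density_map: list of rows, each a list of non-negative floats.
    """
    return sum(sum(row) for row in density_map)


# Toy 3 x 4 density map (assumed values): the mass of two people is
# spread across neighboring cells rather than sitting in a single pixel.
density = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.6, 0.1, 0.3],
    [0.0, 0.1, 0.0, 0.7],
]

estimated = count_from_density(density)
print(f"Estimated people in image: {round(estimated)}")
```

Because the count is a sum over the whole map, partially occluded people still contribute fractional mass, which is one reason this formulation copes better with dense crowds than counting individual detections.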
AI is being used to manage traffic lights at key crowded intersections. For instance, the Braves organization (Atlanta Braves) collaborated with Cobb County Police Department and Cobb County Department of Transportation (DOT) officials to create a control center that leverages AI and a geographic information system (GIS) to manage traffic lights at this busy intersection:
The crowd counting model can also be used for crowd management, intelligent transportation systems and facility management, with appropriate human oversight.
Car detection can be used for applications such as traffic management and analysis, parking lot utilization, and urban planning. It can also be used as a proxy for deriving economic indicators and estimating retail sales. High-resolution aerial and drone imagery can be used for car detection due to its high spatio-temporal coverage.
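As an illustration of the parking lot use case, the sketch below counts how many detected cars fall inside a lot boundary and divides by the number of stalls. It assumes each detection has been reduced to a centroid point and that the lot is a simple polygon in the same coordinate system. The function names, coordinates, and stall count are all hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside a simple polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lot_utilization(car_centroids, lot_polygon, stall_count):
    """Fraction of stalls occupied, based on detections inside the lot."""
    if stall_count <= 0:
        raise ValueError("stall_count must be positive")
    occupied = sum(1 for c in car_centroids if point_in_polygon(c, lot_polygon))
    return occupied / stall_count


# Assumed data: a 50 m x 30 m lot and five detected car centroids.
lot = [(0.0, 0.0), (50.0, 0.0), (50.0, 30.0), (0.0, 30.0)]
cars = [(5.0, 5.0), (20.0, 10.0), (45.0, 25.0), (60.0, 10.0), (25.0, 35.0)]

utilization = lot_utilization(cars, lot, stall_count=12)
print(f"Lot utilization: {utilization:.0%}")
```

Repeating this over imagery captured at different times gives a utilization trend, which is the kind of signal that can feed the economic and retail indicators mentioned above.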
Ship detection is critical for a range of applications, including managing port activity, monitoring cargo activity, maritime rescue, national defense, and monitoring illegal fishing. Using synthetic aperture radar (SAR) data for detecting objects provides the added benefit of being able to see through clouds and storms. More importantly, unlike optical sensors, which are limited to capturing imagery during the day, SAR sensors can capture usable data at any time, day or night.
We recently put this model to use to monitor the ships stuck near the Suez Canal as a result of the Ever Given blocking it. Here are the detections from the model:
These are just a few of the models that have been developed over the past few months to automate and simplify your workflows. Try them out. Some of these models (like the Ship Detection model) require pre- and post-processing workflows. To simplify these tasks, we have packaged ArcGIS Pro project templates that make the processing tools easily accessible to you as geoprocessing tools. Each model includes helpful documentation to get you started.
Have questions? Reach out on GeoNet, and let us know how these models are working for you and which other feature extraction tasks you’d like AI to do for you!