ArcNews

GeoAI

Spring 2026

GeoAI in the Age of Foundation Models


The rapid progress of AI has produced a new category of models that promise to revolutionize Earth observation. Foundation models, characterized by their massive scale and diverse training data, teach computers to understand the planet with unprecedented depth. Esri is at the forefront of this transformation, bringing these powerful models into ArcGIS workflows.

What Foundation Models Are

A foundation model is a large, deep neural network—a type of machine learning that processes complex, nonlinear data to recognize patterns. Foundation models often have billions of parameters and are trained on vast, diverse datasets.

Once trained, a foundation model can be adapted to a wide variety of downstream tasks, depending on the data format. For imagery, the tasks include object detection and tracking, along with pixel and feature classification. For natural-language text data, a foundation model can classify, transform, and extract entities from text. For vector, tabular, and time series data, the downstream tasks include prediction—including regression and classification—and forecasting.

The Building Blocks

The success of foundation models rests on several key innovations. Transformers, neural network architectures that process the elements of a sequence in parallel rather than one at a time, capture relationships in data, scale well, and are now widely adopted across domains beyond text. Self-supervised learning enables models to learn from raw, unlabeled data rather than requiring labeled examples. For imagery, this often involves masked image modeling, where parts of an image are hidden and the model learns to reconstruct them. Scale completes the picture: As a model’s size and dataset volume grow, the model’s predictive accuracy generally improves.
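To make masked image modeling concrete, here is a minimal sketch in plain Python. It is not how any particular foundation model is implemented. The function names (patchify, choose_masked, masked_mse), the 4 x 4 toy image, the 75 percent mask ratio, and the stand-in "model" that predicts the mean of the visible pixels are all assumptions chosen for illustration. The point it shows is that the reconstruction error is scored only on the patches that were hidden.

```python
import random

def patchify(image, patch):
    """Split a 2D list (H x W) into non-overlapping patch x patch tiles, row-major."""
    h, w = len(image), len(image[0])
    tiles = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tiles.append([image[r + i][c + j]
                          for i in range(patch) for j in range(patch)])
    return tiles

def choose_masked(num_patches, mask_ratio, seed=0):
    """Pick which patch indices to hide (hypothetical helper, seeded for repeatability)."""
    rng = random.Random(seed)
    k = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), k))

def masked_mse(tiles, predictions, masked):
    """Mean squared error computed over the hidden patches only."""
    total, count = 0.0, 0
    for idx in masked:
        for target, pred in zip(tiles[idx], predictions[idx]):
            total += (target - pred) ** 2
            count += 1
    return total / count if count else 0.0

# Toy 4 x 4 single-band "image" with pixel values 0..15.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tiles = patchify(image, 2)                 # four 2 x 2 patches
masked = choose_masked(len(tiles), 0.75)   # hide three of the four patches

# Stand-in for a trained model: predict the mean of the visible pixels everywhere.
visible = [v for i, t in enumerate(tiles) if i not in masked for v in t]
mean_visible = sum(visible) / len(visible)
predictions = [[mean_visible] * len(t) for t in tiles]

loss = masked_mse(tiles, predictions, masked)
print(f"masked patches: {masked}, reconstruction loss: {loss:.2f}")
```

In real pretraining, a transformer would replace the mean-value stand-in, and the loss would be minimized over many images so that the model learns to infer hidden content from its surroundings.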

Embeddings as Geospatial Data

Foundation models learn to create embeddings, which are numerical representations of words, images, and other data. In geospatial models, embeddings encapsulate the essential properties of each location, such as geographic coordinates and contextual information like environmental factors.

Embeddings can be stored as feature layers in ArcGIS, where each geographic feature carries its own multidimensional embedding as part of its attributes. Machine learning tools in ArcGIS can use such embedding datasets for clustering, classification, regression, or other analysis tasks, including predicting places where certain things (such as species or buildings) might be found.
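As a rough illustration of how an embedding stored as a feature attribute can support analysis, the sketch below ranks features by how similar their embeddings are to a reference feature. It does not use any ArcGIS API. The feature IDs, the three-dimensional embedding values, and the function names are hypothetical, and cosine similarity is just one common way to compare embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical features: each record carries an embedding alongside its ID,
# much as a feature layer row would carry it among its attributes.
features = [
    {"id": "A", "embedding": [0.9, 0.1, 0.0]},
    {"id": "B", "embedding": [0.8, 0.2, 0.1]},
    {"id": "C", "embedding": [0.0, 0.1, 0.9]},
]

def rank_by_similarity(reference, candidates):
    """Return candidate IDs ordered from most to least similar to the reference."""
    scored = [
        (cosine_similarity(reference["embedding"], c["embedding"]), c["id"])
        for c in candidates
        if c["id"] != reference["id"]
    ]
    return [fid for _, fid in sorted(scored, reverse=True)]

ranking = rank_by_similarity(features[0], features)
print(ranking)
```

If feature A were a location where a species is known to occur, features near the top of the ranking would be candidate places to look next, since their embeddings describe similar conditions.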

A dot density map of the United States with red dots concentrated in the Midwest and Great Plains, and blue dots clustered in the South, along the East and West Coasts, and in major cities.
Esri is developing new tools and workflows that simplify the use of geospatial foundation models and location embeddings (shown here on a map of the United States).

Remote Sensing Foundation Models in ArcGIS

Remote sensing foundation models are large-scale computer vision models designed to extract insight from satellite and aerial imagery. These models use Vision Transformer architectures that are trained via self-supervised learning on vast collections of satellite imagery. Unlike traditional models trained on everyday photos, these are specifically pretrained on optical, radar, lidar, and multispectral images of Earth’s surface.

ArcGIS integrates several innovative remote sensing foundation models as ready-to-use backbones for geospatial deep learning. These models include Prithvi, Dynamic-One-For-All, and Clay. Esri is also developing its own model, designed to perform effectively across multispectral and high-resolution satellite imagery.

Location-Embedding Models

Location-embedding foundation models represent geographic coordinates as high-dimensional feature embeddings. These embeddings encode spatial context, capturing human-environment interactions, natural geography, and socioeconomic patterns. Unlike traditional approaches that treat latitude and longitude as simple numbers, these models learn to represent place, not just position.
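The difference between treating coordinates as plain numbers and turning them into a vector can be sketched with a fixed, hand-written encoding. This is not a learned location-embedding model and captures none of the socioeconomic or environmental context described above. The function name encode_location, the choice of four frequency scales, and the sine and cosine features are assumptions for illustration only.

```python
import math

def encode_location(lat_deg, lon_deg, num_scales=4):
    """Map a latitude/longitude pair to a fixed multi-scale sine/cosine vector.

    Illustrative stand-in only: a learned model would produce its vector
    from training data rather than from a fixed formula.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    vec = []
    for s in range(num_scales):
        freq = 2 ** s  # doubling frequencies capture progressively finer spatial detail
        vec.extend([
            math.sin(freq * lat), math.cos(freq * lat),
            math.sin(freq * lon), math.cos(freq * lon),
        ])
    return vec

def max_abs_difference(a, b):
    """Largest element-wise gap between two equal-length vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

# As raw numbers, longitudes 180 and -180 look 360 units apart,
# yet they describe the same meridian.
east = encode_location(0.0, 180.0)
west = encode_location(0.0, -180.0)
print(len(east), max_abs_difference(east, west) < 1e-9)
```

Even this simple encoding treats the two sides of the antimeridian as the same place, which raw longitude values do not. Learned location embeddings go much further by encoding what a place is like, not only where it sits.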

ArcGIS integrates location embeddings with its AutoML machine learning tool to enhance regression and classification on vector data. Users access this via the Use Location Embeddings option in the Train Using AutoML tool in ArcGIS Pro. Esri is extending this approach to incorporate high-resolution imagery and curated demographic data.

The Road Ahead

As foundation models become central to geospatial science, their integration into ArcGIS marks a major step toward more intelligent, data-driven ways of understanding the world. Users can apply AI-based methods for geospatial analysis via graphical tools in ArcGIS, even if they don’t have specialized programming expertise.
