ArcGIS Blog

Exploring Tools in ArcGIS Data Pipelines

By Chad Lopez and Albert Schelin

ArcGIS Data Pipelines is a data preparation and integration app in ArcGIS Online and now in ArcGIS Enterprise. Save time preparing data for visualization and analytics with a no-code, visual data engineering app. Connect to various data sources, apply commonly used data preparation tools, and write results to your content. You can then automate pipeline runs to keep data up to date and reliable for decision making.

As the title suggests, this blog focuses on the data preparation tools within ArcGIS Data Pipelines, and it has two main goals. The first is to provide an overview of each data processing tool available in the app, with context and descriptions that show what each tool does. The second is to help you decide which tools to use as you build your own data pipeline workflows, saving you time or serving as a source of inspiration.

This blog is current as of the February 2026 update of ArcGIS Online and the ArcGIS Enterprise 12.1 release of ArcGIS Data Pipelines.

To see ArcGIS Data Pipelines in action, refer to the video below.

See ArcGIS Data Pipelines in Action 

In the demo video above, we brought in a CSV file representing customer locations from Amazon S3 and applied a series of tools to prepare the data for use in ArcGIS. We used data preparation tools to make the data spatial (Create geometry), filter records to only those within California (Clip), and remove personally identifiable information (Select fields). To automate data updates and ensure the data in ArcGIS Online remained up to date, we ran the data pipeline and scheduled it to run weekly.

Now that we understand what Data Pipelines is and how it fits into the broader ArcGIS ecosystem, let’s briefly explore the input data sources the app supports before looking at how each tool group helps you transform your data into results that are ready for mapping and analytics.

Bring Your Data into ArcGIS Data Pipelines

Data Pipelines supports a wide variety of input data sources. You can connect to files from your content, from a URL or API, from a file share (in the case of ArcGIS Data Pipelines in ArcGIS Enterprise), or from a cloud storage container such as an Amazon S3 bucket or Microsoft Azure Storage container.

We also support reading tables from Snowflake, Google BigQuery, and Databricks (currently available only in ArcGIS Online), in addition to feature layers from ArcGIS Online or ArcGIS Enterprise. While this blog focuses on tools, you can read more about the supported input data sources in the documentation.

Transform and Prepare Your Data

ArcGIS Data Pipelines includes all the common data preparation tools needed by GIS users to prepare their data for use in visualization and analysis workflows within ArcGIS Online and ArcGIS Enterprise. Together, they allow data to be efficiently transformed into an optimal state for downstream workflows.

Tools are organized into four categorical toolsets: clean, construct, format, and integrate.

Clean

Focuses on improving data quality and removing unnecessary or redundant information.

  • Clip – Extracts records within a boundary.
    Example: Clip building data to a single city’s boundary for a focused analysis.
  • Filter by attribute – Selects records matching an expression.
    Example: Filter parcels where LandUse = 'Residential'.
  • Filter by extent – Keeps records that fall within a defined rectangular extent.
    Example: Limit tree data to the bounding extent of a park.
  • Remove duplicates – Removes duplicate records based on chosen fields.
    Example: Remove wells that have the same Well_ID to ensure each well is represented once.
  • Select fields – Retains only the attributes you need.
    Example: Keep Name, Type, and Height_m fields from a larger dataset.
  • Simplify geometry – Reduces vertex density to make records lighter.
    Example: Simplify watershed polygons for display performance.
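
If you think in code, several of the Clean tools can be pictured as simple operations over a list of records. The Python below is only a rough sketch of those ideas. Data Pipelines is a no-code app, and this is not how it is implemented or exposed. The function names, the sample wells, and the x and y fields are all assumptions made up for illustration.

```python
# Conceptual stand-ins for four Clean tools; not the Data Pipelines API.
# All records and field names below are made up for illustration.

def filter_by_attribute(records, field, value):
    """Keep records whose `field` equals `value`."""
    return [r for r in records if r.get(field) == value]


def filter_by_extent(records, xmin, ymin, xmax, ymax):
    """Keep point records whose x/y fall inside a rectangular extent."""
    return [
        r for r in records
        if xmin <= r["x"] <= xmax and ymin <= r["y"] <= ymax
    ]


def remove_duplicates(records, key_fields):
    """Keep the first record seen for each combination of key field values."""
    seen = set()
    unique = []
    for r in records:
        key = tuple(r.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def select_fields(records, fields):
    """Retain only the named fields on each record."""
    return [{f: r[f] for f in fields if f in r} for r in records]


wells = [
    {"Well_ID": "W1", "Status": "Active", "Owner": "Ana", "x": 2.0, "y": 3.0},
    {"Well_ID": "W1", "Status": "Active", "Owner": "Ana", "x": 2.0, "y": 3.0},
    {"Well_ID": "W2", "Status": "Capped", "Owner": "Ben", "x": 4.0, "y": 1.0},
    {"Well_ID": "W3", "Status": "Active", "Owner": "Cy", "x": 9.0, "y": 9.0},
]

# Chain the steps the way tools are chained on the pipeline canvas.
active = filter_by_attribute(wells, "Status", "Active")
nearby = filter_by_extent(active, xmin=0, ymin=0, xmax=5, ymax=5)
unique = remove_duplicates(nearby, ["Well_ID"])
cleaned = select_fields(unique, ["Well_ID", "Status"])
print(cleaned)
```

Each function returns a new list rather than changing its input, which loosely mirrors how each tool in a pipeline passes its result to the next one.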

Construct

Used to create new information or structure in your datasets.

  • Calculate field – Adds or updates attribute values using Arcade expressions.
    Example: Compute area in square kilometers with Area($record.GEOMETRY, 'square-kilometers').
  • Create date time – Converts one or more fields into a datetime field.
    Example: Merge Year, Month, and Day fields into a single SurveyDate field.
  • Create geometry – Builds points, lines, or polygons from input fields containing coordinate or spatial information.
    Example: Create a point geometry field from a CSV containing Longitude and Latitude coordinate fields.
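
To make the Construct tools more concrete, here is a small Python sketch of what they do to a single record. It is an illustration under stated assumptions, not the app's implementation: the helper names, the sample row, the Height_ft field, and the GeoJSON-style point dictionary are all hypothetical choices made for this example.

```python
from datetime import datetime

# Conceptual stand-ins for the Construct tools; not the Data Pipelines API.
# Field names and the sample row are made up for illustration.

def create_date_time(record, year_field, month_field, day_field, out_field):
    """Combine separate year, month, and day fields into one datetime field."""
    out = dict(record)
    out[out_field] = datetime(
        int(record[year_field]),
        int(record[month_field]),
        int(record[day_field]),
    )
    return out


def create_point_geometry(record, x_field, y_field, out_field="geometry"):
    """Build a GeoJSON-style point (longitude first) from coordinate fields."""
    out = dict(record)
    out[out_field] = {
        "type": "Point",
        "coordinates": [float(record[x_field]), float(record[y_field])],
    }
    return out


def calculate_field(record, out_field, expression):
    """Add or update a field using a function of the record."""
    out = dict(record)
    out[out_field] = expression(record)
    return out


# A row as it might arrive from a CSV, where every value is text.
row = {
    "Year": "2024", "Month": "3", "Day": "15",
    "Longitude": "-117.19", "Latitude": "34.05",
    "Height_ft": "10",
}

survey = create_date_time(row, "Year", "Month", "Day", "SurveyDate")
survey = create_point_geometry(survey, "Longitude", "Latitude")
survey = calculate_field(
    survey, "Height_m", lambda r: float(r["Height_ft"]) * 0.3048
)
print(survey["SurveyDate"], survey["geometry"], survey["Height_m"])
```

In the app itself, the Calculate field step would be written as an Arcade expression rather than a Python function; the lambda here only plays that role for the sketch.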

Format

Helps standardize structure, schema, and projection for downstream use.

  • Map fields – Maps columns to match the schema of another dataset.
    Example: Map source fields such as “Category,” “Type,” or “Event_Code” to a standardized “Incident_Type” field in the target.
  • Pivot – Converts row values into columns for wide-format data.
    Example: Transform monthly sales records into a single row per store with columns for each month.
  • Project geometry – Projects data into a new coordinate system.
    Example: Project data from WGS84 to a NAD83 UTM zone for accurate distance measurements in downstream analysis.
  • Unnest field – Extracts values from lists or objects into individual columns.
    Example: Split a “tags” array into separate attribute fields.
  • Update fields – Converts field types and modifies field names.
    Example: Rename fields so that capitalization is consistent across datasets, or convert a text field that contains numbers to an integer type.
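
Pivot and Unnest field are often easier to understand by looking at data before and after. The Python below is a minimal sketch of both reshaping ideas, assuming in-memory records. The function names, the choice to sum values that land in the same cell, and the tags_0, tags_1 naming pattern are assumptions for illustration and may differ from what the tools actually produce.

```python
# Conceptual stand-ins for Pivot and Unnest field; not the Data Pipelines API.
# Sample data, the sum aggregation, and output field names are assumptions.

def pivot(records, key_field, column_field, value_field):
    """Turn values of `column_field` into columns, one row per `key_field`.

    Values that fall into the same row and column are summed.
    """
    rows = {}
    for r in records:
        row = rows.setdefault(r[key_field], {key_field: r[key_field]})
        column = r[column_field]
        row[column] = row.get(column, 0) + r[value_field]
    return list(rows.values())


def unnest_array(record, field):
    """Spread a list-valued field into numbered fields (field_0, field_1, ...)."""
    out = {k: v for k, v in record.items() if k != field}
    for i, value in enumerate(record.get(field, [])):
        out[f"{field}_{i}"] = value
    return out


# Long format: one record per store per month.
sales = [
    {"Store": "A", "Month": "Jan", "Sales": 100},
    {"Store": "A", "Month": "Feb", "Sales": 120},
    {"Store": "B", "Month": "Jan", "Sales": 80},
    {"Store": "A", "Month": "Jan", "Sales": 5},
]

# Wide format: one record per store, one column per month.
wide = pivot(sales, "Store", "Month", "Sales")
print(wide)

place = {"Name": "Cafe", "tags": ["wifi", "patio"]}
flat = unnest_array(place, "tags")
print(flat)
```

Note that store B has no Feb column in this sketch because it had no February record; a real workflow would need to decide how to treat such gaps.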

Integrate

Designed for combining, summarizing, and consolidating datasets.

  • Dissolve – Merges polygons or polylines that overlap or share common attributes into a single polygon or polyline.
    Example: Merge parcel polygons that share a common ZoningType.
  • Join – Combines disparate datasets into a single output based on matching attributes, spatial relationships, temporal relationships, or a combination of the three.
    Example: Join demographic tables to census tract polygons using a shared GEOID field.
  • Merge – Appends multiple datasets with similar schemas.
    Example: Merge four quarterly inspection datasets into one annual dataset.
  • Summarize attributes – Groups and calculates statistics.
    Example: Summarize tree counts by species to identify the most common types citywide.
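
The attribute-based Integrate tools map closely to familiar table operations. The Python sketch below illustrates an attribute join, a merge, and a count summary on small in-memory lists. It is a simplified picture, not the app's behavior: it assumes an inner join with at most one match per key, ignores spatial and temporal relationships, and uses made-up function names and sample values.

```python
from collections import Counter

# Conceptual stand-ins for Join, Merge, and Summarize attributes;
# not the Data Pipelines API. All sample values are made up.

def join_by_attribute(left, right, key):
    """Inner join: keep left records with a match in right, combining fields.

    Assumes each key value appears at most once in `right`.
    """
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]


def merge(*datasets):
    """Append datasets that share a schema into one list of records."""
    return [dict(r) for dataset in datasets for r in dataset]


def summarize_count(records, group_field):
    """Count records for each distinct value of `group_field`."""
    return dict(Counter(r[group_field] for r in records))


tracts = [
    {"GEOID": "T001", "Name": "Tract A"},
    {"GEOID": "T002", "Name": "Tract B"},
]
demographics = [{"GEOID": "T001", "Population": 1200}]

joined = join_by_attribute(tracts, demographics, "GEOID")
print(joined)

q1_trees = [{"Species": "Oak"}, {"Species": "Pine"}]
q2_trees = [{"Species": "Oak"}]

all_trees = merge(q1_trees, q2_trees)
counts = summarize_count(all_trees, "Species")
print(counts)
```

Tract B drops out of the joined result because it has no matching demographic record, which is the kind of outcome worth checking whenever you combine datasets by a shared key.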

To learn more about each of the data processing tools, refer to the documentation.

Summary

ArcGIS Data Pipelines is an easy-to-use data integration app with powerful tools for data preparation. Each of the tools in Data Pipelines helps you transform your data and ready it for downstream use in ArcGIS.

This blog defined each tool, paired it with a common use case, and highlighted some of the tools in action through a sample workflow. If you are new to ArcGIS Data Pipelines, use this blog as a resource to understand how you might design your first data pipeline workflow. If you are an existing user, use this blog to gain inspiration and insight into tools you may not have used before.

Additional Resources

Whether you’re just getting started with ArcGIS Data Pipelines or looking to sharpen your skills, we’ve got you covered. Here are some helpful resources to explore:

  • Explore what’s new: Check out the What’s New documentation for ArcGIS Online and ArcGIS Enterprise to see the latest updates and enhancements.
  • Dive into other ArcGIS Data Pipelines blog posts that highlight use cases, tips, and best practices.
  • Try it yourself: Follow this step-by-step tutorial to build a robust data integration workflow that shows off a variety of tools.
  • We’re always listening! If you have ideas, questions, or feedback, drop by the Data Pipelines Community and let us know what you’d like to see next.
