Data Pipelines (beta) is a new data engineering and integration application that was released in ArcGIS Online in June 2023. The Data Pipelines development team has been listening closely to your feedback and has implemented new features based on your requests. New features include scheduled data pipeline workflows and an easier way to configure the schema for delimited datasets. Keep reading to learn more!
Scheduled data pipelines
Data is constantly changing and growing. Working with the latest data can be critical in providing meaningful maps and dashboards, conducting accurate analysis, and making informed decisions. We’re excited to announce the ability to schedule data pipeline tasks so you can automatically keep your data up to date.
You can schedule data pipelines to run at specific intervals, such as every hour, at a certain time of day, or monthly. For example, let’s say your data pipeline ingests data from a public URL that updates every night at 6 pm. You can configure a task to run every evening, after that update, to refresh your feature layer with the newest data. You can use the Manage scheduling page to create, edit, and view tasks. For each task run, you can view the results and the run duration, inspect any messages returned from the run, and more.
Watch the video below for a quick tour of scheduling.
To learn more about scheduling, see the help topic Schedule a data pipeline task.
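To make the daily example above concrete, here is a minimal Python sketch of the timing logic only. It is not how Data Pipelines schedules tasks (that is configured in the app), and the function name `next_daily_run`, the 7 pm run time, and the sample dates are all assumptions made for illustration. The idea is simply to pick a run time safely after the source's 6 pm update and find its next occurrence.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, run_at: time) -> datetime:
    """Return the next occurrence of run_at that is strictly after now."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# Assumed values: the source refreshes at 18:00, so the task runs at 19:00
# to leave a buffer after the update.
task_time = time(19, 0)

# Hypothetical "current" moments, one before and one after today's slot.
morning = datetime(2023, 10, 25, 8, 0)
late_evening = datetime(2023, 10, 25, 20, 30)

print(next_daily_run(morning, task_time))
print(next_daily_run(late_evening, task_time))
```

Scheduling the run an hour after the source updates, rather than at exactly 6 pm, is a design choice in this sketch that avoids reading the data while it is still being refreshed.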
Another new feature is the ability to configure the schema for datasets in CSV and other delimited formats. This gives you more control over your data and how you want to represent fields throughout your workflow and in the output feature layer or table. With the new experience for configuring schema, you can modify field names and field types, and choose which fields to include or exclude. You can return to the schema configuration dialog and modify these properties at any time.
For now, schema configuration is only available for delimited file formats, such as CSV. Unlike other formats supported in Data Pipelines, delimited files do not have field types defined in the source dataset; schema configuration enables you to define the types in the Data Pipelines app. For all other dataset formats, you can continue to use the Update fields tool to modify field names and types, and the Select fields tool to exclude fields from your dataset.
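To see why delimited files need this step, consider the following Python sketch. It is not the Data Pipelines API; the sample data, the `schema` mapping, and the `apply_schema` helper are hypothetical and exist only to illustrate the idea. Every value read from a CSV arrives as text, so a schema has to say what each field is called, what type it should be, and whether it is kept at all.

```python
import csv
import io
from datetime import date

# Hypothetical sample data. A delimited file carries no type information,
# so every value below is read as a string.
raw = (
    "station_id,reading,obs_date,notes\n"
    "101,3.5,2023-10-01,ok\n"
    "102,4.25,2023-10-02,check\n"
)

# Hypothetical schema: source field -> (output name, type converter).
# Fields that are not listed (here, "notes") are excluded from the output.
schema = {
    "station_id": ("StationID", int),
    "reading": ("Reading", float),
    "obs_date": ("ObservationDate", date.fromisoformat),
}


def apply_schema(text, schema):
    """Rename, convert, and select fields for each record in the text."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append(
            {name: convert(record[src]) for src, (name, convert) in schema.items()}
        )
    return rows


rows = apply_schema(raw, schema)
for row in rows:
    print(row)
```

The same three decisions appear in the schema configuration dialog: the field name, the field type, and whether the field is included.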
In addition to scheduling and schema configuration, there are other enhancements, including:
- Support for writing big integer and date-only fields
- A new option to calculate summary statistics in the Join tool
- Improved error reporting
- And more!
To learn more about the new features and enhancements in Data Pipelines, see the What’s New topic in the documentation.
We want to hear from you! There is much more to come in future updates of Data Pipelines, and we value your feedback on what we should do next. Share your ideas or ask us a question in the Data Pipelines community.