
End-to-end spatial data science 1: Clustering US Precipitation Regions


This blog series is based on the research paper “Delineating precipitation regions of the contiguous United States from cluster analyzed gridded data”, published in the Annals of the American Association of Geographers (Marston and Ellis, 2021). The goal of the paper is to create a map of climate regions in the United States based on 30 years of historical precipitation data. My goal is to use a combination of Esri technology and open-source technology to reproduce the workflow and replicate the results. Throughout the following blogs, we’ll perform the entire analysis using ArcGIS Pro and ArcGIS Notebooks together with open-source Python and R.

The problem

Since the time of the Ancient Greeks, humans have created and used climate region maps to help understand climate geography and its impact on agriculture, navigation, weather and natural hazards, where we live, and where we travel. Within the past 150 years, several climate region classifications have been proposed, each differing in input data, methodology, and purpose. For example, the well-known Köppen classification divides Earth into climate regions based on precipitation and temperature, while the Thornthwaite classification outlines regions based on a location’s precipitation and evapotranspiration characteristics. More recently, the National Centers for Environmental Information (NCEI) produced a 9-region climate map of the US based on county-level climate characteristics, which were spatially aggregated first to the US state level, and then to the region level.

The many factors that influence climate (temperature, precipitation, elevation, latitude, proximity to oceans) do not adhere to geopolitical boundaries, however, so the NCEI map may not represent the true climate geography of the United States. For example, the Northeast, Southeast, and Ohio Valley are relatively homogeneous in mean annual precipitation, while there is much more intra-region heterogeneity in the Northwest and West. The range of average annual precipitation across the Northwest is several hundred millimeters!

Map showing average annual precipitation (mm) from 1981-2010 in the contiguous United States. The 9-region NCEI map is overlaid for reference.

In the paper, the authors use a novel, data-driven approach to create a modern climate region map based on seasonal precipitation trends in the US for the 30-year period of 1981-2010. For each location in a gridded dataset of daily precipitation, they calculate a total of 16 precipitation variables (4 variables × 4 seasons), then apply a series of machine learning techniques for dimensionality reduction and cluster analysis. The final clusters represent climate regions with similar seasonal precipitation signatures over the 30-year period.
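To make the “4 variables x 4 seasons” idea concrete, here is a minimal Python sketch of how 16 seasonal values could be derived for a single grid cell. It is only an illustration, not the authors’ method: this post does not list the paper’s four variables, so the ones used here (seasonal total, wet-day count, mean wet-day amount, and maximum daily amount), the 1 mm wet-day threshold, the month-to-season mapping, and the function name `seasonal_features` are all my own assumptions, and the sample values are made up.

```python
from collections import defaultdict
from datetime import date

# Meteorological seasons by month (an assumption; the paper may define seasons differently).
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}
SEASONS = ("winter", "spring", "summer", "fall")
WET_DAY_MM = 1.0  # hypothetical threshold for counting a day as "wet"


def seasonal_features(daily):
    """Return 16 values (4 illustrative variables x 4 seasons) for one grid cell.

    `daily` is an iterable of (date, precipitation_mm) pairs.
    """
    # Group the daily values by season.
    by_season = defaultdict(list)
    for day, mm in daily:
        by_season[SEASON_OF_MONTH[day.month]].append(mm)

    features = {}
    for season in SEASONS:
        values = by_season[season]
        wet = [v for v in values if v >= WET_DAY_MM]
        features[f"{season}_total_mm"] = sum(values)
        features[f"{season}_wet_days"] = len(wet)
        features[f"{season}_mean_wet_day_mm"] = sum(wet) / len(wet) if wet else 0.0
        features[f"{season}_max_daily_mm"] = max(values, default=0.0)
    return features


# Tiny made-up record for a single grid cell (not real data).
sample = [
    (date(2000, 1, 10), 5.0),
    (date(2000, 1, 11), 0.0),
    (date(2000, 4, 2), 12.5),
    (date(2000, 7, 15), 0.5),
    (date(2000, 7, 16), 30.0),
    (date(2000, 10, 3), 2.0),
]
cell_features = seasonal_features(sample)
```

Running the same function over every grid cell would give a table with one row per location and 16 columns, which is the kind of input the dimensionality reduction and clustering steps work from.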

The final map of 13 precipitation regions resulting from the cluster analysis.
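For readers new to cluster analysis, the sketch here shows the general idea of grouping locations by their precipitation variables. It is a simplified stand-in under stated assumptions: this post does not say which clustering algorithm the paper uses, so plain k-means is shown only because it is familiar, the dimensionality reduction step is skipped entirely, and the helper names (`standardize`, `kmeans`) and the six two-variable “locations” are hypothetical.

```python
import math
import random


def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(rows, k, iterations=50, seed=0):
    """Basic k-means (Lloyd's algorithm); returns one cluster label per row."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)  # start from k randomly chosen rows
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centers[j])))
            for row in rows
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Six made-up locations described by two variables
# (seasonal total in mm, wet-day count); not real data.
locations = [
    [100.0, 10.0], [110.0, 12.0], [105.0, 11.0],   # drier locations
    [900.0, 80.0], [950.0, 85.0], [920.0, 82.0],   # wetter locations
]
labels = kmeans(standardize(locations), k=2)
```

Standardizing first keeps a variable measured in hundreds of millimeters from dominating one measured in tens of days. In the real analysis each location would carry all 16 variables, and the number of clusters would be chosen as part of the workflow rather than fixed in advance as it is here.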

What’s next

In this series of blog articles, I’ll walk you step by step through my process of reproducing the analysis in this paper, including data ingestion, preparation, engineering, and machine learning. Hopefully, by the end you’ll have learned not only a bit about climate geography in the US but, more importantly, how to leverage open-source Python and R with ArcGIS Pro and ArcGIS Notebooks to complete an end-to-end spatial data science project.

Spatial data science workflow performed in the blog series.

References (for the blog series)

Marston, M.L., Ellis, A.W., 2021. Delineating precipitation regions of the contiguous United States from cluster analyzed gridded data. Annals of the American Association of Geographers. 111(6), 1721-1739.

Hair Jr., J.F., Anderson, R.E., Tatham, R.L., Black, W.C., 1998. Multivariate Data Analysis (5th Edition). Prentice-Hall, Inc., Upper Saddle River, NJ.

O’Sullivan, D., Unwin, D.J., 2003. Geographic Information Analysis. John Wiley & Sons, Hoboken, NJ.

About the author

Nick Giner is a Product Manager for Spatial Analysis and Data Science. Prior to joining Esri in 2014, he completed Bachelor’s and PhD degrees in Geography from Penn State University and Clark University, respectively. In his spare time, he likes to play guitar, golf, cook, cut the grass, and read/watch shows about history.
