{"id":2190472,"date":"2023-12-14T10:00:15","date_gmt":"2023-12-14T18:00:15","guid":{"rendered":"https:\/\/www.esri.com\/arcgis-blog\/?post_type=blog&#038;p=2190472"},"modified":"2024-10-25T09:54:09","modified_gmt":"2024-10-25T16:54:09","slug":"end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","status":"publish","type":"blog","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","title":{"rendered":"End-to-end spatial data science 2: Data preparation and data engineering using R"},"author":154341,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","format":"standard","meta":{"_acf_changed":false,"_searchwp_excluded":""},"categories":[23341],"tags":[760452,35661,24341,30241,759592],"industry":[],"product":[36841,36561],"class_list":["post-2190472","blog","type-blog","status-publish","format-standard","hentry","category-analytics","tag-data-engineering","tag-machine-learning","tag-python","tag-r","tag-spatial-data-science","product-api-python","product-arcgis-pro"],"acf":{"authors":[{"ID":154341,"user_firstname":"Nicholas","user_lastname":"Giner","nickname":"Nick Giner","user_nicename":"nginer","display_name":"Nicholas Giner","user_email":"NGiner@esri.com","user_url":"","user_registered":"2021-01-07 14:31:25","user_description":"Nick Giner is a Product Manager for Spatial Analysis and Data Science.  Prior to joining Esri in 2014, he completed Bachelor\u2019s and PhD degrees in Geography from Penn State University and Clark University, respectively. 
In his spare time, he likes to play guitar, golf, cook, cut the grass, and read\/watch shows about history.","user_avatar":"<img data-del=\"avatar\" src='https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2021\/01\/headshot-e1610030307989-213x200.jpeg' class='avatar pp-user-avatar avatar-96 photo ' height='96' width='96'\/>"}],"short_description":"This is the second in a series of blogs that showcase an end-to-end spatial data science workflow for clustering US precipitation regions.","flexible_content":[{"acf_fc_layout":"content","content":"<h2>Introduction<\/h2>\n<p>In this second blog article, we\u2019ll walk through the steps for downloading a 30-year time series of daily precipitation rasters, then processing this raster data into a tabular dataset where each row represents a location in a 4km by 4km grid of the US, along with four different precipitation variables relating to amount, frequency, and variability.\u00a0 These variables are calculated for each season in the 30-year period, totaling 120 seasons.\u00a0 All the following steps are done in R.<\/p>\n<h2>The precipitation data<\/h2>\n<p>The raw data used in this analysis comes from the Parameter-elevation Regressions on Independent Slopes Model (PRISM) dataset, provided by the <a href=\"https:\/\/prism.oregonstate.edu\/\">PRISM Climate Group<\/a> at Oregon State University.\u00a0 The dataset contains a 30-year (1981-2010) time series of daily precipitation (mm) gridded raster data at 4km by 4km spatial resolution.\u00a0 Doing some quick math, this is 30 years x 365 days = ~11,000 daily rasters x ~ 481,631 pixels per raster = ~5.3 billion pixels to be 
processed!<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2193112,"id":2193112,"title":"precip_dailys","filename":"precip_dailys-2.jpg","filesize":370547,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/precip_dailys-3","alt":"","author":"154341","description":"","caption":"Example maps of daily precipitation rasters for six consecutive days in July 1986.","name":"precip_dailys-3","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 14:01:20","modified":"2023-12-12 14:01:26","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1916,"height":975,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","medium-width":464,"medium-height":236,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","medium_large-width":768,"medium_large-height":391,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","large-width":1916,"large-height":975,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2-1536x782.jpg","1536x1536-width":1536,"1536x1536-height":782,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","2048x2048-width":1916,"2048x2048-height":975,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2-826x420.jpg","card_image-width":826,"card_image-height":420,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/precip_dailys-2.jpg","wide_image-width":1916,
"wide_image-height":975}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Because we are using seasonal variation in precipitation over time as our proxy for delineating US climate regions, we need to determine 1) what are our metrics for \u201cseasonal variation\u201d and 2) how do we calculate these metrics from a 30-year time series of ~11,000 daily precipitation rasters?<\/p>\n<p>Seasonal variation in precipitation is represented by 4 different precipitation measures.<\/p>\n<ul>\n<li><strong>Total precipitation<\/strong> \u2013 total millimeters (mm) of precipitation within a season<\/li>\n<li><strong>Frequency of precipitation<\/strong> &#8211; total number of precipitation days within a season<\/li>\n<li><strong>Gini Coefficient<\/strong> \u2013 quantifies the temporal distribution of precipitation within a season by measuring inequality in the amount of precipitation received during precipitation events\n<ul style=\"padding-left: 40px\">\n<li><em>Close to 0<\/em> = less inequality in precipitation across a season, with each precipitation event having similar amounts (more equality), much like a uniform distribution<\/li>\n<li><em>Close to 1<\/em> = more inequality in precipitation across a season, where there is large variation in the amount of precipitation received from event to event<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n"},{"acf_fc_layout":"image","image":{"ID":2510802,"id":2510802,"title":"gini_graphic","filename":"gini_graphic.jpg","filesize":45730,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/gini_graphic","alt":"","author":"154341","description":"","caption":"A Gini coefficient of close to 0 (left) indicates that most precipitation events in a season are nearly equal in amount.  
A Gini coefficient of close to 1 (right) indicates large variability in amount of precipitation from event to event.","name":"gini_graphic","status":"inherit","uploaded_to":2190472,"date":"2024-09-30 21:20:13","modified":"2024-09-30 21:23:44","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1065,"height":327,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","medium-width":464,"medium-height":142,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","medium_large-width":768,"medium_large-height":236,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","large-width":1065,"large-height":327,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","1536x1536-width":1065,"1536x1536-height":327,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","2048x2048-width":1065,"2048x2048-height":327,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic-826x254.jpg","card_image-width":826,"card_image-height":254,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_graphic.jpg","wide_image-width":1065,"wide_image-height":327}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<ul>\n<li><strong>Lorenz Asymmetry Coefficient<\/strong> \u2013 quantifies the kind of precipitation events (heavy or light) within a season by measuring the degree of inequality in the temporal distribution of precipitation\n<ul style=\"padding-left: 40px\">\n<li><em>Less than 1<\/em> = inequality in the precipitation across a season caused 
by many, small precipitation events<\/li>\n<li><em>Greater than 1<\/em> = inequality in the precipitation across a season caused by a few large precipitation events<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n"},{"acf_fc_layout":"image","image":{"ID":2510812,"id":2510812,"title":"LAC_graphic","filename":"LAC_graphic.jpg","filesize":50641,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/lac_graphic","alt":"","author":"154341","description":"","caption":"A Lorenz coefficient of less than 1 (left) indicates that the inequality in precipitation across a season is due to many, small precipitation events.  A Lorenz coefficient of greater than 1 (right) suggests that this inequality is caused by just a few large precipitation events.","name":"lac_graphic","status":"inherit","uploaded_to":2190472,"date":"2024-09-30 21:25:02","modified":"2024-10-25 
16:29:25","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1156,"height":313,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","medium-width":464,"medium-height":126,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","medium_large-width":768,"medium_large-height":208,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","large-width":1156,"large-height":313,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","1536x1536-width":1156,"1536x1536-height":313,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","2048x2048-width":1156,"2048x2048-height":313,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic-826x224.jpg","card_image-width":826,"card_image-height":224,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/LAC_graphic.jpg","wide_image-width":1156,"wide_image-height":313}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>For each location, we need to average each of these four precipitation measures for each season, for every year from 1981-2010.\u00a0 This will result in a time series of 120 seasonal averages (4 seasons per year x 30 years) of each variable, which we will then average over the entire time period.\u00a0 Thus, our final dataset will contain the 30-year average of the four precipitation variables for each season (16 total precipitation variables).<\/p>\n<p>That sounds <em>really<\/em> complicated.\u00a0 Let\u2019s walk through the process step-by-step and see how we did 
it!<\/p>\n<h2>R packages<\/h2>\n<p>In pretty much any R script, it\u2019s very likely that the first few lines of code are for loading the R packages that you need for your task.\u00a0 In this case, I use the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/base\/versions\/3.6.2\/topics\/library\"><em>library <\/em><\/a><\/strong>function to load the handful of packages I need.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194332,"id":2194332,"title":"R_libs (1)","filename":"R_libs-1-1.jpg","filesize":33433,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/r_libs-1","alt":"","author":"154341","description":"","caption":"","name":"r_libs-1","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:00:07","modified":"2023-12-12 17:00:07","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":195,"height":188,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","thumbnail-width":195,"thumbnail-height":188,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","medium-width":195,"medium-height":188,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","medium_large-width":195,"medium_large-height":188,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","large-width":195,"large-height":188,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","1536x1536-width":195,"1536x1536-height":188,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","2048x2048-width":195,"2048x2048-height":188,"card_image":"https:\/\/www.esri.com\/arcgis-blog
\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","card_image-width":195,"card_image-height":188,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/R_libs-1-1.jpg","wide_image-width":195,"wide_image-height":188}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<ul>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/prism\/index.html\">prism<\/a>} \u2013 API for downloading PRISM climate data from Oregon State University<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/raster\/index.html\">raster<\/a>} \u2013 working with raster data<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/reshape2\/index.html\">reshape2<\/a>} \u2013 reshaping, restructuring, and aggregating tabular data<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/ineq\/\">ineq<\/a>} \u2013 measuring inequality in data distributions<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/reldist\/index.html\">reldist<\/a>} \u2013 methods for comparing data distributions<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/lubridate\/index.html\">lubridate<\/a>} &#8211; working with dates<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/dplyr\/index.html\">dplyr<\/a>} \u2013 working with and manipulating data frames<\/li>\n<li>{<a href=\"https:\/\/cran.r-project.org\/web\/packages\/tidyr\/index.html\">tidyr<\/a>} \u2013 creating \u201ctidy\u201d data (cleaning, wrangling, manipulating data frames)<\/li>\n<\/ul>\n<h2>Accessing the data<\/h2>\n<p>Fortunately, the {prism} package can be used to programmatically access PRISM data.\u00a0 Within this package, there are a series of functions that allow you to choose which variable and temporal scale you want to download.\u00a0 In my case, I was interested in daily precipitation data, so I used the <strong><a 
href=\"https:\/\/rdrr.io\/cran\/prism\/src\/R\/get_prism_dailys.R\"><em>get_prism_dailys<\/em><\/a><\/strong> function.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2510852,"id":2510852,"title":"prism_download_R","filename":"prism_download_R.jpg","filesize":265160,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/prism_download_r","alt":"","author":"154341","description":"","caption":"The \"type\" argument above lets you choose which climate variable to download, for example precipitation (ppt), mean temperature (tmean), minimum temperature (tmin), or maximum temperature (tmax).","name":"prism_download_r","status":"inherit","uploaded_to":2190472,"date":"2024-09-30 21:31:50","modified":"2024-10-01 14:16:04","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":2460,"height":1056,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R.jpg","medium-width":464,"medium-height":199,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R.jpg","medium_large-width":768,"medium_large-height":330,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R.jpg","large-width":1920,"large-height":824,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R-1536x659.jpg","1536x1536-width":1536,"1536x1536-height":659,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R-2048x879.jpg","2048x2048-width":2048,"2048x2048-height":879,"card_image":"https:\/\/www.esri
.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R-826x355.jpg","card_image-width":826,"card_image-height":355,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/prism_download_R-1920x824.jpg","wide_image-width":1920,"wide_image-height":824}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Within this function, you can specify the start and end date of the download period.\u00a0 I knew that downloading and processing ~11,000 rasters at once would not be manageable on my machine (or within R, period), so I would have to download and process the data in batches.\u00a0 After a bit of testing and experimentation, it appeared that roughly 200 or fewer rasters per batch would be a workable size.\u00a0 This, combined with the fact that each of the precipitation variables is calculated <em>for every season<\/em> in the 30-year time series, led me to the following pattern for determining the start and end date of each batch:<\/p>\n<ul>\n<li>Winter: December 1 to February 28* = 90 or 91 daily rasters<\/li>\n<li>Spring: March 1 to May 31 = 92 daily rasters<\/li>\n<li>Summer and Fall**: June 1 to November 30 = 183 daily rasters<\/li>\n<\/ul>\n"},{"acf_fc_layout":"sidebar","content":"<p><strong>Note:\u00a0<\/strong><\/p>\n<p>* The winter season ranges from December 1 of year <em>y<\/em> to February 28 of year <em>y<\/em>+1, except for the seven leap years that occurred between 1981-2010 (1984, 1988, 1992, 1996, 2000, 2004, 2008).\u00a0 In these years, the last date in the winter season was February 29.<\/p>\n<p>** The summer and fall seasons were grouped together because the combined batch met the criterion of fewer than 200 rasters while also preserving the full summer and fall seasons within one calendar 
year.<\/p>\n","image_reference":false,"layout":"standard","image_reference_figure":"","snippet":"","spotlight_name":"","section_title":"","position":"Center","spotlight_image":false},{"acf_fc_layout":"content","content":"<p>For the purposes of this study, the 30-year time series started on December 1, 1981 (winter 1981) and ended on February 28, 2011 (winter 2010).\u00a0 That means the following processing steps would need to be performed on 90 individual batches (30 winters + 30 springs + 30 summer\/fall combinations).<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2193172,"id":2193172,"title":"daily_rasters","filename":"daily_rasters.jpg","filesize":492181,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/daily_rasters","alt":"","author":"154341","description":"","caption":"Example of the downloaded daily rasters for one batch from summer and fall of 2010.","name":"daily_rasters","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 14:27:59","modified":"2023-12-12 
14:28:08","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1344,"height":898,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","medium-width":391,"medium-height":261,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","medium_large-width":768,"medium_large-height":513,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","large-width":1344,"large-height":898,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","1536x1536-width":1344,"1536x1536-height":898,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","2048x2048-width":1344,"2048x2048-height":898,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters-696x465.jpg","card_image-width":696,"card_image-height":465,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/daily_rasters.jpg","wide_image-width":1344,"wide_image-height":898}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<h2>Data engineering: Cleaning and wrangling<\/h2>\n<p>Because our end goal is to have a dataset with 16 precipitation variables at each location, we need to get our data in tabular format.\u00a0 We first use the <strong><em><a href=\"https:\/\/docs.ropensci.org\/prism\/reference\/prism_archive_ls.html\">prism_archive_ls<\/a><\/em><\/strong> function to list (a.k.a. 
create a vector) of the PRISM rasters that we downloaded in the previous step.\u00a0 We then use the <em><strong><a href=\"https:\/\/docs.ropensci.org\/prism\/reference\/pd_stack.html\">pd_stack<\/a><\/strong><\/em> function to combine the rasters into one RasterStack object.\u00a0 Last, we convert the RasterStack to a data frame via the <strong><em><a href=\"https:\/\/www.rdocumentation.org\/packages\/raster\/versions\/3.6-26\/topics\/rasterToPoints\">rasterToPoints<\/a><\/em><\/strong> function.\u00a0 The result is a data frame of 481,631 locations by 185 columns (for one summer\/fall batch), with two columns representing the x\/y coordinates of each location, and the remaining columns representing the total precipitation (mm) for each day in the time series.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2511602,"id":2511602,"title":"pd_stack","filename":"pd_stack.jpg","filesize":37702,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/pd_stack","alt":"","author":"154341","description":"","caption":"","name":"pd_stack","status":"inherit","uploaded_to":2190472,"date":"2024-10-01 14:20:23","modified":"2024-10-01 
14:20:23","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":773,"height":255,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","medium-width":464,"medium-height":153,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","medium_large-width":768,"medium_large-height":253,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","large-width":773,"large-height":255,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","1536x1536-width":773,"1536x1536-height":255,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","2048x2048-width":773,"2048x2048-height":255,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","card_image-width":773,"card_image-height":255,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/pd_stack.jpg","wide_image-width":773,"wide_image-height":255}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2193222,"id":2193222,"title":"raster_to_point_table","filename":"raster_to_point_table-1.jpg","filesize":160392,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/raster_to_point_table-2","alt":"","author":"154341","description":"","caption":"Data frame resulting from the raster to point conversion. 
Each row represents a location, with each column representing total precipitation (mm) at that location from each daily raster.","name":"raster_to_point_table-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 14:35:23","modified":"2023-12-12 14:35:41","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1084,"height":495,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","medium-width":464,"medium-height":212,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","medium_large-width":768,"medium_large-height":351,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","large-width":1084,"large-height":495,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","1536x1536-width":1084,"1536x1536-height":495,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","2048x2048-width":1084,"2048x2048-height":495,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1-826x377.jpg","card_image-width":826,"card_image-height":377,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/raster_to_point_table-1.jpg","wide_image-width":1084,"wide_image-height":495}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Next, we\u2019ll use the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/reshape2\/versions\/1.4.4\/topics\/melt\"><em>melt<\/em><\/a><\/strong> function to flip the table from wide (481,631 rows x 185 columns) to 
long (88.1 million rows x 4 columns), so now the rows contain the entire time series of precipitation at each location.\u00a0 Note that the \u201cvariable\u201d column contains the file name of each input raster.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194382,"id":2194382,"title":"melt_1","filename":"melt_1-1.jpg","filesize":25580,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/melt_1-2","alt":"","author":"154341","description":"","caption":"","name":"melt_1-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:02:39","modified":"2023-12-12 17:02:39","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":479,"height":65,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1-213x65.jpg","thumbnail-width":213,"thumbnail-height":65,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","medium-width":464,"medium-height":63,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","medium_large-width":479,"medium_large-height":65,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","large-width":479,"large-height":65,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","1536x1536-width":479,"1536x1536-height":65,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","2048x2048-width":479,"2048x2048-height":65,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","card_image-width":479,"card_image-height":65,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1-1.jpg","wide_image-width":479,"wi
de_image-height":65}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2193292,"id":2193292,"title":"melt_1_table","filename":"melt_1_table.jpg","filesize":176136,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/melt_1_table","alt":"","author":"154341","description":"","caption":"","name":"melt_1_table","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 14:42:16","modified":"2023-12-14 12:39:17","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":554,"height":497,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","medium-width":291,"medium-height":261,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","medium_large-width":554,"medium_large-height":497,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","large-width":554,"large-height":497,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","1536x1536-width":554,"1536x1536-height":497,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","2048x2048-width":554,"2048x2048-height":497,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table-518x465.jpg","card_image-width":518,"card_image-height":465,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/melt_1_table.jpg","wide_image-width":554,"wide_image-height":497}},"image_position":"center","orientati
on":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Eventually, we\u2019ll be calculating the seasonal averages of several precipitation variables, so we need a way to assign seasons to each data point.\u00a0 To do this, we\u2019ll use the <em><strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/base\/versions\/3.6.2\/topics\/substr\">substr<\/a><\/strong><\/em>\u00a0function to extract the date information based on index positions in the \u201cvariable\u201d column and insert it into a new column \u201cstr_date\u201d.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194402,"id":2194402,"title":"substring_split","filename":"substring_split-5.jpg","filesize":24723,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/substring_split-6","alt":"","author":"154341","description":"","caption":"","name":"substring_split-6","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:03:27","modified":"2023-12-12 
17:03:27","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":846,"height":79,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5-213x79.jpg","thumbnail-width":213,"thumbnail-height":79,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","medium-width":464,"medium-height":43,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","medium_large-width":768,"medium_large-height":72,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","large-width":846,"large-height":79,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","1536x1536-width":846,"1536x1536-height":79,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","2048x2048-width":846,"2048x2048-height":79,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5-826x77.jpg","card_image-width":826,"card_image-height":77,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split-5.jpg","wide_image-width":846,"wide_image-height":79}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194452,"id":2194452,"title":"substring_split_table","filename":"substring_split_table-4.jpg","filesize":144163,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/substring_split_table-5","alt":"","author":"154341","description":"","caption":"","name":"substring_split_table-5","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 
17:05:33","modified":"2023-12-12 17:05:33","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":872,"height":193,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4-213x193.jpg","thumbnail-width":213,"thumbnail-height":193,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","medium-width":464,"medium-height":103,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","medium_large-width":768,"medium_large-height":170,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","large-width":872,"large-height":193,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","1536x1536-width":872,"1536x1536-height":193,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","2048x2048-width":872,"2048x2048-height":193,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4-826x183.jpg","card_image-width":826,"card_image-height":183,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/substring_split_table-4.jpg","wide_image-width":872,"wide_image-height":193}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>We then use the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/dplyr\/versions\/0.5.0\/topics\/mutate\"><em>mutate<\/em><\/a><\/strong> function four separate times to engineer a series of new variables based on existing variables in the dataset.<\/p>\n<ol>\n<li>Convert the date information to month and year 
columns.<\/li>\n<\/ol>\n"},{"acf_fc_layout":"image","image":{"ID":2194462,"id":2194462,"title":"date_to_month_year","filename":"date_to_month_year-1.jpg","filesize":44207,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/date_to_month_year-2","alt":"","author":"154341","description":"","caption":"","name":"date_to_month_year-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:06:22","modified":"2023-12-12 17:06:22","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":785,"height":248,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","medium-width":464,"medium-height":147,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","medium_large-width":768,"medium_large-height":243,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","large-width":785,"large-height":248,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","1536x1536-width":785,"1536x1536-height":248,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","2048x2048-width":785,"2048x2048-height":248,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","card_image-width":785,"card_image-height":248,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/date_to_month_year-1.jpg","wide_image-width":785,"wide_image-height":248}},"image_
position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<ol start=\"2\">\n<li>Create a new \u201cseason\u201d column, which assigns the months of the year to seasons, following the pattern below:<\/li>\n<\/ol>\n<ul>\n<li style=\"list-style-type: none\">\n<ul>\n<li>Spring \u2013 March, April, May (3<sup>rd<\/sup>, 4<sup>th<\/sup>, 5<sup>th<\/sup> months of the year)<\/li>\n<li>Summer \u2013 June, July, August (6<sup>th<\/sup>, 7<sup>th<\/sup>, 8<sup>th<\/sup> months of the year)<\/li>\n<li>Fall \u2013 September, October, November (9<sup>th<\/sup>, 10<sup>th<\/sup>, 11<sup>th<\/sup> months of the year)<\/li>\n<li>Winter \u2013 December, January, February (12<sup>th<\/sup>, 1<sup>st<\/sup>, 2<sup>nd<\/sup> months of the year)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n"},{"acf_fc_layout":"image","image":{"ID":2194472,"id":2194472,"title":"season_columns","filename":"season_columns-1.jpg","filesize":42306,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/season_columns-2","alt":"","author":"154341","description":"","caption":"","name":"season_columns-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:07:02","modified":"2023-12-12 
17:07:02","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":852,"height":179,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1-213x179.jpg","thumbnail-width":213,"thumbnail-height":179,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","medium-width":464,"medium-height":97,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","medium_large-width":768,"medium_large-height":161,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","large-width":852,"large-height":179,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","1536x1536-width":852,"1536x1536-height":179,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","2048x2048-width":852,"2048x2048-height":179,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1-826x174.jpg","card_image-width":826,"card_image-height":174,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_columns-1.jpg","wide_image-width":852,"wide_image-height":179}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<ol start=\"3\">\n<li>Concatenate the \u201cseason\u201d and \u201cyear\u201d columns, so that we can calculate our seasonal averages for each year.<\/li>\n<li>Create a new boolean column \u201cwet_day\u201d, which indicates whether each location experienced precipitation on a particular day. In this case, a location is considered a \u201cwet day\u201d and assigned a value of 1 if it received greater than or equal to 0.1 mm of precipitation in a day, and 0 if less. 
This indicator column will be used in subsequent steps to calculate the precipitation variables that measure variability across a season.<\/li>\n<\/ol>\n"},{"acf_fc_layout":"image","image":{"ID":2194482,"id":2194482,"title":"mutate_3and4","filename":"mutate_3and4-1.jpg","filesize":50209,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/mutate_3and4-2","alt":"","author":"154341","description":"","caption":"","name":"mutate_3and4-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:07:47","modified":"2023-12-12 17:07:47","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":957,"height":250,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","medium-width":464,"medium-height":121,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","medium_large-width":768,"medium_large-height":201,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","large-width":957,"large-height":250,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","1536x1536-width":957,"1536x1536-height":250,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jpg","2048x2048-width":957,"2048x2048-height":250,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1-826x216.jpg","card_image-width":826,"card_image-height":216,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/mutate_3and4-1.jp
g","wide_image-width":957,"wide_image-height":250}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>As noted in the section above, the winter season ranges from December 1 of year <em>y<\/em> to February 28 of year <em>y<\/em>+1.\u00a0 As such, each time this script runs, we need to ensure that any data point in January or February is reassigned to the previous year.\u00a0 For example, January and February 1982 actually belong to the winter season of 1981.\u00a0 So here, if the \u201cseason\u201d column contains \u201cwinter\u201d, we update it with the previous year.\u00a0 Otherwise, the \u201cseason\u201d column remains the same.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194492,"id":2194492,"title":"reassign_year","filename":"reassign_year-1.jpg","filesize":51257,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/reassign_year-2","alt":"","author":"154341","description":"","caption":"","name":"reassign_year-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:09:10","modified":"2023-12-12 
17:09:10","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1056,"height":127,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1-213x127.jpg","thumbnail-width":213,"thumbnail-height":127,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","medium-width":464,"medium-height":56,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","medium_large-width":768,"medium_large-height":92,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","large-width":1056,"large-height":127,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","1536x1536-width":1056,"1536x1536-height":127,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","2048x2048-width":1056,"2048x2048-height":127,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1-826x99.jpg","card_image-width":826,"card_image-height":99,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/reassign_year-1.jpg","wide_image-width":1056,"wide_image-height":127}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2193472,"id":2193472,"title":"season_calcs","filename":"season_calcs.jpg","filesize":314801,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/season_calcs","alt":"","author":"154341","description":"Data frame containing \"season_year\" and \"wet_day\" columns.  ","caption":"Data frame showing the results of the previous four steps.  
Note the new date columns, as well as those for \"season\", \"season_year\", and \"wet_day\".","name":"season_calcs","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 15:14:17","modified":"2023-12-14 12:39:35","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1169,"height":493,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","medium-width":464,"medium-height":196,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","medium_large-width":768,"medium_large-height":324,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","large-width":1169,"large-height":493,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","1536x1536-width":1169,"1536x1536-height":493,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","2048x2048-width":1169,"2048x2048-height":493,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs-826x348.jpg","card_image-width":826,"card_image-height":348,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/season_calcs.jpg","wide_image-width":1169,"wide_image-height":493}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<h2>Data engineering: Creating the seasonal precipitation variables<\/h2>\n<p>The next step and perhaps the most important one in the workflow is creating the seasonal precipitation variables.\u00a0 For this, we\u2019ll rely on the <strong><a 
href=\"https:\/\/www.rdocumentation.org\/packages\/stats\/versions\/3.6.2\/topics\/aggregate\"><em>aggregate<\/em><\/a><\/strong> function three separate times to create three new data frames.\u00a0 This function is very useful for calculating summary statistics on subsets of data.<\/p>\n<p>First, however, we need to use <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/dplyr\/versions\/0.5.0\/topics\/mutate\"><em>mutate<\/em><\/a><\/strong> one more time to concatenate the x\/y columns into a new \u201ccoordinates\u201d column to ensure that each location has a unique identifier.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194502,"id":2194502,"title":"concat_coords","filename":"concat_coords-1.jpg","filesize":19900,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/concat_coords-2","alt":"","author":"154341","description":"","caption":"","name":"concat_coords-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:09:48","modified":"2023-12-12 
17:09:48","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":605,"height":56,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1-213x56.jpg","thumbnail-width":213,"thumbnail-height":56,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","medium-width":464,"medium-height":43,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","medium_large-width":605,"medium_large-height":56,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","large-width":605,"large-height":56,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","1536x1536-width":605,"1536x1536-height":56,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","2048x2048-width":605,"2048x2048-height":56,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","card_image-width":605,"card_image-height":56,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/concat_coords-1.jpg","wide_image-width":605,"wide_image-height":56}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194552,"id":2194552,"title":"head_coord_concat","filename":"head_coord_concat-4.jpg","filesize":312748,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/head_coord_concat-5","alt":"","author":"154341","description":"","caption":"","name":"head_coord_concat-5","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:13:24","modified":"2023-12-12 
17:13:24","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1477,"height":364,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","medium-width":464,"medium-height":114,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","medium_large-width":768,"medium_large-height":189,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","large-width":1477,"large-height":364,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","1536x1536-width":1477,"1536x1536-height":364,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","2048x2048-width":1477,"2048x2048-height":364,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4-826x204.jpg","card_image-width":826,"card_image-height":204,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/head_coord_concat-4.jpg","wide_image-width":1477,"wide_image-height":364}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>The first aggregation calculates the total precipitation (\u201cvalue\u201d) and total precipitation days (\u201cwet_day\u201d) within a season (\u201cseason_year\u201d) for each location (\u201ccoordinates\u201d) in the dataset.\u00a0 The <em>FUN<\/em> argument specifies the summary statistic that is applied to each subset (location\/season pair), which in this case is <a href=\"https:\/\/www.rdocumentation.org\/packages\/base\/versions\/3.6.2\/topics\/sum\"><em><strong>sum<\/strong><\/em><\/a> to get the total 
precipitation (mm) and total days with precipitation in a season.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194562,"id":2194562,"title":"agg1","filename":"agg1-1.jpg","filesize":53358,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/agg1-2","alt":"","author":"154341","description":"","caption":"","name":"agg1-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:14:07","modified":"2023-12-12 17:14:07","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1261,"height":124,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1-213x124.jpg","thumbnail-width":213,"thumbnail-height":124,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","medium-width":464,"medium-height":46,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","medium_large-width":768,"medium_large-height":76,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","large-width":1261,"large-height":124,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","1536x1536-width":1261,"1536x1536-height":124,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","2048x2048-width":1261,"2048x2048-height":124,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1-826x81.jpg","card_image-width":826,"card_image-height":81,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1-1.jpg","wide_image-width":1261,"wide_image-height":124}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194572,"id":2194572,"ti
tle":"agg1_results","filename":"agg1_results-1.jpg","filesize":143829,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/agg1_results-2","alt":"","author":"154341","description":"","caption":"","name":"agg1_results-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:14:43","modified":"2023-12-12 17:14:43","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":820,"height":240,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","medium-width":464,"medium-height":136,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","medium_large-width":768,"medium_large-height":225,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","large-width":820,"large-height":240,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","1536x1536-width":820,"1536x1536-height":240,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","2048x2048-width":820,"2048x2048-height":240,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","card_image-width":820,"card_image-height":240,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/agg1_results-1.jpg","wide_image-width":820,"wide_image-height":240}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>The second aggregation calculates the Gini Coefficient of 
precipitation within a season for each location.\u00a0 Note here that \u201cwet_day\u201d is included in the aggregation because the Gini Coefficient calculation only includes days when precipitation occurred.\u00a0 The <em>FUN<\/em> argument here calls the <strong><a href=\"https:\/\/rdrr.io\/cran\/reldist\/man\/gini.html\"><em>gini<\/em><\/a><\/strong> function from the {reldist} R package.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194582,"id":2194582,"title":"gini1","filename":"gini1-1.jpg","filesize":46324,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/gini1-2","alt":"","author":"154341","description":"","caption":"","name":"gini1-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:15:25","modified":"2023-12-12 17:15:25","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1089,"height":128,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1-213x128.jpg","thumbnail-width":213,"thumbnail-height":128,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","medium-width":464,"medium-height":55,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","medium_large-width":768,"medium_large-height":90,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","large-width":1089,"large-height":128,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","1536x1536-width":1089,"1536x1536-height":128,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","2048x2048-width":1089,"2048x2048-height":128,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/202
3\/12\/gini1-1-826x97.jpg","card_image-width":826,"card_image-height":97,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1-1.jpg","wide_image-width":1089,"wide_image-height":128}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194592,"id":2194592,"title":"gini1_results","filename":"gini1_results-1.jpg","filesize":129283,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/gini1_results-2","alt":"","author":"154341","description":"","caption":"","name":"gini1_results-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:16:10","modified":"2023-12-12 17:16:10","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":787,"height":216,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","medium-width":464,"medium-height":127,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","medium_large-width":768,"medium_large-height":211,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","large-width":787,"large-height":216,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","1536x1536-width":787,"1536x1536-height":216,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","2048x2048-width":787,"2048x2048-height":216,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","card_imag
e-width":787,"card_image-height":216,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini1_results-1.jpg","wide_image-width":787,"wide_image-height":216}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>The third aggregation calculates the Lorenz Asymmetry Coefficient of precipitation within a season for each location.\u00a0 As with the Gini Coefficient, \u201cwet_day\u201d is included in the aggregation because the Lorenz Asymmetry Coefficient calculation also includes only days when precipitation occurred.\u00a0 The <em>FUN<\/em> argument here calls the <strong><a href=\"https:\/\/rdrr.io\/cran\/ineq\/man\/Lasym.html\"><em>Lasym<\/em><\/a><\/strong> function from the {ineq} R package.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194602,"id":2194602,"title":"lorenz1","filename":"lorenz1-1.jpg","filesize":56073,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/lorenz1-2","alt":"","author":"154341","description":"","caption":"","name":"lorenz1-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:16:57","modified":"2023-12-12 
17:16:57","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1217,"height":125,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1-213x125.jpg","thumbnail-width":213,"thumbnail-height":125,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","medium-width":464,"medium-height":48,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","medium_large-width":768,"medium_large-height":79,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","large-width":1217,"large-height":125,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","1536x1536-width":1217,"1536x1536-height":125,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","2048x2048-width":1217,"2048x2048-height":125,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1-826x85.jpg","card_image-width":826,"card_image-height":85,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1-1.jpg","wide_image-width":1217,"wide_image-height":125}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194612,"id":2194612,"title":"lorenz1_results","filename":"lorenz1_results-1.jpg","filesize":125623,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/lorenz1_results-2","alt":"","author":"154341","description":"","caption":"","name":"lorenz1_results-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:17:43","modified":"2023-12-12 
17:17:43","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":783,"height":215,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","medium-width":464,"medium-height":127,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","medium_large-width":768,"medium_large-height":211,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","large-width":783,"large-height":215,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","1536x1536-width":783,"1536x1536-height":215,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","2048x2048-width":783,"2048x2048-height":215,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","card_image-width":783,"card_image-height":215,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/lorenz1_results-1.jpg","wide_image-width":783,"wide_image-height":215}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>After a few more lines of code to rename columns to make them more understandable, we now have three separate data frames containing our calculated precipitation variables.\u00a0 Because we know that the Gini and Lorenz Asymmetry Coefficients can only be calculated on days when precipitation occurred, we use the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/dplyr\/versions\/0.7.8\/topics\/filter\"><em>filter<\/em><\/a><\/strong> function to return rows that meet a condition, in this case \u201cwet_day\u201d = 
1.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194632,"id":2194632,"title":"filtered_tables","filename":"filtered_tables-1.jpg","filesize":48169,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/filtered_tables-2","alt":"","author":"154341","description":"","caption":"","name":"filtered_tables-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:19:44","modified":"2023-12-12 17:19:44","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":828,"height":223,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","medium-width":464,"medium-height":125,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","medium_large-width":768,"medium_large-height":207,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","large-width":828,"large-height":223,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","1536x1536-width":828,"1536x1536-height":223,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","2048x2048-width":828,"2048x2048-height":223,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1-826x222.jpg","card_image-width":826,"card_image-height":222,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/filtered_tables-1.jpg","wide_image-width":828,"wide_image-height":223}},"image_position":"center","orientation":"horizontal",
"hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194642,"id":2194642,"title":"gini_lorenz_filtered","filename":"gini_lorenz_filtered-1.jpg","filesize":298791,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/gini_lorenz_filtered-2","alt":"","author":"154341","description":"","caption":"","name":"gini_lorenz_filtered-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:20:30","modified":"2023-12-12 17:20:30","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":911,"height":505,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","medium-width":464,"medium-height":257,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","medium_large-width":768,"medium_large-height":426,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","large-width":911,"large-height":505,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","1536x1536-width":911,"1536x1536-height":505,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","2048x2048-width":911,"2048x2048-height":505,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1-826x458.jpg","card_image-width":826,"card_image-height":458,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/gini_lorenz_filtered-1.jpg","wide_image-width":911,"wide_ima
ge-height":505}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194652,"id":2194652,"title":"all_dimensions","filename":"all_dimensions-1.jpg","filesize":28076,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/all_dimensions-2","alt":"","author":"154341","description":"","caption":"","name":"all_dimensions-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:21:07","modified":"2023-12-12 17:21:07","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":342,"height":139,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1-213x139.jpg","thumbnail-width":213,"thumbnail-height":139,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","medium-width":342,"medium-height":139,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","medium_large-width":342,"medium_large-height":139,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","large-width":342,"large-height":139,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","1536x1536-width":342,"1536x1536-height":139,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","2048x2048-width":342,"2048x2048-height":139,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","card_image-width":342,"card_image-height":139,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/all_dimensions-1.jpg","wide_image-width":342,"wide_image-height":139}},
"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"sidebar","content":"<p><strong>Note:\u00a0<\/strong><\/p>\n<p>In the Gini and Lorenz tables, there are 956,125 records out of a possible 963,262 (481,631 locations x 2 seasons = 963,262).\u00a0 This is because 7,065 locations in Summer 2010 and 72 locations in Fall 2010 <em>experienced zero precipitation<\/em> <em>throughout the entire season<\/em>.\u00a0 This means that a total of 7,137 locations over the two seasons will have no Gini or Lorenz Coefficient because there was no precipitation variability to calculate within that season.\u00a0 Of course, these numbers will likely be different for each data batch due to season-to-season precipitation variation, but it is important to understand that the Gini and Lorenz record counts may be lower than the expected row counts based on certain locations receiving zero total precipitation within a season.<\/p>\n","image_reference":false,"layout":"standard","image_reference_figure":"","snippet":"","spotlight_name":"","section_title":"","position":"Center","spotlight_image":false},{"acf_fc_layout":"content","content":"<p>We\u2019ll then join these three separate data frames together using the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/base\/versions\/3.6.2\/topics\/merge\"><em>merge<\/em><\/a><\/strong> function two times in a row.\u00a0 First, we join the Gini and Lorenz data frames together, then we join that result to the data frame containing the aggregated total precipitation and precipitation days attributes.\u00a0 Note that in both of these joins, the common attribute (key field) is actually a combination of the \u201ccoordinates\u201d and \u201cseason_year\u201d columns, because there are precipitation variables for two seasons for each 
location.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194662,"id":2194662,"title":"join_screenshot","filename":"join_screenshot-1.jpg","filesize":100199,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/join_screenshot-2","alt":"","author":"154341","description":"","caption":"","name":"join_screenshot-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:21:48","modified":"2023-12-12 17:21:48","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1209,"height":325,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","medium-width":464,"medium-height":125,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","medium_large-width":768,"medium_large-height":206,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","large-width":1209,"large-height":325,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","1536x1536-width":1209,"1536x1536-height":325,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","2048x2048-width":1209,"2048x2048-height":325,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1-826x222.jpg","card_image-width":826,"card_image-height":222,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/join_screenshot-1.jpg","wide_image-width":1209,"wide_image-height":325}},"image_position":"center","orientation":
"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2193882,"id":2193882,"title":"final_join_table","filename":"final_join_table-1.jpg","filesize":246732,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/final_join_table-2","alt":"","author":"154341","description":"","caption":"Data frame showing the four precipitation variables calculated for two seasons (Summer and Fall of 2010).","name":"final_join_table-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 15:49:44","modified":"2023-12-13 21:59:23","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1069,"height":493,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","medium-width":464,"medium-height":214,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","medium_large-width":768,"medium_large-height":354,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","large-width":1069,"large-height":493,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","1536x1536-width":1069,"1536x1536-height":493,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1.jpg","2048x2048-width":1069,"2048x2048-height":493,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_join_table-1-826x381.jpg","card_image-width":826,"card_image-height":381,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploa
ds\/2023\/12\/final_join_table-1.jpg","wide_image-width":1069,"wide_image-height":493}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Because we will eventually want to map and perform spatial analysis on this data within ArcGIS, we\u2019ll use the <strong><a href=\"https:\/\/www.rdocumentation.org\/packages\/tidyr\/versions\/1.3.0\/topics\/separate\"><em>separate<\/em><\/a><\/strong> function to split the \u201ccoordinates\u201d column into individual columns representing the x- and y-coordinate for each location.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194682,"id":2194682,"title":"coord_separate","filename":"coord_separate-1.jpg","filesize":30626,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/coord_separate-2","alt":"","author":"154341","description":"","caption":"","name":"coord_separate-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:22:27","modified":"2023-12-12 
17:22:27","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":829,"height":122,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1-213x122.jpg","thumbnail-width":213,"thumbnail-height":122,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","medium-width":464,"medium-height":68,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","medium_large-width":768,"medium_large-height":113,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","large-width":829,"large-height":122,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","1536x1536-width":829,"1536x1536-height":122,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","2048x2048-width":829,"2048x2048-height":122,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1-826x122.jpg","card_image-width":826,"card_image-height":122,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/coord_separate-1.jpg","wide_image-width":829,"wide_image-height":122}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194692,"id":2194692,"title":"final_table_all","filename":"final_table_all-2.jpg","filesize":224044,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/final_table_all-3","alt":"","author":"154341","description":"","caption":"","name":"final_table_all-3","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:23:40","modified":"2023-12-12 
17:23:40","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1262,"height":197,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2-213x197.jpg","thumbnail-width":213,"thumbnail-height":197,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","medium-width":464,"medium-height":72,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","medium_large-width":768,"medium_large-height":120,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","large-width":1262,"large-height":197,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","1536x1536-width":1262,"1536x1536-height":197,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","2048x2048-width":1262,"2048x2048-height":197,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2-826x129.jpg","card_image-width":826,"card_image-height":129,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/final_table_all-2.jpg","wide_image-width":1262,"wide_image-height":197}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Last, we\u2019ll write out the final joined precipitation dataset as a CSV file that can then be used in ArcGIS, or in other data science and analytics 
software.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194712,"id":2194712,"title":"write_csv","filename":"write_csv-1.jpg","filesize":21797,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/write_csv-2","alt":"","author":"154341","description":"","caption":"","name":"write_csv-2","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:24:15","modified":"2023-12-12 17:24:15","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":716,"height":66,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1-213x66.jpg","thumbnail-width":213,"thumbnail-height":66,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","medium-width":464,"medium-height":43,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","medium_large-width":716,"medium_large-height":66,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","large-width":716,"large-height":66,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","1536x1536-width":716,"1536x1536-height":66,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","2048x2048-width":716,"2048x2048-height":66,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","card_image-width":716,"card_image-height":66,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/write_csv-1.jpg","wide_image-width":716,"wide_image-height":66}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<h2>Final 
thoughts<\/h2>\n<p>Remember, the entire workflow described above focused on only two seasons (Summer and Fall, 2010).\u00a0 As mentioned above, it was necessary to perform this workflow in batches not only because of computer processing limitations, but more importantly because the precipitation variables are seasonal calculations based on daily precipitation data.\u00a0 That means that within the 30-year time series covered by this study, there were a total of 120 seasons.\u00a0 Because I was able to process the summer and fall seasons together, this amounted to 90 batches (30 winter, 30 spring, and 30 combined summer\/fall) that I had to run manually.\u00a0 Fortunately, I was able to automate much of this process using R, including the downloading of each batch of rasters, all the cleaning\/wrangling\/engineering steps, writing out of the final CSV, and deletion of each batch of rasters.\u00a0 The only thing I had to change with each batch was the start and end date of the batch.\u00a0 The final result was a folder containing 90 CSV files, one for each batch.\u00a0 Each CSV file contains the final calculated precipitation variables for each location, for the season (or, in the combined summer\/fall batches, the two seasons) covered by that batch.<\/p>\n"},{"acf_fc_layout":"image","image":{"ID":2194722,"id":2194722,"title":"CSVs_folder","filename":"CSVs_folder-2.jpg","filesize":407518,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/csvs_folder-3","alt":"","author":"154341","description":"","caption":"Folder containing the 90 CSV files representing the 90 batches (120 seasons) from 1981-2010.","name":"csvs_folder-3","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 
17:25:20","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1624,"height":803,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","medium-width":464,"medium-height":229,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","medium_large-width":768,"medium_large-height":380,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","large-width":1624,"large-height":803,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2-1536x759.jpg","1536x1536-width":1536,"1536x1536-height":759,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","2048x2048-width":1624,"2048x2048-height":803,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2-826x408.jpg","card_image-width":826,"card_image-height":408,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/CSVs_folder-2.jpg","wide_image-width":1624,"wide_image-height":803}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"image","image":{"ID":2194732,"id":2194732,"title":"csv_example_2","filename":"csv_example_2-2.jpg","filesize":503764,"url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/csv_example_2-3","alt":"","author":"154341","description":"","caption":"Example CSV from the summer\/fall 2010 batch with four precipitation variable calculations: \"precip\", \"frequency\", \"gini_coef\", and 
\"lorenz_coef\".","name":"csv_example_2-3","status":"inherit","uploaded_to":2190472,"date":"2023-12-12 17:26:00","modified":"2023-12-12 17:26:06","menu_order":0,"mime_type":"image\/jpeg","type":"image","subtype":"jpeg","icon":"https:\/\/www.esri.com\/arcgis-blog\/wp-includes\/images\/media\/default.png","width":1787,"height":558,"sizes":{"thumbnail":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2-213x200.jpg","thumbnail-width":213,"thumbnail-height":200,"medium":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","medium-width":464,"medium-height":145,"medium_large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","medium_large-width":768,"medium_large-height":240,"large":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","large-width":1787,"large-height":558,"1536x1536":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2-1536x480.jpg","1536x1536-width":1536,"1536x1536-height":480,"2048x2048":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","2048x2048-width":1787,"2048x2048-height":558,"card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2-826x258.jpg","card_image-width":826,"card_image-height":258,"wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/csv_example_2-2.jpg","wide_image-width":1787,"wide_image-height":558}},"image_position":"center","orientation":"horizontal","hyperlink":""},{"acf_fc_layout":"content","content":"<p>Here are some rough estimates of the amount of time it took to perform this workflow on all 90 batches.<\/p>\n<ul>\n<li>Winter season processing (89 or 90 daily rasters) = ~24 minutes x 30 batches = ~12 hours<\/li>\n<li>Spring season processing (91 daily rasters) = ~25 minutes x 30 batches = ~12.5 hours<\/li>\n<li>Summer\/Fall season processing (182 daily rasters) = ~45 minutes x 30 batches = ~22.5 
hours<\/li>\n<\/ul>\n<p>So all told, it took nearly 50 hours of running R code to get to the point where I had this folder of CSV files (and that doesn\u2019t include actually writing the R code to do it).\u00a0 Ever hear the phrase \u201cdata engineering takes up 80% of your time in a data science project\u201d?<\/p>\n<p>Guess what\u2026we aren\u2019t even close to being done.\u00a0 Jump to the <a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-3-data-preparation-and-data-engineering-using-python\/\">next blog article<\/a> to learn how we\u2019ll process all of these CSV files into our final, analysis-ready dataset.<\/p>\n"},{"acf_fc_layout":"sidebar","content":"<h2 style=\"text-align: left\">Spatial data science with R, Python, and ArcGIS<\/h2>\n<p>Here are the links to all the articles of the series:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-1-clustering-us-precipitation-regions\/\">Part 1<\/a>. Clustering US Precipitation Regions<\/li>\n<li><a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\/\">Part 2<\/a>. Data preparation and data engineering using R<\/li>\n<li><a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-3-data-preparation-and-data-engineering-using-python\/\">Part 3<\/a>. Data preparation and data engineering using Python<\/li>\n<li><a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-4-data-preparation-using-spatial-analysis-and-automation-in-arcgis\/\" target=\"_blank\" rel=\"noopener\">Part 4<\/a>. 
Data preparation using spatial analysis and automation in ArcGIS<\/li>\n<li><a href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-5-machine-learning-cluster-analysis-in-python-and-arcgis\">Part 5<\/a>. Machine Learning: Cluster analysis using Python and ArcGIS<\/li>\n<\/ul>\n","image_reference":false,"layout":"standard","image_reference_figure":"","snippet":"","spotlight_name":"","section_title":"","position":"Center","spotlight_image":false}],"related_articles":"","card_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/cluster_map_resized.jpg","wide_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/AdobeStock_96810852_fixed-1.png"},"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>End-to-end spatial data science 2: Data preparation and data engineering using R<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"End-to-end spatial data science 2: Data preparation and data engineering using R\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\" \/>\n<meta property=\"og:site_name\" content=\"ArcGIS Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/esrigis\/\" \/>\n<meta property=\"article:modified_time\" content=\"2024-10-25T16:54:09+00:00\" \/>\n<meta name=\"twitter:card\" 
content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@ESRI\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"15 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\"},\"author\":{\"name\":\"Nicholas Giner\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/2dc4741deea59d3274cfa775e52501b2\"},\"headline\":\"End-to-end spatial data science 2: Data preparation and data engineering using R\",\"datePublished\":\"2023-12-14T18:00:15+00:00\",\"dateModified\":\"2024-10-25T16:54:09+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\"},\"wordCount\":11,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"keywords\":[\"Data Engineering\",\"machine learning\",\"python\",\"r\",\"spatial data 
science\"],\"articleSection\":[\"Analytics\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\",\"name\":\"End-to-end spatial data science 2: Data preparation and data engineering using R\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\"},\"datePublished\":\"2023-12-14T18:00:15+00:00\",\"dateModified\":\"2024-10-25T16:54:09+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.esri.com\/arcgis-blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"End-to-end spatial data science 2: Data preparation and data engineering using R\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"name\":\"ArcGIS 
Blog\",\"description\":\"Get insider info from Esri product teams\",\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\",\"name\":\"Esri\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"width\":400,\"height\":400,\"caption\":\"Esri\"},\"image\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/esrigis\/\",\"https:\/\/x.com\/ESRI\",\"https:\/\/www.linkedin.com\/company\/5311\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/2dc4741deea59d3274cfa775e52501b2\",\"name\":\"Nicholas Giner\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2021\/01\/headshot-e1610030307989-213x200.jpeg\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2021\/01\/headshot-e1610030307989-213x200.jpeg\",\"caption\":\"Nicholas Giner\"},\"description\":\"Nick Giner is a Product Manager for Spatial Analysis and Data Science. Prior to joining Esri in 2014, he completed Bachelor\u2019s and PhD degrees in Geography from Penn State University and Clark University, respectively. 
In his spare time, he likes to play guitar, golf, cook, cut the grass, and read\/watch shows about history.\",\"sameAs\":[\"www.linkedin.com\/in\/nicholas-giner-0282966b\",\"https:\/\/x.com\/NickGiner\"],\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/author\/nginer\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"End-to-end spatial data science 2: Data preparation and data engineering using R","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","og_locale":"en_US","og_type":"article","og_title":"End-to-end spatial data science 2: Data preparation and data engineering using R","og_url":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","og_site_name":"ArcGIS Blog","article_publisher":"https:\/\/www.facebook.com\/esrigis\/","article_modified_time":"2024-10-25T16:54:09+00:00","twitter_card":"summary_large_image","twitter_site":"@ESRI","twitter_misc":{"Est. 
reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#article","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r"},"author":{"name":"Nicholas Giner","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/2dc4741deea59d3274cfa775e52501b2"},"headline":"End-to-end spatial data science 2: Data preparation and data engineering using R","datePublished":"2023-12-14T18:00:15+00:00","dateModified":"2024-10-25T16:54:09+00:00","mainEntityOfPage":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r"},"wordCount":11,"commentCount":0,"publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"keywords":["Data Engineering","machine learning","python","r","spatial data science"],"articleSection":["Analytics"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","url":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r","name":"End-to-end spatial data science 2: Data preparation and data engineering using 
R","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#website"},"datePublished":"2023-12-14T18:00:15+00:00","dateModified":"2024-10-25T16:54:09+00:00","breadcrumb":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/arcgis-pro\/analytics\/end-to-end-spatial-data-science-2-data-preparation-and-data-engineering-using-r#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.esri.com\/arcgis-blog\/"},{"@type":"ListItem","position":2,"name":"End-to-end spatial data science 2: Data preparation and data engineering using R"}]},{"@type":"WebSite","@id":"https:\/\/www.esri.com\/arcgis-blog\/#website","url":"https:\/\/www.esri.com\/arcgis-blog\/","name":"ArcGIS Blog","description":"Get insider info from Esri product 
teams","publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization","name":"Esri","url":"https:\/\/www.esri.com\/arcgis-blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","width":400,"height":400,"caption":"Esri"},"image":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/esrigis\/","https:\/\/x.com\/ESRI","https:\/\/www.linkedin.com\/company\/5311\/"]},{"@type":"Person","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/2dc4741deea59d3274cfa775e52501b2","name":"Nicholas Giner","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2021\/01\/headshot-e1610030307989-213x200.jpeg","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2021\/01\/headshot-e1610030307989-213x200.jpeg","caption":"Nicholas Giner"},"description":"Nick Giner is a Product Manager for Spatial Analysis and Data Science. Prior to joining Esri in 2014, he completed Bachelor\u2019s and PhD degrees in Geography from Penn State University and Clark University, respectively. 
In his spare time, he likes to play guitar, golf, cook, cut the grass, and read\/watch shows about history.","sameAs":["www.linkedin.com\/in\/nicholas-giner-0282966b","https:\/\/x.com\/NickGiner"],"url":"https:\/\/www.esri.com\/arcgis-blog\/author\/nginer"}]}},"text_date":"December 14, 2023","author_name":"Nicholas Giner","author_page":"https:\/\/www.esri.com\/arcgis-blog\/author\/nginer","custom_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2023\/12\/AdobeStock_96810852_fixed-1.png","primary_product":"ArcGIS Pro","tag_data":[{"term_id":760452,"name":"Data Engineering","slug":"data-engineering","term_group":0,"term_taxonomy_id":760452,"taxonomy":"post_tag","description":"","parent":0,"count":34,"filter":"raw"},{"term_id":35661,"name":"machine learning","slug":"machine-learning","term_group":0,"term_taxonomy_id":35661,"taxonomy":"post_tag","description":"","parent":0,"count":41,"filter":"raw"},{"term_id":24341,"name":"python","slug":"python","term_group":0,"term_taxonomy_id":24341,"taxonomy":"post_tag","description":"","parent":0,"count":171,"filter":"raw"},{"term_id":30241,"name":"r","slug":"r","term_group":0,"term_taxonomy_id":30241,"taxonomy":"post_tag","description":"","parent":0,"count":19,"filter":"raw"},{"term_id":759592,"name":"spatial data science","slug":"spatial-data-science","term_group":0,"term_taxonomy_id":759592,"taxonomy":"post_tag","description":"","parent":0,"count":17,"filter":"raw"}],"category_data":[{"term_id":23341,"name":"Analytics","slug":"analytics","term_group":0,"term_taxonomy_id":23341,"taxonomy":"category","description":"","parent":0,"count":1328,"filter":"raw"}],"product_data":[{"term_id":36841,"name":"ArcGIS API for Python","slug":"api-python","term_group":0,"term_taxonomy_id":36841,"taxonomy":"product","description":"","parent":36601,"count":151,"filter":"raw"},{"term_id":36561,"name":"ArcGIS 
Pro","slug":"arcgis-pro","term_group":0,"term_taxonomy_id":36561,"taxonomy":"product","description":"","parent":0,"count":2036,"filter":"raw"}],"primary_product_link":"https:\/\/www.esri.com\/arcgis-blog\/?s=#&products=arcgis-pro","_links":{"self":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/2190472","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/users\/154341"}],"replies":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/comments?post=2190472"}],"version-history":[{"count":0,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/2190472\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/media?parent=2190472"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/categories?post=2190472"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/tags?post=2190472"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/industry?post=2190472"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/product?post=2190472"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}