{"id":66991,"date":"2015-03-25T15:01:47","date_gmt":"2015-03-25T15:01:47","guid":{"rendered":"http:\/\/www.esri.com\/arcgis-blog\/products\/product\/uncategorized\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\/"},"modified":"2018-03-26T21:04:52","modified_gmt":"2018-03-26T21:04:52","slug":"new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","status":"publish","type":"blog","link":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","title":{"rendered":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop"},"author":3981,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","format":"standard","meta":{"_acf_changed":false,"_searchwp_excluded":""},"categories":[23851],"tags":[25351,25361,25371,25381,25391],"industry":[],"product":[],"class_list":["post-66991","blog","type-blog","status-publish","format-standard","hentry","category-data-management","tag-big-data","tag-binning","tag-geodata","tag-geodatabase","tag-hadoop"],"acf":{"short_description":"The Big Data team is excited to offer a new tutorial on spatial aggregation (sometimes called spatial binning). Spatial aggregation is ex...","flexible_content":[{"acf_fc_layout":"content","content":"<p>The Big Data team is excited to offer a new tutorial on spatial aggregation (sometimes called spatial binning). Spatial aggregation is extremely useful in summarizing big data to gain a meaningful snapshot of patterns in your data. 
Spatial aggregation works by creating square bins of a user-specified size, like this:<\/p>\n<p><!--more--><\/p>\n<p><a href=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-47651\" src=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/1-1024x542.png\" alt=\"\" width=\"640\" height=\"338\" \/><\/a><\/p>\n<p>The result looks similar to a raster, although the bins are really square polygons. Here is an example. If you wanted to draw 170 million points in ArcMap, it would look something like this:<\/p>\n<p><a href=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/2.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-47652\" src=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/2-1024x818.jpg\" alt=\"\" width=\"640\" height=\"511\" \/><\/a><\/p>\n<p>It\u2019s pretty hard to gain any information from this map. You can kind of see that most points are close to NYC, but that\u2019s about it. This dataset is actually NYC taxi data that is <a href=\"http:\/\/www.andresmh.com\/nyctaxitrips\/\">freely available online<\/a>. While it\u2019s a very interesting dataset, it can be hard to deal with such massive quantities of data.<\/p>\n<p>To see underlying patterns, you can aggregate these data into spatial bins. Just by looking at the count of points in each bin, we begin to see patterns emerge and can figure out interesting questions to ask of our data. 
Here is what the aggregated data looks like:<\/p>\n<p><a href=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/3.png.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-47653\" src=\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2015\/03\/3.png.jpg\" alt=\"\" width=\"912\" height=\"859\" \/><\/a><\/p>\n<p>The darker areas represent higher counts of taxi drop-offs (particularly in Manhattan and at both airports). Spatial aggregation lets us summarize the dataset, makes it more informative and visually appealing, and prompts us to ask more questions.<\/p>\n<p>This spatial analysis capability is available in <a href=\"http:\/\/esri.github.io\/gis-tools-for-hadoop\/\">GIS Tools for Hadoop<\/a>. A new tutorial takes you through the steps of aggregating the taxi data. <a href=\"https:\/\/github.com\/Esri\/gis-tools-for-hadoop\/wiki\/Aggregating-CSV-Data-%28Spatial-Binning%29\">You can find the tutorial here<\/a>.<\/p>\n<p>Happy Data Mining!<\/p>\n<p><em>(Post submitted by Sarah Ambrose, Big Data Team)<\/em><\/p>\n"}],"authors":[{"ID":3981,"user_firstname":"Jonathan","user_lastname":"Murphy","nickname":"Jonathan Murphy","user_nicename":"jonmurphy","display_name":"Jonathan Murphy","user_email":"jonathan_murphy@esri.com","user_url":"","user_registered":"2018-03-02 00:15:37","user_description":"Product Owner, UX Designer and Content Strategist on the Geodatabase team at Esri. 
\r\nWriter, musician, cockatiel whisperer and prolific world traveler.","user_avatar":"<img data-del=\"avatar\" src='https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2020\/04\/J_Mu-213x200.png' class='avatar pp-user-avatar avatar-96 photo ' height='96' width='96'\/>"}]},"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>New Spatial Aggregation Tutorial for GIS Tools for Hadoop<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"New Spatial Aggregation Tutorial for GIS Tools for Hadoop\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\" \/>\n<meta property=\"og:site_name\" content=\"ArcGIS Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/esrigis\/\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-26T21:04:52+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@ESRI\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\"},\"author\":{\"name\":\"Jonathan Murphy\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/dec789ad68db472c6018c1c9068998be\"},\"headline\":\"New Spatial Aggregation Tutorial for GIS Tools for Hadoop\",\"datePublished\":\"2015-03-25T15:01:47+00:00\",\"dateModified\":\"2018-03-26T21:04:52+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\"},\"wordCount\":9,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"keywords\":[\"Big Data\",\"binning\",\"geodata\",\"geodatabase\",\"Hadoop\"],\"articleSection\":[\"Data Management\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\",\"name\":\"New Spatial Aggregation Tutorial for GIS Tools for 
Hadoop\",\"isPartOf\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\"},\"datePublished\":\"2015-03-25T15:01:47+00:00\",\"dateModified\":\"2018-03-26T21:04:52+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.esri.com\/arcgis-blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"New Spatial Aggregation Tutorial for GIS Tools for Hadoop\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#website\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"name\":\"ArcGIS Blog\",\"description\":\"Get insider info from Esri product 
teams\",\"publisher\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#organization\",\"name\":\"Esri\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png\",\"width\":400,\"height\":400,\"caption\":\"Esri\"},\"image\":{\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/esrigis\/\",\"https:\/\/x.com\/ESRI\",\"https:\/\/www.linkedin.com\/company\/5311\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/dec789ad68db472c6018c1c9068998be\",\"name\":\"Jonathan Murphy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2020\/04\/J_Mu-213x200.png\",\"contentUrl\":\"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2020\/04\/J_Mu-213x200.png\",\"caption\":\"Jonathan Murphy\"},\"description\":\"Product Owner, UX Designer and Content Strategist on the Geodatabase team at Esri. Writer, musician, cockatiel whisperer and prolific world traveler.\",\"url\":\"https:\/\/www.esri.com\/arcgis-blog\/author\/jonmurphy\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","og_locale":"en_US","og_type":"article","og_title":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop","og_url":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","og_site_name":"ArcGIS Blog","article_publisher":"https:\/\/www.facebook.com\/esrigis\/","article_modified_time":"2018-03-26T21:04:52+00:00","twitter_card":"summary_large_image","twitter_site":"@ESRI","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#article","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop"},"author":{"name":"Jonathan Murphy","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/dec789ad68db472c6018c1c9068998be"},"headline":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop","datePublished":"2015-03-25T15:01:47+00:00","dateModified":"2018-03-26T21:04:52+00:00","mainEntityOfPage":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop"},"wordCount":9,"commentCount":0,"publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"keywords":["Big Data","binning","geodata","geodatabase","Hadoop"],"articleSection":["Data 
Management"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","url":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop","name":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop","isPartOf":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#website"},"datePublished":"2015-03-25T15:01:47+00:00","dateModified":"2018-03-26T21:04:52+00:00","breadcrumb":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.esri.com\/arcgis-blog\/products\/product\/data-management\/new-spatial-aggregation-tutorial-for-gis-tools-for-hadoop#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.esri.com\/arcgis-blog\/"},{"@type":"ListItem","position":2,"name":"New Spatial Aggregation Tutorial for GIS Tools for Hadoop"}]},{"@type":"WebSite","@id":"https:\/\/www.esri.com\/arcgis-blog\/#website","url":"https:\/\/www.esri.com\/arcgis-blog\/","name":"ArcGIS Blog","description":"Get insider info from Esri product 
teams","publisher":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.esri.com\/arcgis-blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.esri.com\/arcgis-blog\/#organization","name":"Esri","url":"https:\/\/www.esri.com\/arcgis-blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2018\/04\/Esri.png","width":400,"height":400,"caption":"Esri"},"image":{"@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/esrigis\/","https:\/\/x.com\/ESRI","https:\/\/www.linkedin.com\/company\/5311\/"]},{"@type":"Person","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/dec789ad68db472c6018c1c9068998be","name":"Jonathan Murphy","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.esri.com\/arcgis-blog\/#\/schema\/person\/image\/","url":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2020\/04\/J_Mu-213x200.png","contentUrl":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2020\/04\/J_Mu-213x200.png","caption":"Jonathan Murphy"},"description":"Product Owner, UX Designer and Content Strategist on the Geodatabase team at Esri. 
Writer, musician, cockatiel whisperer and prolific world traveler.","url":"https:\/\/www.esri.com\/arcgis-blog\/author\/jonmurphy"}]}},"text_date":"March 25, 2015","author_name":"Jonathan Murphy","author_page":"https:\/\/www.esri.com\/arcgis-blog\/author\/jonmurphy","custom_image":"https:\/\/www.esri.com\/arcgis-blog\/app\/uploads\/2025\/08\/Newsroom-Keyart-Wide-1920-x-1080.jpg","primary_product":false,"tag_data":[{"term_id":25351,"name":"Big Data","slug":"big-data","term_group":0,"term_taxonomy_id":25351,"taxonomy":"post_tag","description":"","parent":0,"count":36,"filter":"raw"},{"term_id":25361,"name":"binning","slug":"binning","term_group":0,"term_taxonomy_id":25361,"taxonomy":"post_tag","description":"","parent":0,"count":10,"filter":"raw"},{"term_id":25371,"name":"geodata","slug":"geodata","term_group":0,"term_taxonomy_id":25371,"taxonomy":"post_tag","description":"","parent":0,"count":10,"filter":"raw"},{"term_id":25381,"name":"geodatabase","slug":"geodatabase","term_group":0,"term_taxonomy_id":25381,"taxonomy":"post_tag","description":"","parent":0,"count":48,"filter":"raw"},{"term_id":25391,"name":"Hadoop","slug":"hadoop","term_group":0,"term_taxonomy_id":25391,"taxonomy":"post_tag","description":"","parent":0,"count":3,"filter":"raw"}],"category_data":[{"term_id":23851,"name":"Data 
Management","slug":"data-management","term_group":0,"term_taxonomy_id":23851,"taxonomy":"category","description":"","parent":0,"count":920,"filter":"raw"}],"product_data":[],"primary_product_link":"https:\/\/www.esri.com\/arcgis-blog\/","_links":{"self":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/66991","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/users\/3981"}],"replies":[{"embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/comments?post=66991"}],"version-history":[{"count":0,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/blog\/66991\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/media?parent=66991"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/categories?post=66991"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/tags?post=66991"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/industry?post=66991"},{"taxonomy":"product","embeddable":true,"href":"https:\/\/www.esri.com\/arcgis-blog\/wp-json\/wp\/v2\/product?post=66991"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}