Most demographic, economic, health, and education maps start with tabular data of some kind. This blog post covers multiple ways to turn a flat file of raw data into a feature layer. Feature layers allow you to do spatial analysis, and to create web maps that can be shared in apps and story maps.
Bring in a Spreadsheet with Location Data and Geocode to Create Points
Often times, data analysts and GIS specialists need to turn spreadsheets on their local computer into a feature layer. Here, I have a local spreadsheet on my computer that contains information about Medicare spending per patient for various hospitals. The fields that contain location data for each hospital are called Address, City, State, and Zip_Code.
Within My Content in ArcGIS Online, go to “Add Item” – in this case, I’ll choose “From my computer” and navigate to the CSV file.
ArcGIS Online will then prompt you to publish this file as a hosted feature layer, and will pick up on the fact that your data has address fields in it, and offer to use those as the location fields. Check to make sure that Address, City, State, and Zip are matched as desired. In my case, my field is named “ZIP_Code” so I manually set it to use that field as the zip:
Make sure the title and search tags are what you want, and then click Add Item. Depending on how many records your data has, it can take a few minutes to publish. Once published, a dialog box appears that tells you how many of your locations matched, and you’ll get a chance to review any unmatched cases.
Once the item has been created, ArcGIS Online will automatically bring you to the Item Details Page of your new points feature layer!
Note: if your spreadsheet has lat/long coordinate data instead of address fields, the process is very similar.
Bring in a Spreadsheet and Join to an Existing Feature Layer
You can also simply bring in your data as a table, then join to an existing feature layer. When would it be better to bring your spreadsheet in as a table instead of geocoding it immediately?
- If you have a spreadsheet without addresses or lat/long coordinates, but still need to bring in this data to your GIS workflow. Maybe you have hospital IDs, campus names, or some other specific attribute field that will not be picked up by the geocoder, but can be joined by attributes to an existing points layer within your ArcGIS Online Organization.
- If you want to work with lines or polygons instead of points. Check out the Boundaries category of the Living Atlas for polygon feature layers of countries, states and provinces, telephone area codes, congressional districts, world time zones, counties, watersheds, and more.
Here I have a spreadsheet that I downloaded from the County Health Rankings Data and Documentation site that was named “analytic_data2019_0.csv” during the download. Here are the first few rows of my table:
The first row in the CSV is the descriptive name (such as “Premature death raw value”) which we’ll use as our aliases. The second row is the short variable name used for coding purposes in statistical packages that corresponds to the data dictionary (such as “v001_rawvalue”) that is recognized by others who use this dataset. If you plan to calculate any new fields or transform the data in any way, it will be much easier to work with these field names in formulas and macros, since they often follow a standard naming convention. I’m going to remove the top row of descriptive aliases for now, and save them elsewhere!
You might have noticed something else as well: This table contains records for the US as a whole, records for each state, and records for each county, pretty common for government datasets. Since I am just bringing in this a basic rectangular dataset, or flat file, as a table into ArcGIS Online, this is okay. Once I’m working with it, I can apply filters.
Similar to before, I’ll start from My Content, click Add Item -> From Computer, but select “add as table” this time instead of “Coordinates” or “Addresses or Places” as shown:
Again, once the item is generated, ArcGIS Online will bring you to the Item Details Page. From here, I’ll select Open in Map Viewer -> Add to New Map. Now I can filter out the records that I’m not interested in. First, with my table added to the Map Viewer, I can verify how many records this table has (3,194). Remember, it has a record for the whole US, for each state, and then for each county. Records for the U.S. and the states have a value of 0 in the field called countycode:
I can apply some filters to get down to only the records I want. First, I’ll say countycode is not 0 to remove any non-county records.
This gets me down to 3,142 records.
Subset to an Area of Interest
Many health and education policies are set at the state level. Data analysts working in these fields don’t need to work with all counties within the nation, just the ones in their own state. For example, someone at the Ohio Dept. of Health would be interested in the County Health Rankings for all Ohio counties, but having to work with the entire national dataset would be distracting at best. In fact, it would likely be overwhelming to the analyst, likely to slow down processing time, and file sizes would be larger than necessary. The best approach in this situation would be to apply a filter to only work with the data you want.
Back in the Filter tab, I’ll click Add another expression to subset even further, and leave the top drop down option to say “Display features in the layer that match all of the following expressions”:
This gets my table down to 88 records, the number of counties in Ohio. If you need to filter your table in a slightly more complex way, learn about even more ways to apply filters.
From there, I’ll click Add -> Browse Living Atlas Layers, and add the USA Counties (Generalized) layer of county polygons to my map. Apply a filter to the counties layer as well, so that it only shows counties in Ohio, not all counties in the U.S. Doing this will decrease running time for our Join, and also consume fewer analysis credits!
My next step is to join my table to the county boundaries. Click Analysis -> Summarize Data -> Join Features.
Next, select the target layer and and the table to join to the target layer. To do an attribute join, select “Choose the fields to match” in #3, below. In my case, I’ll join by FIPS, which stands for Federal Information Processing Standard. FIPS codes are widely-recognized codes for states, counties, tracts, congressional districts, school districts, and many other administrative geography levels. Next, choose a name for your new layer in #5, in my case, “Ohio County Health Rankings 2019.” Note: If I were joining all counties in the U.S., I would have to uncheck “Use Current Map Extent” so that counties in Alaska, Hawaii, and Puerto Rico will also be included even though they’re not displayed in the map’s current extent.
When Joining by a Match ID such as FIPS
- Watch leading zeros for states with single-digit FIPS codes: Alabama (01), Alaska (02), Arizona (04), Arkansas (05), California (06), Colorado (08), and Connecticut (09). Often these leading zeros get dropped because computers are quick to recognize this field as a numeric rather than string field. If your resulting layer looks like this, it’s because one place has the leading zeros and one does not, and therefore values do not match.
When Joining by Name
- When joining on County Name, join by both county and state. Unlike FIPS codes, county names are not unique across states. For example, there are 26 states that have a Franklin County. Also, make sure you’re consistently using state abbreviations or state names in both the target layer and the table.
- Beware of Country names.
- Some sources list the official country name rather than the commonly-understood name, such as “Russian Federation” vs. “Russia”
- Watch for abbreviations such as “St. Martin” vs. “Saint Martin”
- There are constant name changes to country names. Two recently were “Cabo Verde” vs. “Cape Verde” and “North Macedonia” vs. “Macedonia”
- New countries form all the time! I remember a time before South Sudan.
Running the Join
Next, click Run Analysis, and be patient. Your join could take a few minutes depending on how many fields and records your table has, whether you’re joining to polygons (more complex), lines, or points (least complex), and finally, where the two datasets are stored. If you’re running a join on a layer that you don’t own, such as a Living Atlas layer, you might see a warning pop up that says, “performing analysis using Living Atlas layers can result in greater execution times” since you’re performing a join from multiple data sources.
In the end, you should have a new layer in your map that has the tabular data joined to the polygon layer. You can then symbolize, configure pop-ups and labels, and fill out the Item Details for your new layer.
Once you bring in your data to ArcGIS Online, either through geocoding to create a points feature layer by joining a table to an existing feature layer, feel free to use this checklist to evaluate your data and do some basic quality assurance.
Remember those nice display names we saved from earlier? We’ll now add those as our field aliases. There are two major benefits to adding aliases: one is a benefit for the viewer of your map, and the other is a benefit for any users of this layer. First, they make any auto-generated and pop-ups legends instantly more meaningful since the alias is used in legends and pop-ups first, the field name is used only if there are no aliases. Second, aliases help anyone working with the layer (including you!) while viewing the attribute table, trying to figure out what this data is about, and making decisions for symbolizing their map. Aliases turn spreadsheet data into a shareable dataset.
There are two main ways to add aliases:
- From within the Web Map, go to Configure Pop-up -> Configure Attributes and then type in the aliases for your fields of interest. Here I can type in the display name I want in the column called Field Alias:
2. From within the Item Details Page, go to Data -> Fields, then click on the field you wish to change, and you’ll be taken to that field’s properties. Click on the pencil icon to edit the alias and type in your desired alias for that field.
You can even batch-assign all aliases at once by using this nice AliasAssigner.py tool written by my colleague Lisa Berry.
Take it one step further and add field descriptions in addition to aliases, to provide even more detail about specific attribute fields such as detailed definitions, sources, and year(s) of data in a given field.
Now Create Information Products with Your Data
Now it’s time have fun performing analysis or creating a web map! Smart Mapping and Arcade capabilities can help you make a clear and polished web map with purposeful symbology, pop-ups, and labels. From there, you can embed your web map into a website, or put it into a story map or configurable app.
Next, document your work to make your items easy to find, understand, and use by others. Part of documenting layers is managing settings such as sharing, enabling delete protection, setting the default extent, marking something as authoritative on behalf of your organization, or allowing others to edit.
For more information on getting started with data, see the official help documentation. If you run into any questions when turning your raw data into a feature layer, feel free to post them on GeoNet.
Note: Before bringing in data, verify that you are signed in and that your account has a user type that is able to publish hosted feature layers (user types that are able to publish are Creator, GIS Professional, and Insights Analyst). You’ll also need some analysis credits, as both of the methods above consume credits based on how many records your data has.