ArcGIS Blog

Dec 09, 2021

Techniques for visualizing high density data on the web

By Kristian Ekenes

At the 4.22 release of the ArcGIS API for JavaScript, we added a sub-chapter of the Visualization guide that teaches you how to visualize high density data in meaningful ways.

"High density data" is a new sub-chapter in the Visualization guide of the ArcGIS API for JavaScript. You should consult these pages when working with large datasets containing many overlapping features.

Large, dense datasets are difficult to visualize well. These datasets typically involve overlapping features, which make it difficult or even impossible to see spatial patterns in raw data.

Thanks to major performance improvements over the last few years, the JS API can now render hundreds of thousands (even millions) of features with fast performance. These improvements prompted the need to highlight appropriate and effective ways to make sense of large, dense datasets.

The High density data visualization guide outlines seven approaches for visualizing large amounts of data. These topics are grouped by client-side and server-side approaches.

Each page provides a brief definition of the technique, describes why and when you should use it, and steps through two or more live examples that demonstrate how to practically use the technique.

Each page defines the technique (e.g. "What is bloom?") and provides examples about how to use it in the ArcGIS API for JavaScript.

The following is a brief overview of each new topic:

Clustering

Clustering is a method of reducing points by grouping them into clusters based on their spatial proximity to one another. Typically, clusters are proportionally sized based on the number of features within each cluster.

Large point layers can be deceptive. What appears to be just a few points can in reality be several thousand. Clustering allows you to visually represent large numbers of points in relatively small areas, making this an effective way to show areas where many points stack on top of one another.

A clustered point layer of global power plants. Clustering allows you to view spatial patterns in points that may not be possible to see otherwise. — Global power plants clustered by screen space. Clustering allows you to view spatial patterns in points that may not be possible to see otherwise.

Heatmap

A Heatmap renders point features as a raster surface, emphasizing areas with a higher density of points along a continuous color ramp.

Heatmaps can be used as an alternative to clustering, opacity, and bloom to visualize overlapping point features. Unlike these alternative techniques for visualizing density, you can use a heatmap to weight the density of points based on a data value. This may reveal a different pattern than density calculated using location alone.

Fatal traffic accidents involving alcohol. This is a point layer rendered as a heatmap to easily see areas where more fatalities occur than others. Heatmaps make the patterns obvious in ways only displaying raw locations cannot.

Opacity

Clustering and heatmap are only available for point layers. However, when polygons overlap, you can use per feature opacity to visualize their density.

Any layer with lots of overlapping features can be effectively visualized by setting a highly transparent symbol on all features (at least 90-99 percent transparency works best). This is particularly effective for showing areas where many polygons and polylines stack on top of one another.

Flash flood warning density in the United States. Using opacity to visualize density has a similar effect as heatmap. You can use this technique for any 2D geometry type.

Bloom

Bloom is a visual effect that brightens symbols representing a layer’s features, making them appear to glow. This has an additive effect so areas where more features overlap will have a brighter and more intense glow. This makes bloom an effective way for visualizing dense datasets, especially against dark backgrounds.

Aggregation

Aggregation allows you to summarize (or aggregate) large datasets with many features to layers with fewer features. This is typically done by summarizing points within polygons where each polygon visualizes the number of points contained in the polygon.

There are a couple of scenarios where this technique can be more effective than others:

1. The point dataset is too large to cluster client-side. Some point datasets are so large they cannot reasonably be loaded to the browser and visualized with good performance. Aggregating points to a polygon layer allows you to represent the data in a performant way.

2. Data can be summarized within irregular polygon boundaries. You may be required to summarize point data to meaningful, predefined polygon boundaries, such as counties, congressional districts, school districts, or police precincts. Clustering is always handled in screen space without regard for geopolitical boundaries. There are scenarios where summarizing by predefined irregular polygons is required for policy makers.

Earthquakes aggregated to hexbins. Hexbins are a great way to preprocess continuous data that doesn't respect geopolitical boundaries.

Homicides aggregated to police precincts. You may be required to summarize large datasets to meaningful political boundaries.

Thinning

Thinning is a method of decluttering the view by removing features that overlap one another. This is helpful when many features overlap and you want to display uncluttered data, but don’t necessarily need to communicate its density.

Roads and highways in Singapore. Major features such as highways and freeways should be displayed at small scales. Smaller, more detailed features should only display at large scales.

Visible scale range

Some large datasets can’t be reasonably visualized at certain scales. For example, it doesn’t make sense to display census tracts at scales where you see the whole globe because tracts typically represent neighborhoods and small communities. Many polygons at that scale would be smaller than a pixel, making them useless to the end user.

Setting visible scale range on your layers also helps reduce the initial data download to the browser. Don’t display too much data if you don’t have to!

Showing Census tracts may not be best at the initial map scale. Summarizing a variable by state, then county, then tracts can be more effective depending on the scales required for display.

Conclusion

At the bottom of each page, you’ll find the following chart, which summarizes which technique you should (or can) use given geometry type (point, lines, polygons, mesh), view type (2D vs 3D), and whether the data can all be loaded to the browser (client-side), or requires server-side preprocessing.

This API Support table is located at the bottom of each guide to help you understand whether the technique you read about is appropriate for the scenario you are considering. It can also help guide you to more appropriate techniques.

Be sure to check out the high density data visualization guides. I’d love to hear your feedback, including additional ideas that haven’t been covered.

Kristian Ekenes

Kristian Ekenes is a Principal Product Engineer at Esri specializing in data visualization on the web. He works on the ArcGIS Maps SDK for JavaScript, ArcGIS Arcade, and Map Viewer in ArcGIS Online. Kristian's work focuses on researching and developing new and innovative data visualization capabilities of geospatial data in web maps, Arcade integration in web maps, and applications of generative AI assistants in web maps. Prior to joining Esri, he worked as a GIS Specialist for an environmental consulting company. Kristian has degrees from Brigham Young University and Arizona State University.

Article Discussion:

0 Comments

Oldest

Newest

Inline Feedbacks

View all comments

April 22, 2021 | Kristian Ekenes | Developers

Data visualization in the ArcGIS API for JavaScript
December 9, 2021 | Julie Powell | Developers

What’s new in the ArcGIS API for JavaScript (version 4.22)
April 27, 2020 | Kristian Ekenes | Mapping

Mapping large datasets on the web