By Charlie Frye, Esri Chief Cartographer
Those of you who regularly use the classification dialog will have noticed that a new classification method was added in version 9.2. The geometrical interval classification method, which was called “smart quantiles” when it was originally introduced in the Esri Geostatistical Analyst extension, is now available for all data.
This classification method was designed for visualizing continuous data and provides an alternative to Natural Breaks (Jenks), quantiles, and really any classification method that minimizes variance within classes. The specific benefit of the geometrical interval classification is that it works reasonably well on data that are not normally distributed. In fact, this method was designed to work on data that are heavily skewed by a preponderance of duplicate values, e.g., 35% of the features have a value of 2.0. For example, it could be used on a rainfall dataset in which only 15 out of 100 weather stations have recorded precipitation and the rest have recorded none, so their attribute values are zero.
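To see why a preponderance of duplicate values causes trouble for quantiles, here is a minimal Python sketch. The rainfall readings are invented for illustration (85 dry stations and 15 wet ones), and the breaks come from Python's standard `statistics.quantiles` function rather than from the ArcGIS classification dialog, so treat it as a rough analogy and not a reproduction of the software's behavior.

```python
import statistics

# Hypothetical rainfall readings (mm): 85 dry stations and 15 wet ones.
rainfall = [0.0] * 85 + [1.2, 2.5, 3.1, 4.8, 6.0, 7.4, 9.9, 12.3,
                         15.0, 18.6, 22.1, 27.5, 33.0, 41.2, 55.7]

# Four cut points divide the data into five quantile classes.
breaks = statistics.quantiles(rainfall, n=5, method="inclusive")

# Count how many of those cut points are actually different values.
distinct_breaks = sorted(set(breaks))

print("quantile breaks:", breaks)
print("distinct breaks:", distinct_breaks)
```

With this sample, every cut point lands on zero, so the five quantile classes collapse into one meaningful break and all of the stations that recorded rain end up in the same class.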
The Geostatistical Analyst team’s intent was to provide a method that was better than quantiles for visualizing prediction surfaces, which often do not have a normal data distribution. In terms of how it works, the geometrical interval method is not much different from a progression classification (binary, geometric, logarithmic, etc.), but it does add the wrinkle of the coefficient, which is explained in the online help. Since this method is really intended to be used as part of a data visualization process, it may not be very useful as a data presentation method unless there is a compelling quantitative reason. For instance, we would have a tough time just explaining how the classification method works, so at a minimum we recommend including a histogram with the class breaks overlaid to show readers what the classes mean relative to the data’s distribution.
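For readers who have not met a progression classification before, the following Python sketch shows a plain geometric progression, where each class is a fixed multiple wider than the one before it. This is not the geometrical interval algorithm itself: the ratio here is picked by hand, whereas the coefficient described in the online help is determined by the software. The function names `geometric_breaks` and `class_counts` and the sample values are our own, chosen only for illustration.

```python
from bisect import bisect_left


def geometric_breaks(vmin, vmax, classes, ratio):
    """Return upper class limits whose widths grow by a constant ratio."""
    if classes < 1 or ratio <= 0:
        raise ValueError("classes must be >= 1 and ratio must be > 0")
    span = vmax - vmin
    if ratio == 1:
        first_width = span / classes  # degenerates to equal intervals
    else:
        # Sum of a geometric series: first * (ratio**classes - 1) / (ratio - 1) = span
        first_width = span * (ratio - 1) / (ratio ** classes - 1)
    breaks = []
    upper = vmin
    for i in range(classes):
        upper += first_width * ratio ** i
        breaks.append(upper)
    breaks[-1] = vmax  # guard against floating-point drift in the last limit
    return breaks


def class_counts(values, breaks):
    """Count values per class; each upper limit is inclusive."""
    counts = [0] * len(breaks)
    for v in values:
        idx = min(bisect_left(breaks, v), len(breaks) - 1)
        counts[idx] += 1
    return counts


# Hypothetical sample values, skewed toward the low end.
values = [0, 0, 0, 0.5, 1, 2, 3, 6, 15]
breaks = geometric_breaks(0, 15, classes=4, ratio=2)
counts = class_counts(values, breaks)

# A crude text histogram: one row per class, one '#' per value.
for limit, count in zip(breaks, counts):
    print(f"<= {limit:5.1f} | {'#' * count}")
```

The narrow classes fall at the low end of the range, where skewed data tend to pile up, and the text histogram at the end is a bare-bones version of the overlay we recommend showing to readers.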
For data presentation, we also find it is usually better (from a reader comprehension standpoint) to normalize skewed raw data and use a simpler, more easily understood classification method (like quantiles or equal intervals), or even Jenks, which is relatively easy to explain. If needed, use the “No Data” option in conjunction with normalizing the data.
We also Googled “Geometrical Interval” and found one case of geometrical interval classification being used inappropriately because the mapmaker thought it was helpful for identifying outliers, which is definitely not something this classification method does. We also found some older definitions for smart quantiles in the online help; the help topic referenced above is what we consider the definitive reference.