ArcGIS Blog

Accessing Multidimensional Scientific Data using Python

By Kevin Butler

 

With the 10.3 release, a new Python library, netCDF4, began shipping as part of the ArcGIS platform.  netCDF4 allows you to easily inspect, read, aggregate and write netCDF files.  NetCDF (Network Common Data Form) is one of the most important formats for storing and sharing scientific data.

The ArcGIS platform has had geoprocessing tools that read and write netCDF data since the 9.x release.  However, there may be times when you want to access or create netCDF data directly using Python.  There are four ways of interacting with netCDF files in ArcGIS: geoprocessing tools, the NetCDFFileProperties ArcPy class, the new netCDF4 Python module, and the multidimensional mosaic dataset.  Which method you use depends on what you are trying to accomplish.  The four methods are summarized at the end of this post.  This blog post will focus on the new netCDF4 Python library.

The netCDF4 library makes it easy for Python developers to read and write netCDF files.  For example, this code snippet opens a netCDF file, determines its type, and prints the first data value:

[code language="python"]
>>> import netCDF4           # the module name is case sensitive
>>> d = netCDF4.Dataset(r'c:\data\temperature.nc', 'r')
>>> d.data_model
'NETCDF3_CLASSIC'
>>> print(d.variables['tmin'][0, 0, 0])   # tmin[year, lat, lon]
-23.2861
[/code]

The netCDF4 module returns the data read from a netCDF file as a NumPy array.  This means you have access to the powerful slicing syntax of NumPy arrays.  Slicing allows you to extract part of the data by specifying indices.  The variable tmin in the example above has three dimensions: year, latitude and longitude.  You can specify an index (or a range of indices) for each dimension to slice the three-dimensional data cube into a smaller cube.  This code snippet extracts the first five years of data for the variable tmin and prints summary statistics:

[code language="python"]
>>> A = d.variables['tmin'][0:5, :, :]
>>> print("Minimum = %.2f, Maximum = %.2f, Mean = %.2f" % (A.min(), A.max(), A.mean()))
Minimum = -60.77, Maximum = 27.69, Mean = 0.41
[/code]

Here are some potential uses of the netCDF4 module:

  • Build custom geoprocessing tools that process netCDF data (see the Create Space Time Cube tool in the Space Time Pattern Mining toolbox for an example)
  • Perform advanced slicing (sub-setting) of a netCDF file.  For example, this statement reads data for every other year for the variable tmin:
    [code language="python"]
    A = d.variables['tmin'][0::2, :, :]
    [/code]
    Skipping over data is sometimes referred to as specifying a stride.
  • Read netCDF files which contain groups.  The latest netCDF data model, netCDF-4, supports organizing variables, dimensions and attributes into hierarchical groups within the file.  The Multidimension tools were designed and built on an earlier version of the netCDF data model, which didn't support groups, and therefore can only access the first group in the file.
  • Access scientific data stored on a remote server.  The netCDF4 module supports the OPeNDAP protocol.  OPeNDAP is widely used in the earth sciences to deliver scientific data.  To access data stored on a remote server, you can pass an OPeNDAP URL in place of a filename when creating the Dataset object.  For example, this code will open a dataset on a remote server:
    [code language="python"]
    >>> d = netCDF4.Dataset(r'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Puerto_Rico_0p5deg/GFS_Puerto_Rico_0p5deg_20150408_1800.grib2/GC', 'r')
    [/code]

See http://unidata.github.io/netcdf4-python/ for documentation of the netCDF4 module.

** Note: Previously this blog post erroneously reported that the netCDF4 Python library began shipping as part of the ArcGIS platform at version 10.2.  It did not ship with the platform until version 10.3. **

Methods of interacting with netCDF files:

Geoprocessing tools (all of the tools located in the Multidimension toolbox)
  Use this method to:
    • create a map, table or chart from netCDF data
    • chain netCDF data to other tools in a GIS workflow
    • work with a single slice of the data
  Benefits and limitations:
    • supports reading and writing netCDF files
    • easy access through the familiar geoprocessing tool UI
    • works with one slice of the data at a time
  Example: Explore spatial patterns in precipitation

netCDF4 Python module
  Use this method to:
    • access netCDF data directly using Python
    • have more control over the structure and contents of a netCDF file
  Benefits and limitations:
    • advanced slicing of data
    • easy access to a number of NumPy functions
  Example: Build a custom geoprocessing tool to combine several netCDF files into a single file

NetCDFFileProperties ArcPy class
  Use this method to:
    • explore the structure of a netCDF file
  Benefits and limitations:
    • easy way to access the properties of variables and dimensions
    • no access to the data
  Example: Create an inventory of a large collection of netCDF files

Multidimensional mosaic dataset
  Use this method to:
    • temporally and/or spatially aggregate a large collection of netCDF files
    • perform 'on-the-fly' analysis
    • serve data as a service
  Benefits and limitations:
    • only works with regularly gridded data
    • can manage very large collections of files
  Example: Aggregate model output from different regions into one seamless dataset
