Information on supported data formats

This page provides more detailed information about each of the supported data formats in NCL. Information about referencing files and file variables is provided in the Files and file variables section of the reference guide. Everything discussed in that section applies to all supported data formats regardless of type. The discussion here focuses on specific conventions and limitations of each format and its NCL implementation.

NCL's supported data formats are:

netCDF - network common data format

Online documentation for netCDF is available at http://www.unidata.ucar.edu/packages/netcdf/index.html.

Nearly all the netCDF features are supported by NCL because NCL's data model is patterned after netCDF. The procedure filedimdef can be used to create file dimensions that are unlimited, and the procedure filevardef can be used to pre-define variables.

Strings are not a supported netCDF type. NCL maps all character attributes into type string for convenience. The conversion function stringtochar must be used to write string data into netCDF files. Errors can also be generated when attempting to write a longer string to an existing attribute.

Because netCDF does not support 64-bit integers, NCL does not support them in netCDF files either.

Global file attributes can be written to netCDF files by assigning attributes to the variable that references the file, in the same way that attributes are assigned to normal NCL variables. Doing this causes NCL to write the attribute into the file.

GRIB - Grids In Binary

Online documentation for GRIB is available at ftp://nic.fb4.noaa.gov/pub/nws/nmc/docs/gribed1.

Support for GRIB in NCL was added with release 4.0.1. It is mostly complete, with a few exceptions covered here.

GRIB is a very problematic file format. There is no official API for writing GRIB, so GRIB files are very often written incorrectly. This can cause minor to severe problems when reading these data with NCL. NCL has been tested on NCEP, FSL, and ECMWF GRIB data. ECMWF data have caused the most problems to date. ECMWF changes its parameter table regularly and without notice, which can cause some parameters to be mislabeled in NCL. If using ECMWF data, please contact ECMWF for information on their data. NCEP and FSL data typically do not cause NCL to behave badly. Data from other centers have been tested, but a comprehensive list is not available. Quite frequently, though, data from other centers have contained problems in the GDS and PDS sections. If data from other centers use an extended parameter table, NCL may not be able to correctly provide names and unit information for all the variables in the file. If you have problems reading GRIB files, please contact ncargfx@ncar.ucar.edu.

First and most important, GRIB is a read-only format in NCL: GRIB files cannot be created using NCL. The main reason for this is that GRIB files must contain the IDs of the meteorological center writing the data and of the model used to generate the data. Since NCL is not a model and could be used at locations that don't have a center ID, NCL has been designed to only read GRIB data.

Another important point about reading GRIB from within NCL is that GRIB often stores data on projected and computational grids that do not adhere to NCL's coordinate variable constraints. For example, one common GRIB grid type is a rectilinear grid of points sampled from a stereographic projection, commonly referred to as a tangent grid. In this situation, a single monotonic coordinate variable per dimension is not possible because the coordinates of each grid point are a function of both indices (i.e. lat = f(i,j) and lon = g(i,j)). NCL handles grids like this by providing two 2D arrays. One array contains the lat value for every grid point, and the other contains the lon value for every grid point.

NCL presents GRIB variables as having up to five dimensions. The dimensions are ordered [initial_time]x[forecast_time]x[level]x[gridx]x[gridy]. GRIB files contain single 2D slabs of data with headers that state the initial time, forecast time, and level to which the slab belongs. NCL, when opening a GRIB file, scans through all the records in the file and sorts them by variable type, initial time, forecast time, level type, and finally grid type. This allows NCL to present GRIB data in a fashion analogous to netCDF. If only one record for a specific variable is in the file, the dimension of the variable will only be [gridx]x[gridy].
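The way the dimensionality of a variable follows from its record headers can be sketched in Python. This is only an illustration of the idea, not NCL's implementation: the function name variable_dimensions, the tuple layout of a record header, the sample values, and the rule that a dimension with a single distinct value is left out are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# One GRIB record header reduced to the three values named in the text:
# (initial_time, forecast_time, level). Hypothetical layout for illustration.
RecordKey = Tuple[str, int, int]


def variable_dimensions(records: Sequence[RecordKey],
                        grid_shape: Tuple[int, int]) -> Tuple[List[str], List[int]]:
    """Return dimension names and sizes for one variable's 2D records."""
    names: List[str] = []
    sizes: List[int] = []
    for position, name in enumerate(("initial_time", "forecast_time", "level")):
        distinct = {record[position] for record in records}
        # Assumption: a dimension with only one distinct value is omitted.
        if len(distinct) > 1:
            names.append(name)
            sizes.append(len(distinct))
    # Every record is a 2D slab, so the grid dimensions are always present.
    names.extend(["gridx", "gridy"])
    sizes.extend(grid_shape)
    return names, sizes


# Hypothetical input: two forecast times at three levels, one initial time.
records = [("01/15/1998 (00:00)", hour, level)
           for hour in (0, 6) for level in (850, 700, 500)]
print(variable_dimensions(records, (53, 45)))
```

With a single record, the same function returns only the two grid dimensions, matching the last case described above.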

Since GRIB files can have the same variable written with different grids, different time range indicators, and different pressure level indicators, the NCL implementation has to use naming conventions for variables, dimensions, and grids that guarantee unique variable names. For example, consider the variable TMP (temperature). One GRIB file could contain many variations of this variable. One record could be the average temperature, another could be the difference in temperature from one time to the next, and yet another could be the temperature at the tropopause. Clearly these are each different variables in the file, but GRIB identifies them all as TMP. Therefore, NCL has conventions for distinguishing these different variables.

Before continuing with NCL naming conventions, note that the NCAR Graphics distribution comes with a sample GRIB file. The following NCL statement will access this sample file if your site administrator has installed the data.

Further discussions of NCL naming conventions will refer to variables in this sample data file.

The GRIB file variable names are constructed from information in the GRIB record's "product definition section" (PDS). The first portion of any variable name is its abbreviation taken directly from NCEP's GRIB specification document. If the variable is unrecognized -- which could happen because centers often change their GRIB files without telling anyone -- the variable is given the name "VAR" plus its variable identifier number. The variable name is followed by an underscore ("_") followed by a number that represents the GRIB grid number the data are written in. After this, there is another underscore followed by an abbreviation of the type of level coordinates the data exist in. After this, there is an optional string that communicates what type of time range indicator is used with the data. Just as with the variable name, if the time range indicator is unrecognized, then the time range indicator number is concatenated to the end of the variable name.

The best way to put this all together is to look at some examples from the file above. The following are the temperature variables defined in this GRIB file. There are six different temperature variables. In each case the variable name starts with its abbreviation from the GRIB document TABLE 2, in this case "TMP". This set of variables is defined over two grids: grid 6 is a 2385-point 53x45 N. Hemisphere polar stereographic grid, and grid 101 is a 10283-point 113x91 N. Hemisphere polar stereographic grid. The final strings here are the level abbreviations, also taken from the GRIB document in TABLE 3 or TABLE 3a. "TRO" stands for "tropopause level," "ISBL" stands for "isobaric level," "GPML" stands for "fixed height level," "SIGL" stands for "sigma level," and "SIGY" stands for "layer between two sigma levels."

Each variable also has attributes defined from the GRIB product definition section. These typically are "center," "units," and "long_name." None of these variables has a special time range indicator, meaning each value is a "snapshot" of the data at the given time. In the example data file, only a few variables have a time range indicator; one variable is named "A_PCP_105_SFC_acc." This variable references total precipitation on grid 105 at the surface. The "_acc" component implies that each value in the data is an accumulation. Currently in NCL only "_ave" for average, "_acc" for accumulation, and "_dif" for difference have string values; any other time range indicators have an underscore followed by the time range indicator number from the product definition section of the variable's GRIB record.
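The variable naming convention can be summarized with a short Python sketch. It is a reading of the rules described above and not code taken from NCL; the function name grib_variable_name and its parameters are hypothetical, and the "VAR" fallback for unrecognized parameters is deliberately left out.

```python
from typing import Optional, Union

# Time range suffixes that the text says have string forms.
NAMED_TIME_RANGES = ("ave", "acc", "dif")


def grib_variable_name(abbreviation: str,
                       grid_number: int,
                       level_type: str,
                       time_range: Optional[Union[str, int]] = None) -> str:
    """Compose <abbreviation>_<grid number>_<level type>[_<time range>]."""
    name = f"{abbreviation}_{grid_number}_{level_type}"
    if time_range is None:
        # No special time range indicator: a "snapshot" variable.
        return name
    if isinstance(time_range, str) and time_range not in NAMED_TIME_RANGES:
        raise ValueError(f"unknown time range suffix: {time_range!r}")
    # A recognized suffix is appended as text; anything else is passed
    # as the raw time range indicator number.
    return f"{name}_{time_range}"


print(grib_variable_name("A_PCP", 105, "SFC", "acc"))
print(grib_variable_name("TMP", 6, "ISBL"))
```

The first call reproduces the name "A_PCP_105_SFC_acc" discussed above; the second composes a snapshot temperature name on grid 6 from the pieces the convention describes.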

Dimension names and coordinate variables also have a naming convention. The most important part of it is how the grid coordinate variables and dimensions are named. If the grid is of a type whose grid-point coordinates can be represented with two monotonic coordinate variables, the dimensions and the grid coordinate variables are given the same names. These kinds of grids include Mercator and Cylindrical Equidistant lat/lon grids.

In this first case, the dimensions are named "lat" and "lon" with an underscore followed by the GRIB grid number and can be used as normal NCL coordinate variables. Grid types that do not fit into the NCL coordinate variable convention have coordinate variables that are named differently from their respective dimensions. In this second case, the dimensions are named "gridx" and "gridy" followed by the GRIB grid number. The coordinates for each grid point are provided by NCL as variables. These are two 2D arrays, one containing all the latitude points and the other all the longitude points for a specific grid type. These variables are named "gridlat" and "gridlon" followed by an underscore and the GRIB grid number. These variables have attributes that describe the type of grid and the coordinates at the corners of the grid. The following is an excerpt from the print of the GRIB file "/grb/ced1.lf00.t00z.eta.grb" showing what these coordinate variables look like.
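As a compact restatement of the two cases, the following Python sketch returns the dimension and coordinate variable names for a grid. The function name grid_coordinate_names and the flag has_1d_coordinates are hypothetical, and the underscore between "gridx"/"gridy" and the grid number is an assumption made by analogy with the other names; the text above does not state it.

```python
from typing import Dict, Tuple


def grid_coordinate_names(grid_number: int,
                          has_1d_coordinates: bool) -> Dict[str, Tuple[str, str]]:
    """Return dimension and coordinate variable names for one GRIB grid."""
    if has_1d_coordinates:
        # Case 1: two monotonic 1D coordinates, so the dimensions and the
        # coordinate variables share the same names.
        names = (f"lat_{grid_number}", f"lon_{grid_number}")
        return {"dimensions": names, "coordinates": names}
    # Case 2: coordinates are 2D arrays named differently from the dimensions.
    # Assumption: an underscore separates "gridx"/"gridy" from the grid number.
    return {
        "dimensions": (f"gridx_{grid_number}", f"gridy_{grid_number}"),
        "coordinates": (f"gridlat_{grid_number}", f"gridlon_{grid_number}"),
    }


# Grid 6 is the polar stereographic grid from the sample file.
print(grid_coordinate_names(6, has_1d_coordinates=False))
```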

Level coordinate variables and dimensions begin with the prefix "lv_" followed by the level abbreviation from the GRIB document followed by the dimension number. The dimension number is needed because often the same coordinate variable with the same units will exist for different variables with different values. The addition of the dimension number following the dimension name creates a unique dimension and coordinate variable combination.

The forecast_time coordinate is in hours, and the initial_time coordinate variable is a string with the month, day, year, hour, and minute of the initial time in the following format: MM/DD/YYYY (HH:MM).
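Outside NCL, the two time coordinates can be combined to get the time a forecast is valid for. The Python sketch below parses the documented MM/DD/YYYY (HH:MM) string and adds the forecast offset in hours. The function name valid_time and the sample values are made up for this example, and "valid time" is a common forecasting term rather than one used on this page.

```python
from datetime import datetime, timedelta

# strptime pattern matching the documented "MM/DD/YYYY (HH:MM)" layout.
INITIAL_TIME_FORMAT = "%m/%d/%Y (%H:%M)"


def valid_time(initial_time: str, forecast_hours: float) -> datetime:
    """Return initial_time plus a forecast_time offset given in hours."""
    start = datetime.strptime(initial_time, INITIAL_TIME_FORMAT)
    return start + timedelta(hours=forecast_hours)


# Hypothetical values: a 12-hour forecast from an 18:00 initial time.
print(valid_time("01/15/1998 (18:00)", 12))
```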

HDF - Hierarchical Data Format - Scientific Data Sets (SDS) only

Online documentation for HDF is available at http://hdf.ncsa.uiuc.edu/.

Note: NCL currently does not handle HDF-EOS data. Support for the conventions used in HDF-EOS data is under development.

HDF support in NCL is somewhat limited. NCL only reads data written using the SDS interface. Importing HDF files is entirely analogous to importing netCDF files as described above, with some minor exceptions. First, if variable names contain spaces or non-alphanumeric characters, these characters are replaced with the '_' character when the names are listed from NCL, and the variables must be referenced from NCL using the replaced names.
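To predict the name under which an HDF variable will appear, the replacement rule can be approximated in Python as below. This is a sketch of the rule as stated, not NCL's code: the function name ncl_style_name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that each offending character becomes exactly one underscore.

```python
import re


def ncl_style_name(hdf_name: str) -> str:
    """Replace every non-alphanumeric character (spaces included) with '_'."""
    # Assumption: ASCII letters and digits are kept; each other character
    # is replaced by a single underscore, one for one.
    return re.sub(r"[^A-Za-z0-9]", "_", hdf_name)


# Hypothetical HDF variable name containing spaces and parentheses.
print(ncl_style_name("Sea Surface Temp (K)"))
```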

Writing variables to HDF files from NCL is currently somewhat limited. Only the values of variables and their names are written to files in the HDF format. Attributes and dimension names cannot currently be written from NCL; attempting to write an NCL variable that has attributes or dimension names to an HDF file generates warning messages.

There is currently no way to access 8-bit and 24-bit HDF images from NCL. There is also no way to access VGROUP or VDATA HDF data classes.

CCM - Community Climate Model History Tape Format

The CCM format, originally in CRAY COS blocked form, is written by the NCAR CCM1, CCM2, and CCM3 global climate models. IEEE CCM files also exist, but NCL currently does not support them due to lack of documentation. The public domain tool "ccm2nc" (available on almost all SCD computers; see "man ccm2nc") can be used to convert these files to netCDF, and NCL can then reference the netCDF file(s). If you are not on an SCD machine, the "ccm2nc" software can be downloaded from http://neit.cgd.ucar.edu/cms/ccm3/tools/.

CCM files are pretty straightforward (no special naming convention is needed as with GRIB files); the variable names and unit information are stored as character data in the CCM files. When a CCM file is opened, NCL scans the file and creates an index of all the data in the file. This can be expensive for large files, but it makes access to individual variables of the file fast. Because of this cost, you should avoid repeatedly calling addfile on the same file whenever possible.

For more information on the CCM model and CCM file format, see the CCM3 User's Guide.

