NCL variables overview

Properties of variables

Variable names must begin with an alphabetic character, but they can contain any mix of numeric and alphabetic characters. The underscore ('_') is also allowed. Variable names are also case sensitive. The maximum variable name length is currently 256 characters. Variables reference arrays of multi-dimensional data. These data can be described by variable attributes, named dimensions, and coordinate variables. Variables can also reference files and graphical objects.

The following are examples of unique variable names:
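
As a small illustration of the naming rules above, the names below are all legal and all distinct. The names themselves are assumptions chosen for this sketch; they do not come from the original listing.

```ncl
; Names start with a letter and may contain letters, digits, and '_'.
; Names are case sensitive, so a and A are two different variables.
a = 1
A = 2
temp_2m = 3
T2m_max = 4
print(a)
print(A)
```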

Variables, as in other programming languages, are textual names that reference data. In NCL, somewhat like Fortran, variables can be created without previously defining them. This is called implicit instantiation. Unlike Fortran, though, the type of a variable is based on the type of the value assigned to it, not on its name. Unlike in many other languages, variables in NCL can be deleted, or changed from defined to undefined.

NCL has been designed with special features that allow ancillary information to be "attached" to a variable programmatically. NCL provides a unique syntax for storing and retrieving these ancillary data values. These ancillary data are divided up into three categories: variable attributes, dimensions, and coordinates. Variables can have an unlimited number of attributes assigned to them. Each dimension of a variable can have a name associated with it and optionally a coordinate variable. Variables become defined when an undefined name appears on the left side of an assignment statement.

There are three types of variables referenced in this document. The first and most obvious is the term variable which is defined as a textual reference to a multi-dimensional or scalar value. Second, there is the term variable that references a file, which is defined as a variable that is assigned the return value of the addfile function. These variables provide references to open files. Finally, the term file variable is defined to be a variable in a file that references a multi-dimensional data value. These terms will be used throughout this section of the reference guide to distinguish between different kinds of variables.

Attributes

Attributes are descriptive pieces of information that can be associated with an already existing variable, or file variable. They are very useful for communicating information to the user about specific data. Attributes can be assigned single-dimensioned arrays, but not files. Variable attributes are referenced by entering the variable name, followed by '@', followed by the name to be used to reference the attribute. If the attribute is not defined, then an error message is displayed. An attribute is created by referencing it on the left side of an assignment statement and assigning a value to it. The following are examples of creating the attributes units, long_name, and _FillValue in the variable named temperature:
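
A minimal sketch of creating these three attributes follows. The data values and attribute values are assumptions made for illustration.

```ncl
; Create the variable first; attributes can only be attached to an
; existing variable.
temperature = (/ 271.3, 280.5, -999.0 /)

; Referencing an attribute on the left side of an assignment creates it.
temperature@units = "Degrees K"
temperature@long_name = "Temperature"
temperature@_FillValue = -999.0

; Attributes are read back with the same '@' syntax.
print(temperature@units)
```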

Attributes can be used in expressions and subscripted in the same fashion as variables. Common uses of attributes are to store the units that data are stored in, and to store names and text that could be useful.

Missing values

The attribute _FillValue is a special reserved attribute name that denotes what values stored in a variable should be considered missing values. Whenever the _FillValue attribute is assigned a new value, every occurrence of the previous value in the variable temperature, in the above example, is replaced with the new _FillValue. The _FillValue attribute must be the same type as the data type referenced by the variable. The procedure delete is used to remove the missing value attribute. Once removed, all of the previous elements of the variable that were treated as missing values are treated as normal values.
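
The behavior described above can be sketched as follows. The values are illustrative assumptions, and the comments restate what the text says should happen rather than captured output.

```ncl
temperature = (/ 271.3, 280.5, -999.0 /)
temperature@_FillValue = -999.0     ; -999.0 now marks missing elements

; Assigning a new _FillValue replaces every occurrence of the old one.
temperature@_FillValue = -9999.0

; Removing the attribute turns the former missing elements into
; ordinary values.
delete(temperature@_FillValue)
print(temperature)
```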

The _FillValue attribute has many important uses in NCL. First and most important is how missing values are handled in expressions. When missing values appear in terms of an NCL expression, they are ignored for the specific element that contains the missing value. Consider the following example:
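
The original listing is not preserved here. The sketch below uses the array and fill value named in the discussion that follows, with `a * 2.0` as an assumed stand-in for the expression.

```ncl
a = (/ 27.2, -10.0 /)
a@_FillValue = -10.0

; The second element is missing, so it is skipped when the expression
; is evaluated. The expression itself is an assumption for this sketch.
b = a * 2.0
print(b)
```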

In this example, the constant array (/ 27.2, -10.0/) is assigned to the variable a. Next, the value -10.0 is assigned as the _FillValue attribute. When the expression is evaluated, the element equal to -10.0 is ignored, and the resulting array, referenced by b, has -10.0 as its _FillValue. See Expressions and missing values for more discussion.

The default missing values for the various types are:

Dimensions

Dimensions define the shape and size of the data referenced by variables. In NCL, dimensions are ordered using row x column ordering, which is identical to the C programming language. By convention, dimensions are numbered from 0 to n-1 where n is the number of dimensions of the data referenced. The dimension numbers are significant because NCL allows names to be associated with dimensions. This in turn facilitates coordinate subscripting and named subscripting.

A variable dimension is referenced by entering the variable name, followed by the '!' character, followed by the dimension number being referenced. If the dimension has been assigned a name, then this reference returns the name. To assign or change the name of a dimension, simply assign a string to the dimension reference. An assignment that names dimension 2 of temperature, for instance, is valid only if temperature has three or more dimensions.
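
A short sketch of naming dimensions with the '!' operator follows. The array shape is an assumption; the names "time", "lat", and "lon" match those used for temperature later in this section.

```ncl
; A three-dimensional variable, so dimension numbers 0, 1, and 2 exist.
temperature = new((/ 5, 6, 7 /), float)

temperature!0 = "time"
temperature!1 = "lat"
temperature!2 = "lon"

; Referencing a named dimension returns its name.
print(temperature!2)
```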

Coordinate variables

Coordinate variables are single-dimension arrays that have the same name and size as the dimension they are assigned to. These arrays represent the data coordinates for each index in the named dimension. When the values in these arrays are monotonically increasing or decreasing, they can be used in coordinate subscripting. If they are not monotonic or contain missing values, then coordinate subscripting will not work. The '&' operator is used to reference and assign coordinate variables. In order to assign a coordinate variable to a dimension, the dimension must have a name. These examples show assignment of variables to coordinate variables:
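
One way such assignments might look is sketched below. The shape and coordinate values are assumptions made for illustration.

```ncl
temperature = new((/ 2, 3 /), float)

; A dimension must be named before it can receive a coordinate variable.
temperature!0 = "lat"
temperature!1 = "lon"

; Each coordinate array has the same size as its dimension and is
; monotonic, so it can later be used for coordinate subscripting.
temperature&lat = (/ -10.0, 10.0 /)
temperature&lon = (/ 0.0, 90.0, 180.0 /)

print(temperature&lon)
```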

String references

Sometimes it is impossible to know the names of the attributes and coordinates before writing a script, or these names may vary from variable to variable. To solve this problem, string variables can be used to reference attributes and coordinates by enclosing the variable reference within dollar signs '$'. The following are examples of this:
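
A minimal sketch of string references follows. The variable, attribute, and dimension names are assumptions made for illustration.

```ncl
temperature = (/ 271.3, 280.5, 290.1 /)
temperature!0 = "lat"
temperature&lat = (/ -30.0, 0.0, 30.0 /)
temperature@units = "K"

; The names are held in string variables instead of being hard-coded.
attname = "units"
dimname = "lat"

print(temperature@$attname$)    ; same as temperature@units
print(temperature&$dimname$)    ; same as temperature&lat
```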

Variables used as parameters to functions and procedures

When functions and procedures are called in NCL, parameters are passed by reference. This means that, just as in Fortran, changes made within a function or procedure to a parameter are also applied to the variable in the calling environment. With NCL, however, there are some differences that must be considered when passing variables as parameters. In NCL, changes to named dimensions, coordinate variables, and attributes also affect the variable in the calling environment. Furthermore, if the variable is subscripted prior to calling a function or procedure, the values are remapped back into the original variable once the function or procedure terminates.

In the following example, the variable a is subscripted. Note how the assignments in the procedure set are propagated back to the calling environment:

	
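	; Changes the values, a dimension name, and the _FillValue of its parameter.
	procedure set(x)
	begin
		x = 1			; assign 1 to every element of x
		x!0 = "dim1"		; name dimension 0 of x
		x@_FillValue = 1	; mark the value 1 as missing
	end

	a = (/(/1,2,3/),(/4,5,6/),(/7,8,9/)/)

	; Pass row 1 of a; the changes are mapped back into a.
	set(a(1,:))

	print(a)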

	
It is important to consider this functionality whenever writing an NCL function or procedure that makes any kind of assignments to the input parameters.

Subscripts

There are three types of subscripting in NCL. First, standard subscripting is similar to the array subscripting available in Fortran 90. One very important item to note is that NCL dimension indexes start at 0 and end at n-1. Second, coordinate subscripting uses the data in the coordinate variables to determine which subsections of the array to select. Third, named subscripting uses the names of the dimensions to allow for array reordering operations. All three types of subscripting can be used in a single variable selection.

Standard subscripts

Standard subscripting provides the capability of selecting ranges and strides in addition to the ability to select data using a vector of integer indexes. All of this functionality is more or less duplicated from Fortran 90. Unlike Fortran, the array indexes begin at 0. Standard subscript indexes must always be integer; floating point numbers and strings are not accepted. The following is a simple example of a set of single subscripts for a three-dimensional variable with dimensions 5x6x7:
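
A minimal sketch of such single subscripts is given here. The index values and result names are assumptions made for illustration.

```ncl
temperature = new((/ 5, 6, 7 /), float)

; One integer index per dimension selects a single element.
first = temperature(0, 0, 0)    ; indexes start at 0
last  = temperature(4, 5, 6)    ; and end at n-1 in each dimension
```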

Note: Individual subscripts are separated by commas ',' and the entire subscript list is enclosed in parentheses.

A range subscript selection accepts a beginning and ending index separated by a colon ':'. Both the beginning index and the ending index are included in the selection; the range is therefore inclusive. Furthermore, if the start is greater than the end, then the selection reverses the ordering of the array. Some examples are:

The first selection selects a 3x1x1 subsection of the array temperature, and the second selection selects a 3x2x2 subsection.

In addition to the above style of selection, a stride can also be specified that causes the selection to skip over a given number of elements. For example, a value of 1 means that every element between the beginning and the end of the range will be selected. With values greater than 1, each selected index is the previous index plus the stride. Therefore a value of 3 selects the first, fourth, seventh, and so on.

The above selection uses strides to produce a 3x2x2 array.
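
The range and stride selections discussed above might look like the following sketch. The exact index values are assumptions chosen to match the stated 3x1x1 and 3x2x2 shapes for a 5x6x7 array.

```ncl
temperature = new((/ 5, 6, 7 /), float)

; Inclusive ranges: indexes 0, 1, and 2 of the first dimension.
sel1 = temperature(0:2, 3, 4)       ; a 3x1x1 subsection
sel2 = temperature(0:2, 3:4, 5:6)   ; a 3x2x2 subsection

; start:end:stride. The first dimension picks 0, 2, 4; the second
; picks 0, 3; the third picks 0, 4. The result is 3x2x2.
sel3 = temperature(0:4:2, 0:5:3, 0:6:4)
print(dimsizes(sel3))
```

Note that a dimension selected with a single index, as in sel1, is dropped from the result; writing the index as a range such as 3:3 keeps it as a dimension of size 1.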

There is no requirement that the start of a subscript range be less than the end. When the start is greater than the end, a reverse selection is done, meaning the order of the selected output is reversed relative to the original variable. For example:
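
A sketch of a reverse selection follows; the index values are assumptions made for illustration.

```ncl
temperature = new((/ 5, 6, 7 /), float)

; The start (4) is greater than the end (0), so the first dimension is
; returned in reverse order: indexes 4, 3, 2, 1, 0.
reversed = temperature(4:0, 0:5, 0:6)
```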

Another option for selection is to leave out the start, end, or both. This means that the start or end will default to the beginning or end respectively.
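
The selections described in the next paragraph can be written as in the sketch below. The 5x6x7 array is the one assumed in the earlier examples.

```ncl
temperature = new((/ 5, 6, 7 /), float)

; Omitted starts default to index 0; omitted ends default to n-1.
part = temperature(:2, :1, 5:)

; Leaving out both start and end selects a whole dimension.
whole = temperature(:, :, :)
```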

The first selection selects from the beginning to index 2 for the first dimension, the beginning to index 1 for the second dimension, and the final subscript selects from index 5 to the end for the 3rd dimension. The second example shows how the entire array can be selected.

The following uses the default range to reverse the ordering of each dimension using a negative stride:
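
A minimal sketch of this, again assuming a 5x6x7 array:

```ncl
temperature = new((/ 5, 6, 7 /), float)

; Default start and end with a stride of -1 reverses each dimension.
flipped = temperature(::-1, ::-1, ::-1)
```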

Finally, a vector of integer indexes can be used as a subscript. As long as all of the entries in the vector are within the bounds of the given dimension, the vector could be any size. Vector subscripting allows a single index to be selected more than once. For example, consider the following array and its use of vector subscripting on the variable temperature:
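
The original index vector is not preserved here. The one below is an assumption chosen to be consistent with the description that follows: six entries, the first three identical.

```ncl
temperature = new((/ 5, 6, 7 /), float)

; A one-dimensional integer vector used as a subscript. Index 1 is
; selected three times.
idx = (/ 1, 1, 1, 2, 3, 4 /)

selection = temperature(idx, :, :)
print(dimsizes(selection))
```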

This selection creates an array 6x6x7 which is actually bigger than the original. The first, second, and third indexes of the first dimension contain identical arrays. The vector must always be a single-dimensioned array of integers.

Coordinate subscripts

Coordinate subscripts use the coordinate variables associated with a variable to determine which indexes are used in the selection. When specifying a coordinate subscript, braces '{' and '}' indicate the start and end values of the coordinate variable that will be used to select the indexes. Essentially, the start and end values are "looked" up in the coordinate variable, and the indexes are used to make the subselection. The following are examples of coordinate subscripts. Note that coordinate and standard subscripting can be mixed in the same variable subscript. Also, stride is still specified as an integer stride. If the coordinate values used in the subscript do not exactly match values in the coordinate variable, all coordinate values that fall within the coordinate subscript range are selected. If the values do match, then they are selected in an inclusive fashion.
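
Coordinate subscripts might be used as in the following sketch. The variable name, shape, and coordinate values are assumptions made for illustration.

```ncl
elev = new((/ 4, 5 /), float)
elev!0 = "lat"
elev!1 = "lon"
elev&lat = (/ 0.0, 20.0, 40.0, 60.0 /)
elev&lon = (/ -140.0, -120.0, -100.0, -80.0, -60.0 /)

; Braces select by coordinate value. 20 and 60 match lat values exactly
; and are included; -135 and -65 match no lon value, so the lon values
; that fall inside the range are selected.
sub1 = elev({20:60}, {-135:-65})

; Coordinate and standard subscripts can be mixed.
sub2 = elev({15:45}, 0:2)

; The stride remains an integer count of indexes, not a coordinate step.
sub3 = elev({0:60:2}, :)
```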

Coordinate subscripting only works when the coordinate variables assigned to the variables are monotonically increasing or decreasing. If an attempt is made to subscript a coordinate variable that is not monotonic, an error message is generated.

Named subscripting

Named subscripting is a means by which arrays can be reordered. Named subscripting requires that each dimension of the variable being subscripted is named. If one or more are not named, an error message is printed. The following are examples of named subscripting. The dimension names of the variable temperature are "time", "lat", and "lon" for dimensions 0 through 2 respectively.
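
A sketch of the two kinds of selection discussed next is given below. The array shape, coordinate values, and ranges are assumptions, and the combined brace-and-name form should be checked against the NCL reference before relying on it.

```ncl
temperature = new((/ 2, 3, 4 /), float)
temperature!0 = "time"
temperature!1 = "lat"
temperature!2 = "lon"
temperature&lat = (/ 0.0, 30.0, 60.0 /)
temperature&lon = (/ -140.0, -110.0, -80.0, -50.0 /)

; dimension_name | subscript, listed in the desired output order.
swapped = temperature(time | :, lon | :, lat | :)

; The same reordering combined with coordinate subscripting.
subset = temperature(time | :, {lon | -135:-65}, {lat | 20:60})
```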

The first example "swaps" the lat and lon dimensions. The second example shows a similar dimension reordering but utilizes coordinate subscripting.

Using string reference with named subscripting

It is not necessary to "hard-code" the names of dimensions when using named subscripting. Alternatively, dollar signs '$' can be placed around a string variable. This causes NCL to use the variable's string value as the dimension name. The following shows how an array can be reordered without knowing the names of the dimensions:
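
A minimal sketch of this follows. The setup and the helper names d0, d1, and d2 are assumptions made for illustration.

```ncl
temperature = new((/ 2, 3, 4 /), float)
temperature!0 = "time"
temperature!1 = "lat"
temperature!2 = "lon"

; Look up the dimension names at run time instead of hard-coding them.
d0 = temperature!0
d1 = temperature!1
d2 = temperature!2

; Swap the last two dimensions using the string values as names.
reordered = temperature($d0$ | :, $d2$ | :, $d1$ | :)
```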

Variable assignment

It is important to understand what happens when a variable is used in an assignment statement in NCL. The assignment statement functions differently depending on whether the variable being assigned to is currently undefined or defined. The assignment statement also functions differently depending on whether a variable or a value occupies the right side of the assignment.

When a variable appearing on the left side of an assignment has not been defined or was previously deleted, the assignment statement causes the variable to become defined, and the data type and dimensionality of the variable are determined by the right side.

When a variable appearing on the left side is already defined, then the right side must have the same type, or be coercible to the type on the left, and the right side must have the same dimensionality.

Value-only assignment

Value-only assignments to variables are fairly straightforward. In essence, value-only assignments mean that the right side of the assignment is not a variable, it is the result of an expression, a value. In this case, if the left side variable reference was not defined prior to the assignment statement, the variable on the left side becomes defined and references the value of the right side. No dimension names, coordinate variables or attributes other than _FillValue are assigned. If the right side of the expression does not contain any missing values, then _FillValue is not assigned either.

If the left side variable was defined prior to the assignment statement, then the variable on the left side is assigned the value of the right side. If the left side is a subscripted reference to a variable, then the right side elements are mapped to the appropriate location in the left side variable. If the left side has any attributes, dimension names, or coordinate variables, they are left unchanged, since only a value is being assigned to the left side variable. When the left side is defined, the type and the dimensionality of the right side must match those of the left. However, there is one exception to the requirement that the dimension sizes of the left side and the right side match: a single scalar value can be assigned to more than one location. Consider the following example:
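
The original listing is not preserved here. The sketch below uses an assumed six-element array.

```ncl
a = (/ 1, 2, 3, 4, 5, 6 /)

; A scalar on the right side is assigned to every selected element.
a(0:3) = -1
print(a)
```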

This example demonstrates the value of -1 being assigned to the first four elements of the variable a.

Variable-to-variable assignments

During variable-to-variable assignment, attributes, coordinate variables, and dimension names are assigned in addition to the actual multi-dimensional values. Before discussing this type of assignment, it is important to note that the array designator characters '(/' and '/)' can be used when assigning one variable to another to force only the right side's value to be assigned to the left side; the right side's attributes, dimensions, and coordinates are then ignored. Essentially, using the array designator characters forces a value-only assignment. The following shows how the array designator characters can be used to do this:
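
A minimal sketch, with assumed variable names and values:

```ncl
a = (/ 1.0, 2.0, 3.0 /)
a@units = "m"
a!0 = "x"

; Wrapping the right side in (/ and /) assigns only its values;
; b receives neither the units attribute nor the dimension name.
b = (/ a /)
```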

Variable-to-variable assignment occurs when both the left side and the right side are variables. In this situation, the assignment statement also tries to assign attributes, dimension names, and coordinates of the right side to the left side.

The two simplest cases are:

In both of these situations, all of the right side's attributes, coordinates, and dimension names are assigned to the left side. If the left side has the same dimension and coordinate names, then only the coordinate variable is overwritten with the value and attributes of the right side's coordinate variables. However, if the dimension names do not match, a warning message is generated and the names and coordinate variables of the left side are overwritten. As for attributes, if the left side has attributes, then the left side's attribute list is merged with that of the right side. If the same attribute name appears on both the left and right sides, the right side's attribute overwrites the left side's. If the types of the attribute values do not match, a type mismatch error can occur.

The following are examples of some variable-to-variable assignment situations:
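
The first situation can be sketched as follows. All names and values are assumptions made for illustration.

```ncl
a = (/ 1.0, 2.0, 3.0 /)
a!0 = "x"
a&x = (/ 10.0, 20.0, 30.0 /)
a@units = "m"

; b is undefined, so it receives the values, attributes, dimension
; name, and coordinate variable of a.
b = a

; c receives only the values of a.
c = (/ a /)
```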

This first example shows assignment to an undefined variable and then shows the use of the array designator characters '(/' and '/)' to perform a value-only assignment.
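
The second situation, in which both sides are already defined and their dimension names differ, can be sketched as below. The names and values are assumptions, and the comments restate the behavior described in this section rather than captured output.

```ncl
a = (/ 1.0, 2.0, 3.0 /)
a!0 = "x"
a@units = "m"

b = (/ 4.0, 5.0, 6.0 /)
b!0 = "y"
b@units = "km"
b@long_name = "distance"

; The dimension names differ ("x" versus "y"), so the left side's name
; is overwritten and a message is issued. The attribute lists are
; merged, with the right side's units replacing the left side's.
a = b
print(a)
```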

This second example demonstrates a defined variable being assigned to a defined variable. Note the changes that the assignment makes to the dimension names, attribute values, and coordinate variables of variable a. Assignments that change the left side's coordinates and dimension names generate warning messages: when the left and right dimension names differ, NCL considers this a condition that the user should be warned about. To avoid these messages, either make sure before assignment that the left and right sides have the same dimension names or, if you only want to assign a value and do not care about attributes, dimensions, and coordinate variables, enclose the right side in '(/' and '/)', which forces NCL to use only the value of the right side.

The remaining case is when the left side is subscripted, so that only a portion of the target variable is being assigned to. The simplest case here is when the left and right dimension names are the same and both sides have coordinate variables for the same dimensions. In this case, assignment occurs for each coordinate variable: the subscripted left-side coordinate variable is assigned the right-side coordinate variable. The attribute list of the right side is merged with that of the left side and assigned to the left side variable. The following demonstrates this kind of variable-to-variable assignment.
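
A sketch of a subscripted variable-to-variable assignment follows. The names, sizes, and values are assumptions made for illustration.

```ncl
a = new((/ 5 /), float)
a!0 = "x"
a&x = (/ 0.0, 1.0, 2.0, 3.0, 4.0 /)

b = (/ 10.0, 20.0 /)
b!0 = "x"
b&x = (/ 1.0, 2.0 /)
b@units = "m"

; Elements 1 and 2 of a, and of its coordinate variable a&x, are
; assigned from b and b&x. The units attribute is merged into a.
a(1:2) = b
print(a)
```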

If the left side variable does not have a coordinate variable and the right side does, a coordinate variable is created and assigned. If the left side is subscripted, then the created coordinate variable only has values assigned for the subscripted range, and the rest of the coordinate variable is filled with missing values.

The final situation that must be considered when assigning one variable to another is when the dimension names of the left side and the right side do not match. In this case, the assignment overrides the left side's dimension names and coordinate variables, and a warning message is generated. If this is not the desired effect, then the array designator characters '(/' and '/)' can be used to make the assignment a "value-only" assignment.
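
The case in which a coordinate variable is created on the left side can be sketched as follows. The names and values are assumptions, and the comments restate the behavior described above rather than captured output.

```ncl
; a has a named dimension but no coordinate variable.
a = new((/ 5 /), float)
a!0 = "x"

b = (/ 10.0, 20.0 /)
b!0 = "x"
b&x = (/ 1.0, 2.0 /)

; A coordinate variable a&x is created. Only indexes 1 and 2 receive
; values; the remaining entries are filled with missing values.
a(1:2) = b
print(a&x)
```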

HLU object variables

HLU object variables are variables of type graphic. HLU object variables reference HLU objects that were either created with the create statement or were retrieved from an object with the getvalues statement. Arrays of HLU objects are supported by NCL. The same HLU functions that are available through C and Fortran are available from NCL to operate on HLU objects. See NCL versions of HLU functions and procedures for more information. The interfaces to many of the functions have been modified to support operations on one or more HLU objects.

HLU object variables support assignment and all of the other properties of NCL variables. Also, HLU object variables can be compared with the .ne. and .eq. operators.

Files and file variables

Once again it is very important to understand the distinction between a variable that references a file (from now on simply called a file) and a file variable. When a file is opened with the addfile function, a reference to the file is assigned to a variable. The variable's data type is file. This variable is a variable that references a file. On the other hand, a file variable is a variable that is contained within a file. There are many ways to get information about file variables. Calling the procedure print with a variable that references a file as a parameter will produce a listing of all of the file's attributes, dimensions, variables, and coordinate variables. The procedure list_filevars produces a similar listing. The function getfilevarnames returns an array of strings that contains the string name of all the file variables in the file. filevardimsizes should always be used to determine the dimension sizes of a file variable. It is very important to use filevardimsizes since calling dimsizes may inadvertently force the entire file variable to be read into memory. getfilevaratts and getfilevardims are also useful functions.
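
The inquiry functions named above might be used as in this sketch. The file name "example.nc" is hypothetical, and the file is assumed to contain at least one variable.

```ncl
f = addfile("example.nc", "r")    ; hypothetical file

; Lists the file's attributes, dimensions, variables, and coordinates.
print(f)

; Names of all file variables, returned as an array of strings.
names = getfilevarnames(f)

; Query the first variable without reading its data into memory.
dsizes = filevardimsizes(f, names(0))
atts   = getfilevaratts(f, names(0))
dims   = getfilevardims(f, names(0))
print(dsizes)
```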

Opening data files

Opening files is done with the addfile function, which takes two parameters. The first is a UNIX pathname string, either relative or absolute, to the file; the second is a string option. Currently there are three options for the second parameter: "w", "r", and "c". "w" means open the file with read/write permissions, "r" means open the file with read-only permissions, and "c" means create the file. "c" will return an error message if the file already exists. "w" will return an error message if the permissions of the file and/or directory are not correct.
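
The three options might be used as in this sketch. All path names are hypothetical.

```ncl
; Open an existing file read-only.
f_in = addfile("./data/input.nc", "r")

; Open an existing file for reading and writing.
f_rw = addfile("/tmp/scratch.nc", "w")

; Create a new file; this fails if the file already exists.
f_new = addfile("./new_output.nc", "c")
```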

The addfile function uses the file extension (i.e. ".nc" or ".cdf" for netCDF, ".hdf" for HDF, ".ccm" for CCM history files, and ".grb" for GRIB) to determine what type of file to open. Once open, all files and file variables regardless of type are referenced using the same NCL file reference syntax. For more information on what types of file formats are currently supported and special conventions of a specific file format, see the Supported data format information section of the reference guide.

Referencing file variables

The '->' operator is used to reference specific variables in a file. Using the '->' operator requires the variable name to appear immediately after the '->'; parentheses and expressions are not allowed immediately after the '->'. A different kind of file variable access is available to reference variables by a string expression. This is covered later. The following examples demonstrate referencing a file variable, a file variable attribute, and a file variable coordinate variable:
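
A minimal sketch of these three kinds of reference follows. The file name, the file variable "temperature", its "units" attribute, and its "lat" coordinate are all assumptions made for illustration.

```ncl
f = addfile("example.nc", "r")    ; hypothetical file

t     = f->temperature            ; a file variable
units = f->temperature@units      ; an attribute of a file variable
lat   = f->temperature&lat        ; a coordinate variable of a file variable
```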

File variables function just like regular NCL variables with respect to what is outlined in the NCL properties of variables section. It is important to understand that only the section defined by the variable reference is read or written. This means that NCL allows direct access to file variables; the entire variable does not have to be read into NCL at one time. For instance, if file1 contains a variable "elev" that is dimensioned [lat | 2159] x [lon | 4320], the selection file1->elev({20:60},{-135:-65}) will read in only the [lat | 480] x [lon | 840] subsection defined by the reference, leaving the remaining 8923680 data points in the file.

File variable string references (Using '$' to reference file variables)

Sometimes it is convenient to write generic scripts that do not depend on specific variable, attribute, or dimension names. This can be accomplished by using string variables containing the name of a variable rather than by hard-coding the names. Basically, string references work by putting a dollar sign '$' before and after the string variable. When NCL encounters this syntax, it uses the string value of the variable for the variable, attribute, or dimension name. There are several functions that are convenient to use with this feature. They are getfilevarnames, getfilevaratts, and getfilevardims.

The following is an example of how to copy a file from one format to another without knowing the names of the variables in the file.
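
One way to write such a copy is sketched below. The file names and the choice of GRIB input with netCDF output are assumptions made for illustration.

```ncl
fin  = addfile("input.grb", "r")     ; hypothetical GRIB input
fout = addfile("output.nc", "c")     ; hypothetical netCDF output

; Get every variable name from the input file.
names = getfilevarnames(fin)

; Copy each variable, using its string name on both sides.
do i = 0, dimsizes(names) - 1
    fout->$names(i)$ = fin->$names(i)$
end do
```

Because each assignment is a variable-to-variable assignment, attributes, dimension names, and coordinate variables are carried along with the values.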

The following are examples of referencing attributes and dimensions using string references:
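
A sketch of such references follows. The file name and variable name are hypothetical, and the variable is assumed to have at least one attribute and a coordinate variable for its first dimension.

```ncl
f = addfile("example.nc", "r")       ; hypothetical file
vname = "temperature"                ; hypothetical file variable name

atts = getfilevaratts(f, vname)      ; attribute names of the variable
dims = getfilevardims(f, vname)      ; dimension names of the variable

; String references for the variable, attribute, and dimension names.
first_att   = f->$vname$@$atts(0)$
first_coord = f->$vname$&$dims(0)$
print(first_att)
```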