Language elements

by Michael Metcalf / CERN CN-AS

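The basic components of the Fortran language are the members of its character set:

     the letters A ... Z and a ... z (equivalent outside a character context)
     the numerals 0 ... 9
     the underscore _
     the special characters  = : + blank - * / ( ) , . $ ' ! " % & ; < > ?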

From these components, we build the tokens that have a syntactic meaning to the compiler. There are six classes of token:

     Label:      123
     Constant:   123.456789_long
     Keyword:    ALLOCATABLE
     Operator:   .add.
     Name:       solve_equation (up to 31 characters, including _)
     Separator:  /   (   )   (/   /)   ,   =   =>   :   ::   ;   %

From the tokens, we can build statements. These can be coded using the new free source form, which does not require positioning in a rigid column structure:

FUNCTION string_concat(s1, s2)                ! This is a comment
   TYPE (string)      :: s1, s2, string_concat
   string_concat%string_data = s1%string_data(1:s1%length) // &
      s2%string_data(1:s2%length)             ! This is a continuation
   string_concat%length = s1%length + s2%length
END FUNCTION string_concat
Note the trailing comments and the trailing continuation mark. There may be 39 continuation lines, and 132 characters per line. Blanks are significant. Where a token or character constant is split across two lines:
               ...        start_of&
        &_name
               ...   'a very long &
        &string'
a leading & on the continued line is also required.
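The two cases above can be tried in a complete program. The following is only a minimal sketch: the program name and the variables start_of_name and message are made up for the illustration.

```fortran
PROGRAM continuation_demo
   IMPLICIT NONE
   INTEGER           :: start_of_name      ! illustrative name
   CHARACTER(LEN=40) :: message            ! illustrative name

   ! A name split across two lines: the trailing & and the leading &
   ! join the two halves with nothing in between.
   start_of&
      &_name = 3

   ! A character constant split across two lines: the text resumes
   ! immediately after the leading &, so the blank before the
   ! trailing & is part of the string.
   message = 'a very long &
             &string'

   PRINT '(A)',  TRIM(message)             ! a very long string
   PRINT '(I1)', start_of_name             ! 3
END PROGRAM continuation_demo
```

Without the leading &, the leading blanks of the continued line would become part of the character constant, and the split name would be read as two separate tokens.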

Automatic conversion of source form for existing programs can be carried out by convert.f90. Its options are:

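Fortran has five intrinsic data types. For each there is a corresponding form of literal constant. For the three numeric intrinsic types they are, for example:

     INTEGER      1              -999           32767_2
     REAL         1.0            -6.2E-3        123.456789_long
     COMPLEX      (1.0, -3.2)    (1.0_long, 2.0_long)

where an optional suffix after the underscore gives the kind parameter, either as a literal value (which is processor dependent) or as a named integer constant such as long. The forms of literal constants for the two non-numeric data types are, for example:

     CHARACTER    'Anything goes'     "Nuts & bolts"
     LOGICAL      .TRUE.              .FALSE.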

The numeric types are based on model numbers with associated inquiry functions (whose values are independent of the values of their arguments):

     DIGITS(X)               Number of significant digits
     EPSILON(X)              Almost negligible compared to one (real)
     HUGE(X)                 Largest number
     MAXEXPONENT(X)          Maximum model exponent (real)
     MINEXPONENT(X)          Minimum model exponent (real)
     PRECISION(X)            Decimal precision (real, complex)
     RADIX(X)                Base of the model
     RANGE(X)                Decimal exponent range
     TINY(X)                 Smallest positive number (real)
These functions are important for portable numerical software.
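As an illustration, the inquiry functions listed above can be printed for a given kind. This is a sketch only: the kind name long is assumed to be defined with SELECTED_REAL_KIND as shown, and the values printed depend on the processor.

```fortran
PROGRAM model_inquiry
   IMPLICIT NONE
   ! "long" is an illustrative kind name: at least 9 significant decimal
   ! digits and a decimal exponent range of at least 99. If no such kind
   ! exists, SELECTED_REAL_KIND is negative and the declaration is rejected.
   INTEGER, PARAMETER :: long = SELECTED_REAL_KIND(9, 99)
   REAL(KIND=long) :: a      ! never given a value: only type and kind matter
   INTEGER         :: i

   PRINT *, 'RADIX(a)       =', RADIX(a)
   PRINT *, 'DIGITS(a)      =', DIGITS(a)
   PRINT *, 'PRECISION(a)   =', PRECISION(a)
   PRINT *, 'RANGE(a)       =', RANGE(a)
   PRINT *, 'MINEXPONENT(a) =', MINEXPONENT(a)
   PRINT *, 'MAXEXPONENT(a) =', MAXEXPONENT(a)
   PRINT *, 'EPSILON(a)     =', EPSILON(a)
   PRINT *, 'TINY(a)        =', TINY(a)
   PRINT *, 'HUGE(a)        =', HUGE(a)
   PRINT *, 'HUGE(i)        =', HUGE(i)

   ! The requested precision and range are guaranteed once the kind exists
   IF (PRECISION(a) >= 9 .AND. RANGE(a) >= 99) PRINT '(A)', 'long is adequate'
END PROGRAM model_inquiry
```

Portable code typically uses such values in place of hard-wired constants, for instance a convergence tolerance expressed as a multiple of EPSILON(a).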

We can specify scalar variables corresponding to the five intrinsic types:

        INTEGER(KIND=2) i
        REAL(KIND=long) a
        COMPLEX         current
        LOGICAL         Pravda
        CHARACTER(LEN=20) word
        CHARACTER(LEN=2, KIND=Kanji) kanji_word
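where the optional KIND parameter specifies a non-default kind, and the LEN= specifier replaces the old *len form. The names long and Kanji stand for named integer constants holding kind values, which must be defined beforehand. The KIND= and LEN= keywords themselves may be omitted when the values are given in this order: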
        CHARACTER(2, Kanji) kanji_word
works just as well.
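A sketch of how such kind names might be defined and used follows. The names two_bytes and same_form and the requested ranges are assumptions made for the illustration. The Kanji kind is left out because character kinds other than the default are processor dependent.

```fortran
PROGRAM kind_demo
   IMPLICIT NONE
   ! Illustrative kind names, chosen by requirement rather than by number
   INTEGER, PARAMETER :: two_bytes = SELECTED_INT_KIND(4)       ! -10**4 to 10**4
   INTEGER, PARAMETER :: long      = SELECTED_REAL_KIND(9, 99)  ! 9 digits, 10**99

   INTEGER(KIND=two_bytes) :: i
   REAL(KIND=long)         :: a
   COMPLEX                 :: current
   LOGICAL                 :: Pravda
   CHARACTER(LEN=20)       :: word
   CHARACTER(20)           :: same_form     ! LEN= keyword omitted

   ! Literal constants carry the kind as a suffix
   i         = 1234_two_bytes
   a         = 123.456789_long
   current   = (1.0, -3.2)
   Pravda    = .TRUE.
   word      = 'Anything goes'
   same_form = word

   ! KIND and LEN report what was actually declared
   PRINT '(3L2)', KIND(i) == two_bytes, KIND(a) == long, &
                  LEN(word) == LEN(same_form)              ! T T T
END PROGRAM kind_demo
```

Selecting kinds by requirement keeps the declarations portable, whereas a literal value such as KIND=2 means different things on different processors.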

For derived data types we must first define the form of the type:

        TYPE person
           CHARACTER(10) name
           REAL          age
        END TYPE person
and then create structures of that type:
        TYPE(person) you, me
To select components of a derived type, we use the % qualifier:
        you%age
and the form of a literal constant of a derived type is shown by:
        you = person('Smith', 23.5)
which is known as a structure constructor. Definitions may refer to a previously defined type:
        TYPE point
           REAL x, y
        END TYPE point
        TYPE triangle
           TYPE(point) a, b, c
        END TYPE triangle
and for a variable of type triangle, as in
        TYPE(triangle) t
we have components of type point:
        t%a   t%b   t%c
which, in turn, have ultimate components of type real:
        t%a%x   t%a%y   t%b%x   etc.
We note that the % qualifier was chosen rather than a dot (.) because the dot already delimits operators such as .OR. and .add., which would make an expression like a.or.b ambiguous.
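The type definitions above can be collected into one small program. This is a minimal sketch: the program name and the coordinate values are chosen only for the illustration.

```fortran
PROGRAM derived_type_demo
   IMPLICIT NONE
   TYPE person
      CHARACTER(10) name
      REAL          age
   END TYPE person
   TYPE point
      REAL x, y
   END TYPE point
   TYPE triangle
      TYPE(point) a, b, c
   END TYPE triangle

   TYPE(person)   :: you
   TYPE(triangle) :: t

   ! Structure constructors; they nest in the same way as the types
   you = person('Smith', 23.5)
   t   = triangle(point(0.0, 0.0), point(1.0, 0.0), point(0.0, 1.0))

   ! Components are selected and assigned with the % qualifier
   you%age = you%age + 1.0
   t%c%y   = 2.0

   PRINT '(A, F5.1)', you%name, you%age
   PRINT '(2F5.1)',   t%c%x, t%c%y
END PROGRAM derived_type_demo
```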

Arrays are considered to be variables in their own right. Given

        REAL a(10)
        INTEGER, DIMENSION(0:100, -50:50) :: map
(the latter an example of the syntax that allows grouping of attributes to the left of :: and of variables sharing the attributes to the right), we have two arrays whose elements are in array element order (column major), but not necessarily in contiguous storage. Elements are, for example,
        a(1)               a(i*j)
and are scalars. The subscripts may be any scalar integer expression. Sections are
        a(i:j)               ! rank one
        map(i:j, k:l:m)      ! rank two
        a(map(i, k:l))       ! vector subscript
        a(3:2)               ! zero length
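The section forms above can be checked in a short program. This is a sketch under assumptions: map is given much smaller bounds than in the text, and its values are chosen so that they are valid subscripts of a.

```fortran
PROGRAM section_demo
   IMPLICIT NONE
   REAL    :: a(10)
   INTEGER :: map(0:3, -2:2)     ! smaller bounds than in the text
   INTEGER :: i, j

   DO i = 1, 10
      a(i) = REAL(i)             ! a = 1.0, 2.0, ..., 10.0
   END DO
   DO j = -2, 2
      DO i = 0, 3
         map(i, j) = i + j + 3   ! values 1 to 8, all valid subscripts of a
      END DO
   END DO

   PRINT '(3F5.1)', a(2:4)                     ! rank one:  2.0 3.0 4.0
   PRINT '(2I2)',   SHAPE(map(1:2, -2:2:2))    ! rank two, shape 2 by 3
   PRINT '(2F5.1)', a(map(1, 0:1))             ! vector subscript: a(4), a(5)
   PRINT '(I2)',    SIZE(a(3:2))               ! zero length: 0
END PROGRAM section_demo
```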
Whole arrays and array sections are array-valued objects. Array-valued constants (constructors) are available:
        (/ 1, 2, 3, 4, 5 /)
        (/ (i, i = 1, 9, 2) /)
        (/ ( (/ 1, 2, 3 /), i = 1, 10) /)
        (/ (0, i = 1, 100) /)
        (/ (0.1*i, i = 1, 10) /)
making use of the implied-DO loop notation familiar from I/O lists. A derived data type may, of course, contain array components:
        TYPE triplet
           REAL, DIMENSION(3) :: vertex
        END TYPE triplet
        TYPE(triplet), DIMENSION(4) :: t
so that
       t(2)           is a scalar (a structure)
       t(2)%vertex    is an array component of a scalar
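A sketch combining array constructors with an array component follows. The names odd and tenths and the values stored in t are assumptions made for the illustration.

```fortran
PROGRAM constructor_demo
   IMPLICIT NONE
   TYPE triplet
      REAL, DIMENSION(3) :: vertex
   END TYPE triplet
   TYPE(triplet), DIMENSION(4) :: t
   INTEGER :: odd(5), i
   REAL    :: tenths(10)

   ! Constructors with an implied-DO loop
   odd    = (/ (i, i = 1, 9, 2) /)           ! 1, 3, 5, 7, 9
   tenths = (/ (0.1*i, i = 1, 10) /)

   ! A structure constructor whose single component is an array
   DO i = 1, 4
      t(i) = triplet( (/ 1.0, 2.0, 3.0 /) * REAL(i) )
   END DO

   PRINT '(5I2)',   odd
   PRINT '(I2)',    SIZE(tenths)             ! 10
   PRINT '(3F5.1)', t(2)%vertex              ! array component of a scalar
   PRINT '(4F5.1)', t%vertex(1)              ! first element from each structure
END PROGRAM constructor_demo
```

The last line goes one step beyond the text: t%vertex(1) is itself an array of rank one, formed from one element of each of the four structures.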

There are some other interesting character extensions. Just as a substring as in

        CHARACTER(80), DIMENSION(60) :: page
        ... = page(j)(i:i)         ! substring
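was already possible, so now are the substrings
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ'(i:i)    ! substring of a constant
        you%name(1:2)                        ! substring of a structure component
Also, zero-length strings are allowed: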
        page(j)(i:i-1)       ! zero-length string
Finally, there are some new intrinsic character functions:
      ACHAR                 IACHAR   (for ASCII set)
      ADJUSTL               ADJUSTR
      LEN_TRIM              INDEX(s1, s2, BACK=.TRUE.)
      REPEAT                SCAN     (for one of a set)
      TRIM                  VERIFY   (for all of a set)
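A short sketch exercising these functions follows. The test strings are chosen only for the illustration, and the values in the comments follow from the definitions of the functions.

```fortran
PROGRAM character_demo
   IMPLICIT NONE
   CHARACTER(LEN=12) :: word

   word = '  Fortran'                      ! padded with blanks to length 12

   PRINT '(I2)', LEN_TRIM(word)                      ! 9
   PRINT '(A)',  '[' // TRIM(ADJUSTL(word)) // ']'   ! [Fortran]
   PRINT '(A)',  '[' // ADJUSTR(word) // ']'         ! [     Fortran]
   PRINT '(I2)', INDEX('banana', 'an')               ! 2, first occurrence
   PRINT '(I2)', INDEX('banana', 'an', BACK=.TRUE.)  ! 4, last occurrence
   PRINT '(I2)', SCAN('Fortran 90', '0123456789')    ! 9, first digit
   PRINT '(I2)', VERIFY('1990', '0123456789')        ! 0, all are digits
   PRINT '(A)',  REPEAT('ab', 3)                     ! ababab
   PRINT '(I3)', IACHAR('A')                         ! 65
   PRINT '(A)',  ACHAR(66)                           ! B
   PRINT '(I2)', LEN('ABCDEFGHIJ'(4:3))              ! 0, zero-length substring
END PROGRAM character_demo
```

ACHAR and IACHAR always use the ASCII collating sequence, so their results do not depend on the native character set of the processor.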