Portable, parallel random number generator
for Monte Carlo applications

Charles Karney
Princeton Plasma Physics Laboratory

Abstract

In Monte Carlo applications, it is frequently useful to ensure that each ``flight'' is governed by an independent repeatable sequence of random numbers. This is necessary when implementing ``correlated sampling'' and when the Monte Carlo code is running on a parallel computer. We describe here a technique for implementing this with the following characteristics:

Each flight has access to a long private sequence of random numbers;
Portable-the same sequences are generated on different computers;
Can be coded in standard Fortran;
Fast-involves only 1-2 floating additions per random number;
Minimal communication requirements on a parallel computer.

Introduction

In a magnetic fusion reactor, heat and fusion ash are extracted in a ``divertor.'' Here the plasma is neutralized (either on a material surface or by a gas blanket). The design of a divertor requires understanding the interaction of plasmas with neutral atoms and molecules. Typically the transport of neutrals is modeled by Monte Carlo, because

neutrals have a long mean free path;
easy to handle complex geometries (plasma, pumping duct, etc.)
easy to handle multiple species, reactions.

Design of Monte Carlo code

We (Daren Stotler and Charles Karney) have written a Monte Carlo code, degas2 (https://w3.pppl.gov/degas2), to model the neutral transport. Amongst the design goals were:

runs efficiently on parallel machines on an distributed computers;
results should be independent of number of processors;
portable-including same results on different machines;
fast.

The parallelism is realized by regarding the calculation of each Monte Carlo flight as independent and assigning them to processes in an arbitrary manner (in practice this is done so as to achieve load balance).

For this reason we associate each flight with an independent sequence of random numbers

X_f,n,

where f is a flight index and n is the index into the sequence.

Strategy

Choose the subtractive generator (Knuth, Vol 2)

X_f,n = (X_f,n-k - X_f,n-j) mod 1.

See Knuth for a table of possible values of k and j. We choose k = 63, j = 31. This generator is interesting because:

It has a long period. The period of the least significant bit in this sequence is 2^k-1. The actual period is much longer than this.
With some mild restrictions, the floating point arithmetic involved is exact!

The second property holds when X is of the form

2⁴⁷

(0 Ł l < 2⁴⁷)

Single-precision arithmetic is exact on the Crays (48-bit mantissa) and as is double-precision arithmetic on IEEE-compliant machines.

This allows the basic random number sequence to be generated by the Fortran code

      x(n) = x(n-k) - x(n-j)
      if (x(n) < 0.0) x(n) = 1.0 + x(n)

In practice random numbers are generated in batches as needed and stored in an array which acts as a circular buffer.

Initializing the sequences

These sequences need to be initialized with k numbers. We use a conventional linear congruential random number generator for this purpose:

Y_i+1 = (a₁ Y_i + c₁) mod 2⁵⁶.

The period is 2⁵⁶. The arithmetic is carried out by multiple-precision arithmetic representing a 56-bit integer as an array of 4 14-bit integers. The high 47 bits of Y_fk thru Y_fk+k-1 are used to initialize the subtractive generator for flight f.

Given a₁ and c₁, it is easy to generate a_n and c_n which step the linear congruential sequence forward n steps

Y_i+n = (a_n Y_i + c_n) mod 2⁵⁶.

In practice, we pre-calculate a₆₃ and c₆₃ which are needed to step to the seed for flight f+1 given the seed for flight f. Given Y₀, a₆₃, and c₆₃, we can also calculate Y_63f in O(log(f)) steps by decomposing q in binary.

Communication requirements and timing considerations

The random numbers for a Monte Carlo now requires the specification of a 56-bit random seed (17 decimal digit) for the linear congruential sequence. This is represented internally by 4 14-digit integers and this needs to be communicated to each process.

For flight f, the linear congruential sequence is marched forward 63f steps in O(log(f)) operations. The subtractive generator is initialized with the next 63 members of the sequence which are calculated in O(63)) operations. All this is relatively slow multiple-precision integer arithmetic. But there isn't much of it.

Finally, individual random numbers for flight f can be calculated using the subtractive generator which takes only 1-2 floating point additions per random number.

File translated from T_EX by T_TH, version 1.58.

Portable, parallel random number generator for Monte Carlo applications

Charles Karney Princeton Plasma Physics Laboratory

Abstract

Portable, parallel random number generator
for Monte Carlo applications

Charles Karney
Princeton Plasma Physics Laboratory