# Portable, parallel random number generator for Monte Carlo applications

## Abstract

In Monte Carlo applications, it is frequently useful to ensure that each ``flight'' is governed by an independent repeatable sequence of random numbers. This is necessary when implementing ``correlated sampling'' and when the Monte Carlo code is running on a parallel computer. We describe here a technique for implementing this with the following characteristics:

• Each flight has access to a long private sequence of random numbers;
• Portable-the same sequences are generated on different computers;
• Can be coded in standard Fortran;
• Fast-involves only 1-2 floating additions per random number;
• Minimal communication requirements on a parallel computer.

Introduction

In a magnetic fusion reactor, heat and fusion ash are extracted in a ``divertor.'' Here the plasma is neutralized (either on a material surface or by a gas blanket). The design of a divertor requires understanding the interaction of plasmas with neutral atoms and molecules. Typically the transport of neutrals is modeled by Monte Carlo, because

• neutrals have a long mean free path;
• easy to handle complex geometries (plasma, pumping duct, etc.)
• easy to handle multiple species, reactions.

Design of Monte Carlo code

We (Daren Stotler and Charles Karney) have written a Monte Carlo code, degas2 (https://w3.pppl.gov/degas2), to model the neutral transport. Amongst the design goals were:

• runs efficiently on parallel machines on an distributed computers;
• results should be independent of number of processors;
• portable-including same results on different machines;
• fast.

The parallelism is realized by regarding the calculation of each Monte Carlo flight as independent and assigning them to processes in an arbitrary manner (in practice this is done so as to achieve load balance).

For this reason we associate each flight with an independent sequence of random numbers

 Xf,n,
where f is a flight index and n is the index into the sequence.

Strategy

Choose the subtractive generator (Knuth, Vol 2)

 Xf,n = (Xf,n-k - Xf,n-j) mod 1.
See Knuth for a table of possible values of k and j. We choose k = 63, j = 31. This generator is interesting because:

• It has a long period. The period of the least significant bit in this sequence is 2k-1. The actual period is much longer than this.
• With some mild restrictions, the floating point arithmetic involved is exact!

The second property holds when X is of the form

 l 247 (0 £ l < 247)
Single-precision arithmetic is exact on the Crays (48-bit mantissa) and as is double-precision arithmetic on IEEE-compliant machines.

This allows the basic random number sequence to be generated by the Fortran code

```      x(n) = x(n-k) - x(n-j)
if (x(n) < 0.0) x(n) = 1.0 + x(n)
```
In practice random numbers are generated in batches as needed and stored in an array which acts as a circular buffer.

Initializing the sequences

These sequences need to be initialized with k numbers. We use a conventional linear congruential random number generator for this purpose:

 Yi+1 = (a1 Yi + c1) mod 256.
The period is 256. The arithmetic is carried out by multiple-precision arithmetic representing a 56-bit integer as an array of 4 14-bit integers. The high 47 bits of Yfk thru Yfk+k-1 are used to initialize the subtractive generator for flight f.

Given a1 and c1, it is easy to generate an and cn which step the linear congruential sequence forward n steps

 Yi+n = (an Yi + cn) mod 256.

In practice, we pre-calculate a63 and c63 which are needed to step to the seed for flight f+1 given the seed for flight f. Given Y0, a63, and c63, we can also calculate Y63f in O(log(f)) steps by decomposing q in binary.

Communication requirements and timing considerations

The random numbers for a Monte Carlo now requires the specification of a 56-bit random seed (17 decimal digit) for the linear congruential sequence. This is represented internally by 4 14-digit integers and this needs to be communicated to each process.

For flight f, the linear congruential sequence is marched forward 63f steps in O(log(f)) operations. The subtractive generator is initialized with the next 63 members of the sequence which are calculated in O(63)) operations. All this is relatively slow multiple-precision integer arithmetic. But there isn't much of it.

Finally, individual random numbers for flight f can be calculated using the subtractive generator which takes only 1-2 floating point additions per random number.

File translated from TEX by TTH, version 1.58.