Numerical Algorithms used in GS2
********************************
Greg Hammett, Aug. 25, 1999

The main numerical algorithms used in GS2 are described in the KRT paper
(Kotschenreuther, Rewoldt, Tang, CPC 88, 128 (1995)). Of particular interest
is the implicit method used to solve the gyrokinetic equation.

The linear gyrokinetic equation is an integro-differential equation involving
f(E,xi,theta,s), the distribution function as a function of energy, pitch
angle, theta, and species index s, with integrals of f appearing in the
gyrokinetic Maxwell's equations for the fields. Kotschenreuther's algorithm is
fully implicit, for both f and the fields (which are related to integrals of
f). A brute-force inversion of the resulting Ntot x Ntot matrix, where

    Ntot = (N_E * N_xi * N_s + N_fields) * N_theta

would require Ntot**3 operations for the first inversion, and Ntot**2
operations for each subsequent solution. Kotschenreuther's algorithm makes use
of some properties of the equations to break the inversion into two parts:
one is effectively an implicit solver for the fields, the other for f (there
are no approximations here; the solution is still fully implicit). The
resulting compute time should scale as (N_fields*N_theta)**3 for the initial
inversion of some matrices, plus

    C1*(N_fields*N_theta)**2 + C2*(N_E * N_xi * N_theta * N_s)

per subsequent time step, where N_fields is the number of fields (1 for
electrostatic, 3 for fully electromagnetic), N_theta is the total number of
theta grid points (ntheta*(2*nperiod-1) in terms of the namelist input
parameters), and C1 and C2 are constants. [I'm not sure whether the code
actually follows the N_fields scaling or not...] This method of breaking up
the inversions leads to a very fast implicit algorithm, and is a pretty neat
trick if you ask me...
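The flavor of this two-part solve can be seen in a toy model. This is a
sketch, not the GS2 implementation: here f is just a vector damped at a
made-up rate a, and the single scalar "field" phi is the mean of f, but the
trick of precomputing the response of f to a unit field and then solving a
small field equation, instead of inverting the full coupled system, is the
same in miniature:

```python
# Toy sketch (NOT the GS2 code) of a split fully implicit solve.
# Assumed model:  f_new = f_old - dt*(a*f_new + b*phi_new),
#                 phi_new = mean(f_new).
# Rather than inverting the full coupled (n+1)x(n+1) matrix, write
# f_new = f_h + r*phi_new (homogeneous piece plus response to a unit
# field), then solve a small (here scalar) equation for phi_new.

def split_implicit_step(f_old, dt, a=1.0, b=0.5):
    n = len(f_old)
    denom = 1.0 + dt * a
    # Homogeneous solution (field set to zero) and unit-field response:
    f_h = [x / denom for x in f_old]
    r = -dt * b / denom
    # Field equation: phi_new = mean(f_h) + r*phi_new  ->  solve for phi_new
    phi_new = (sum(f_h) / n) / (1.0 - r)
    f_new = [x + r * phi_new for x in f_h]
    return f_new, phi_new
```

The answer is exactly the fully implicit one; only the order of operations
changes, which is where the favorable operation-count scaling comes from.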
************************************************************
A general implicit finite-difference time-advancement scheme for

    df/dt = -L f

where L is some linear operator (in our case, f is equivalent to a vector and
L to a big matrix), is

    (f(i+1) - f(i))/dt = -L (delta * f(i+1) + (1-delta) * f(i))

This is like Eqs. 11-14 in the KRT paper, where delta=0 is fully explicit,
delta=1 is fully implicit, and delta=0.5 is a standard time-centered
implicit, second-order accurate scheme. GS2 allows different values of delta
for different species (though I've only thought about a constant delta for
all species). [GS2 also allows delta to be complex, though D&K can't remember
why complex values were explored once, and we will assume it is real here.]
In terms of the namelist input values fexpr and fexpi:

    delta = 1 - (fexpr + i * fexpi)

(note that fexpr=fexpi=0 corresponds to fully implicit, fexpr=0.5 corresponds
to a standard time-centered implicit method, and fexpr=0.4 corresponds to the
slightly off-centered value typically used in the KRT paper).

For fully implicit delta=1,

    f(i+1) = 1/(1 + dt*L) f(i)

We can break up f into eigenvectors of L and consider each eigenvector
independently. Each eigenvector will satisfy

    f(i+1) = 1/(1 + dt*i*omega_L) f(i)

where i*omega_L is the corresponding eigenvalue. This is numerically stable
for all dt, for real or damped omega_L, though there can be a numerical
instability if omega_L corresponds to a real instability and dt*i*omega_L is
close to -1, i.e. if dt is not small enough to resolve the unstable modes.

For delta=0.5, the amplification factor will be:

                      [1 - dt * i * omega_L / 2]
    A = f(i+1)/f(i) = --------------------------
                      [1 + dt * i * omega_L / 2]

which is again numerically stable for all dt (as long as dt is small enough
to resolve any growing modes). An advantage of centered schemes is that for
real omega_L, the magnitude of A remains exactly 1, and no artificial damping
is introduced.
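For a single eigenmode, the amplification factor for general delta follows
directly from the scheme above and is easy to check numerically (a sketch;
the function name is mine, not GS2's):

```python
# Amplification factor A = f(i+1)/f(i) for one eigenmode with eigenvalue
# i*omega_L under (f(i+1)-f(i))/dt = -L(delta*f(i+1) + (1-delta)*f(i)).
# In GS2 namelist terms, delta = 1 - (fexpr + i*fexpi).
def amplification(dt, omega_L, delta):
    z = 1j * dt * omega_L
    # Solve  f(i+1) - f(i) = -z*(delta*f(i+1) + (1-delta)*f(i))  for the ratio:
    return (1.0 - (1.0 - delta) * z) / (1.0 + delta * z)
```

For delta=1 this reduces to 1/(1 + dt*i*omega_L), and for delta=0.5 with real
omega_L the magnitude is exactly 1, as claimed above.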
Note, however, that for extremely damped modes, with omega_L -> -i*infinity,
the amplification factor A will approach -1 and the solution will just
oscillate (though without amplification, so any unstable modes will
eventually dominate). For general delta, the algorithm remains stable for all
dt as long as delta >= 0.5 (and again, as long as dt is small enough to
resolve any growing modes). The fastest growing mode will eventually dominate
(assuming dt*omega_L is small enough that the theoretical fastest growing
mode remains the numerical fastest growing mode).

We recognize the solution for f(i+1) as a Taylor series expansion of the
exact solution

    f(t(i+1)) = f(t(i)) exp(-i*omega_L*dt)

Depending on exactly how the code measures the frequency (from numerically
evaluating (df/dt)/f), it should be able to measure the frequency omega_L
with an accuracy of |dt*omega_L| or better. The advantage of this implicit
method is that while |dt*omega_L| needs to be small to measure the fastest
growing mode accurately, there can be other eigenmodes, with high frequencies
or very strong damping and |dt*omega_L| >> 1, that we don't have to resolve
with a small dt. For a time-centered implicit method (delta=0.5), one only
needs dt*Re(omega) < 0.5 and dt*Im(omega) < 0.5 in order to determine growth
rates to 5% accuracy (assuming that the fastest growing mode is dominant).

One cautionary example is the following. Consider a case where there is an
unstable ITG mode at low Re(omega), and a slightly faster growing ballooning
mode at high Re(omega). The initial-value code should select out the fastest
growing mode, but if the time step is such that dt*Re(omega) is large for the
ballooning mode, then the implicitness will reduce the effective numerical
growth rate of that mode and make the ITG mode appear to be faster
growing... ??
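The cautionary example can be made quantitative with the centered-scheme
amplification factor: the effective numerical growth rate of a mode is
ln|A|/dt. A sketch with made-up mode frequencies (these numbers are purely
illustrative, not from any GS2 run):

```python
import math

# Effective numerical growth rate ln|A|/dt for the centered (delta=0.5)
# scheme, for a mode with complex frequency omega = Re(omega) + i*gamma,
# where f ~ exp(-i*omega*t) so gamma > 0 means growth.
def numerical_growth_rate(dt, omega):
    z = 1j * dt * omega
    A = (1.0 - 0.5 * z) / (1.0 + 0.5 * z)
    return math.log(abs(A)) / dt

dt = 0.5
itg        = 1.0  + 0.30j   # made-up low-frequency ITG mode
ballooning = 10.0 + 0.35j   # made-up high-frequency mode, slightly faster growing
```

Although the high-frequency mode has the larger true growth rate, at this dt
its large dt*Re(omega) suppresses its numerical growth rate below that of the
low-frequency mode, so the wrong mode would appear dominant.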
Someone could do a little analysis here to check this criterion more
accurately, and see whether the (df/dt)/f measurement could be corrected
numerically, based on the value of delta, to be virtually exact.

************************************************************
The bakdif parameter in the namelist controls upwind differencing. Leave it
at zero to get the second-order spatial differencing method used in the KRT
paper. The bakdif parameter is a second "implicit" parameter in some sense,
and was added after the KRT paper was written. Dorland and Hammett thought a
little bit about it in spring of 1999.

    bakdif = 0.0   ! no additional upwinding-type implicitness
    bakdif = 1.0   ! fully "upwinded" spatial differencing

Dorland writes: "In practice, MK and I almost always use bakdif = 0. For some
nonlinear runs, I have used bakdif = 0.1, which is small enough that the RH
effect is fully recovered for the grids that I was using. Early on in the
nonlinear development, I used bakdif = 0.5 in order to try to get some
nonlinear runs to go through robustly so I could check other
assumptions/algorithmic issues. The bakdif scheme is not exactly upwind
differencing, but it is pretty close to it -- about as close as the implicit
scheme is to textbook implicit schemes."

************************************************************
GS2 uses Gaussian integration rules to try to get higher accuracy. The
possible values of ngauss and negrid at present are:

    ngauss=10   ! # of passing pitch angles.  ngauss = 1-6, 8, 10, or 12
                ! # of trapped-particle pitch angles = ntheta/2
    negrid=8    ! # of energy grid points.  negrid = 3-12, 14, or 18.

This restricted set of possible values for ngauss and negrid exists because
Gaussian integration tables have been entered only for those values. A
straightforward upgrade to the code could allow arbitrary values of ngauss
and negrid, perhaps using second-order accurate integration for simplicity.
Or such an upgrade could still use Gaussian integration, and employ a
subroutine to evaluate the weights for an arbitrary number of points. But
cases far above marginal stability probably don't require much velocity-space
resolution. And even a small amount of collisions will probably reduce the
needed velocity resolution a lot.
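Such a weights subroutine is standard. A sketch in Python (not the GS2 grids,
which are built on pitch angle and energy with their own weightings, but the
same idea): find the nodes of the Legendre polynomial P_n by Newton iteration
on its three-term recurrence, then form the Gauss-Legendre weights, for any n
rather than only tabulated values:

```python
import math

def gauss_legendre(n, tol=1e-14):
    """Nodes and weights for n-point Gauss-Legendre quadrature on [-1, 1]."""
    xs, ws = [], []
    for k in range(1, n + 1):
        # Standard initial guess for the k-th root of P_n
        x = math.cos(math.pi * (k - 0.25) / (n + 0.5))
        for _ in range(100):
            # Evaluate P_{n-1}(x) and P_n(x) by the three-term recurrence
            #   j*P_j = (2j-1)*x*P_{j-1} - (j-1)*P_{j-2}
            p0, p1 = 1.0, x
            for j in range(2, n + 1):
                p0, p1 = p1, ((2 * j - 1) * x * p1 - (j - 1) * p0) / j
            # Derivative:  P_n'(x) = n*(x*P_n - P_{n-1}) / (x^2 - 1)
            dp = n * (x * p1 - p0) / (x * x - 1.0)
            dx = p1 / dp
            x -= dx                      # Newton step toward the root
            if abs(dx) < tol:
                break
        xs.append(x)
        ws.append(2.0 / ((1.0 - x * x) * dp * dp))
    return xs, ws
```

An n-point rule built this way integrates polynomials up to degree 2n-1
exactly, which is the accuracy payoff that motivates Gaussian rules over
second-order integration in the first place.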