      subroutine yspace(kxa1,kxa4)
c
c Transform a variable from ky space into y space.
c
      implicit none
      include 'itg.par'
      include 'itg.cmn'
      integer l,m,n,index,imode
      complex kxa1(nzz,mz*lz)
      complex kxa4(mzz/2,nzz*lz)
c     real kxa2(mzz, nzz*lz)
c
c The complex kxa4 can be considered equivalenced to the real kxa2.
c The complex kxa4 is for ky space, the real kxa2 is for y space.
c
c We will use a complex-to-real FFT.  malias points in real space
c require malias/2+1 complex Fourier coefficients.  (This formula
c works for even or odd malias, using standard fortran rules for
c integer division.  The k=0 coefficient will be purely real, and if
c malias is even, the k=malias/2 coefficient, stored at array index
c malias/2+1, will also be purely real...)
c
      real scale
      integer isign

      scale=1.0
      isign=-1  ! ?? Differs from usual isign=+1 for ky-->y transform??
c
c Need to transpose the data from
c    kxa1(n,index) where index=l+(m-1)*ldb    to
c    kxa4(m,imode) where imode=l+(n-1)*ldb
c
c First get the ky=0 (m=1) component:
c
      do n=1,nalias
c Tell the compiler to vectorize this and not the outer loop:
cfpp$ select (vector)
         do l=1,ldb
            imode=l+(n-1)*ldb
            kxa4(1,imode)=kxa1(n,l)
         enddo
      enddo
c
c get the ky>0 components (factor of 1/2 from sin/cos versus complex
c conventions...):
c
c GWH: rewrote the following loop to optimize better on the CRAY C-90.
c The C-90 is most efficient if there are only linear functions of
c the do-loop indices.  Usually, only the innermost loop is vectorized,
c but ldb is usually large enough (>=64) for this to be efficient.
c The next outer loop is sometimes unrolled, and for large runs
c (md>6-8) this should be efficient also.  Finally, when compiled
c with -Zp, the parallelization is usually applied to the outermost
c loop.  Best results come when the loop count exceeds the number of
c processors, which is usually the case for nalias...

      do n=1,nalias
         do m=2,md
cfpp$ select (vector)
            do l=1,ldb
               imode=l+(n-1)*ldb
               index=l+(m-1)*ldb
               kxa4(m,imode)=kxa1(n,index)/2.
            enddo
         enddo
      enddo
c
c dealias for high ky:
c
      do m=md+1,malias/2+1
cfpp$ select (vector)
         do imode=1,nalias*ldb
            kxa4(m,imode)=0.0
         enddo
      enddo

      call csfftm(isign,malias,ldb*nalias,scale,
     &     kxa4,mzz/2,kxa4,mzz,tabley,worky,0)

      return
      end