************************************************************************************************************************
***            WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document             ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./m3dp_fsymm_opt_half_size.x on a cray-xt3_ named . with 64 processors, by Unknown Wed Dec 20 13:15:42 2006
Using Petsc Release Version 2.3.0, Patch 21, April, 26, 2005

                         Max       Max/Min        Avg      Total
Time (sec):           1.786e+03      1.00006   1.786e+03
Objects:              2.234e+03      1.00000   2.234e+03
Flops:                8.059e+10      1.01625   7.997e+10  5.118e+12
Flops/sec:            4.513e+07      1.01625   4.479e+07  2.867e+09
MPI Messages:         3.167e+06      5.00329   7.915e+05  5.065e+07
MPI Message Lengths:  1.053e+09      1.53064   1.300e+03  6.587e+10
MPI Reductions:       3.217e+03      1.00272

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:  ----- Time ------     ----- Flops -----    --- Messages ---   -- Message Lengths --  -- Reductions --
                      Avg      %Total        Avg      %Total    counts   %Total      Avg       %Total     counts  %Total
 0:     Main Stage: 1.4424e-04    0.0%   0.0000e+00     0.0%   0.000e+00    0.0%   0.000e+00      0.0%   0.000e+00   0.0%
 1:       Starting: 2.6677e+00    0.1%   2.3734e+08     0.0%   6.738e+04    0.1%   2.043e+00      0.2%   5.900e+02   0.3%
 2:   TimeStepping: 1.7829e+03   99.9%   5.1180e+12   100.0%   5.059e+07   99.9%   1.298e+03     99.8%   2.051e+05  99.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase            %F - percent flops in this phase
      %M - percent messages in this phase        %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################
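The warning above is PETSc's standard reminder that a code path timed only on its first execution also pays one-time costs (instruction loading, MPI buffer setup), so its numbers can be misleading. A minimal sketch of the recommended preloading pattern follows, written against the current PETSc C API with a placeholder vector size, stage name, and kernel, and with error checking omitted; the run profiled here used PETSc 2.3.0, where the macros are spelled PreLoadBegin()/PreLoadEnd() as in the warning and the report comes from -log_summary rather than today's -log_view.

    #include <petscvec.h>

    int main(int argc, char **argv)
    {
      Vec      x, y;
      PetscInt n = 1000000;                  /* placeholder problem size */

      PetscInitialize(&argc, &argv, NULL, NULL);

      VecCreate(PETSC_COMM_WORLD, &x);
      VecSetSizes(x, PETSC_DECIDE, n);
      VecSetFromOptions(x);
      VecDuplicate(x, &y);
      VecSet(x, 1.0);
      VecSet(y, 2.0);

      /* The region between the two macros runs twice when the first argument is
         PETSC_TRUE: the first (preload) pass only warms up the code path and is
         excluded from the logged stage, so the reported timings come from the
         second pass. */
      PetscPreLoadBegin(PETSC_TRUE, "axpy");
      VecAXPY(y, 3.0, x);                    /* stand-in for the kernel being timed */
      PetscPreLoadEnd();

      VecDestroy(&x);
      VecDestroy(&y);
      PetscFinalize();
      return 0;
    }

Each preloaded region also appears as its own stage (here named "axpy") in the resulting summary.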

Event                Count      Time (sec)     Flops/sec                          --- Global ---   --- Stage ---    Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct   %T %F %M %L %R   %T %F %M %L %R  Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

VecScale              16 1.0 3.7956e-04 1.0 6.50e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  6  0  0  0 39963
VecSet                33 1.0 1.9917e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyBegin      56 1.0 9.7589e-02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.7e+02  0  0  0  0  0   1  0  0  0 28     0
VecAssemblyEnd        56 1.0 1.9979e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult      20 1.0 1.6038e-03 1.3 3.18e+08 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0 11  0  0  0 15763
VecScatterBegin      212 1.0 3.0970e-02 5.8 0.00e+00 0.0 5.1e+04 1.3e+03 0.0e+00  0  0  0  0  0   0  0 76 64  0     0
VecScatterEnd        212 1.0 8.3436e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
MatMult               12 1.0 1.5713e-02 1.2 2.39e+08 1.2 2.9e+03 1.3e+03 0.0e+00  0  0  0  0  0   1 83  4  4  0 12530
MatConvert            25 1.0 8.6287e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  0  0  0  0     0
MatAssemblyBegin      52 1.0 3.3599e-02 2.3 0.00e+00 0.0 8.6e+03 9.3e+00 1.0e+02  0  0  0  0  0   1  0 13  0 18     0
MatAssemblyEnd        52 1.0 8.3606e-02 1.1 0.00e+00 0.0 6.7e+03 7.9e+02 2.2e+02  0  0  0  0  0   3  0 10  5 37     0
MatGetRow           2650 2650.0 4.1015e-03 51.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 2: TimeStepping

M3d Solver          1307 1.0 1.4126e+03 1.2 2.42e+07 1.3 1.6e+07 1.3e+03 1.9e+05 71 34 32 32 93  72 34 32 32 94  1245
M3d Par            61551 1.0 1.4613e+02 1.6 0.00e+00 0.0 3.7e+07 1.3e+03 0.0e+00  7  0 73 73  0   7  0 73 73  0     0
VecMax               444 1.0 5.9355e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+02  0  0  0  0  0   0  0  0  0  0     0
VecMin                16 1.0 2.2533e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01  0  0  0  0  0   0  0  0  0  0     0
VecDot            105100 1.0 3.3162e+01 2.8 3.47e+08 2.8 0.0e+00 0.0e+00 1.0e+05  1  5  0  0 51   1  5  0  0 51  8000
VecNorm            54760 1.0 1.4450e+01 1.2 1.74e+08 1.2 0.0e+00 0.0e+00 5.5e+04  1  3  0  0 27   1  3  0  0 27  9566
VecScale           10092 1.0 9.2279e-01 1.4 2.47e+08 1.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 11545
VecCopy            42958 1.0 5.9137e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet            155806 1.0 1.1429e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPY           191408 1.0 1.6285e+01 1.2 5.89e+08 1.2 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0 31688
VecAYPX            44317 1.0 4.3031e+00 2.3 9.13e+08 2.2 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 25989
VecAssemblyBegin    6440 1.0 3.4330e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+04  0  0  0  0  9   0  0  0  0  9     0
VecAssemblyEnd      6440 1.0 3.8995e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult   28804 1.0 4.0005e+00 1.2 1.72e+08 1.2 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  9086
VecScatterBegin   210860 1.0 2.5776e+01 4.7 0.00e+00 0.0 5.1e+07 1.3e+03 0.0e+00  0  0 100 100  0   0  0 100 100  0     0
VecScatterEnd     210860 1.0 5.1924e+01 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
MatMult            53160 1.0 7.6464e+01 1.4 2.53e+08 1.4 1.3e+07 1.3e+03 0.0e+00  3 17 25 25  0   3 17 25 25  0 11389
MatMultAdd          4028 1.0 6.8539e+00 1.5 2.38e+08 1.5 9.7e+05 1.3e+03 0.0e+00  0  1  2  2  0   0  1  2  2  0 10385
MatCopy             5228 1.0 7.1798e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale            5228 1.0 4.1196e+00 1.4 2.40e+08 1.4 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 11213
MatAssemblyBegin    4008 1.0 5.6311e+00 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+03  0  0  0  0  4   0  0  0  0  4     0
MatAssemblyEnd      4008 1.0 2.4711e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+03  0  0  0  0  2   0  0  0  0  2     0
MatGetRow         397720 1.0 1.1346e+00 -1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetup              32 1.0 5.4462e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            5228 1.0 1.3655e+03 1.2 2.29e+07 1.3 1.2e+07 1.3e+03 1.6e+05 69 31 23 23 77  69 31 23 23 77  1174
PCSetUp               32 1.0 1.0051e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01  6  0  0  0  0   6  0  0  0  0     0
PCApply            54760 1.0 1.1520e+03 1.3 4.90e+05 1.3 0.0e+00 0.0e+00 1.2e+01 58  1  0  0  0  58  1  0  0  0    24
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                 Creations   Destructions      Memory   Descendants' Mem.

--- Event Stage 0: Main Stage


--- Event Stage 1: Starting

                 Index Set        66            64         72672   0
                       Map       856           129         40248   0
                       Vec       767            29       4615408   0
               Vec Scatter        57             0             0   0
IS Local to global mapping         3             1        161920   0
         Application Order         1             0             0   0
                    Matrix       212            50             0   0
             Krylov Solver        56             0             0   0
            Preconditioner        56             0             0   0

--- Event Stage 2: TimeStepping

                       Map        32            20          6240   0
                       Vec       128            20       3183040   0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 7.76291e-05
Average time for zero size MPI_Send(): 8.28132e-06
Compiled without FORTRAN kernels
Compiled with double precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
Configure run at: Fri Oct 7 16:29:43 2005
Configure options: --with-memcmp-ok --sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8
  --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-batch=1 --with-shared=0
  --with-cc="cc --target=catamount" --with-cxx="CC --target=catamount" --with-fc="ftn --target=catamount"
  --with-blas-lib=acml --with-lapack-lib=acml --with-debugging=0 --COPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64"
  --CXXOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64" --FOPTFLAGS=" -fastsse -O3 -Munroll=c:4 -tp k8-64"
  -PETSC_ARCH=cray-xt3_fast
-----------------------------------------
Libraries compiled on Thu Aug 24 15:02:33 EDT 2006 on jaguar8
Machine characteristics: Linux jaguar8 2.6.5-7.252-ss #6 Mon Jul 31 18:05:34 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /apps/PETSC/petsc-2.3.0
Using PETSc arch: cray-xt3_fast
-----------------------------------------
Using C compiler: cc --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
Using Fortran compiler: ftn --target=catamount -fastsse -O3 -Munroll=c:4 -tp k8-64
-----------------------------------------
Using include paths: -I/apps/PETSC/petsc-2.3.0 -I/apps/PETSC/petsc-2.3.0/bmake/cray-xt3_fast -I/apps/PETSC/petsc-2.3.0/include
------------------------------------------
Using C linker: cc --target=catamount -O2
Using Fortran linker: ftn --target=catamount -O2
Using libraries: -Wl,-rpath,/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast -L/apps/PETSC/petsc-2.3.0/lib/cray-xt3_fast
  -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc
  -Wl,-rpath,/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib -L/spin/apps/HYPRE/hypre-1.10.0b/cray-xt3/lib
  -lHYPRE_DistributedMatrix -lHYPRE_DistributedMatrixPilutSolver -lHYPRE_Euclid -lHYPRE_IJ_mv -lHYPRE_LSI
  -lHYPRE_MatrixMatrix -lHYPRE_ParaSails -lHYPRE_krylov -lHYPRE_parcsr_ls -lHYPRE_parcsr_mv -lHYPRE_seq_mv
  -lHYPRE_sstruct_ls -lHYPRE_sstruct_mv -lHYPRE_struct_ls -lHYPRE_struct_mv -lrt -lacml -lacml
  -L/opt/acml/3.0/pgi64/lib/cray/cnos64 -L/opt/xt-mpt/default/mpich2-64/P2/lib -L/opt/acml/3.0/pgi64/lib
  -L/opt/xt-libsci/default/pgi/cnos64/lib -L/opt/xt-mpt/default/sma/lib -L/opt/xt-lustre-ss/default/lib64
  -L/opt/xt-catamount/default/lib/cnos64 -L/opt/xt-pe/default/lib/cnos64 -L/opt/xt-libc/default/amd64/lib
  -L/opt/xt-os/default/lib/cnos64 -L/opt/xt-service/default/lib/cnos64 -L/opt/pgi/default/linux86-64/default/lib
  -L/opt/gcc/3.2.3/lib/gcc-lib/x86_64-suse-linux/3.2.3/ -llapacktimers -lsci -lmpichf90 -lmpich -lacml -llustre
  -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgc -lm -lcatamount -lsysio -lportals -lC -lcrtend
------------------------------------------
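
The "Starting" and "TimeStepping" stages that organize the event and memory tables above are user-defined logging stages. As an illustration only (not the m3dp source), the sketch below shows how such stages are typically declared and pushed around the setup and time-stepping phases, written against the current PETSc C API with placeholder names, sizes, and work, and with error checking omitted; PETSc 2.3.0 printed this report via -log_summary, where current releases use -log_view.

    #include <petscvec.h>

    int main(int argc, char **argv)
    {
      Vec           x;
      PetscLogStage starting, stepping;
      PetscInt      step, n = 100000;        /* placeholder size and step count */

      PetscInitialize(&argc, &argv, NULL, NULL);

      /* Register named stages; events logged while a stage is pushed are grouped
         under that stage in the summary, as with "Starting" and "TimeStepping" above. */
      PetscLogStageRegister("Starting", &starting);
      PetscLogStageRegister("TimeStepping", &stepping);

      PetscLogStagePush(starting);           /* one-time setup work */
      VecCreate(PETSC_COMM_WORLD, &x);
      VecSetSizes(x, PETSC_DECIDE, n);
      VecSetFromOptions(x);
      VecSet(x, 1.0);
      PetscLogStagePop();

      PetscLogStagePush(stepping);           /* the repeated per-step work */
      for (step = 0; step < 10; step++) VecScale(x, 0.5);
      PetscLogStagePop();

      VecDestroy(&x);
      PetscFinalize();                       /* run with -log_view to print the summary */
      return 0;
    }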