1 SupraLU SupraLU is a Fortran-90 interface to the SuperLU sparse matrix solver, written in an object-oriented style. The SuperLU library is written in the C programming language. For more information see its website: http://crd.lbl.gov/~xiaoye/SuperLU ... At present, only the serial version of the library is linked through this interface. Original author: A. Pletzer, Dec. 2003. Modifications and documentation: D. McCune, Feb. 2004 SupraLU is being made available as a module in the NTCC Modules Library, a code sharing projected funded by the US DOE Office of Fusion Energy Sciences. The Modules library is at http://w3.pppl.gov/NTCC ... Contact: D. McCune, dmccune@pppl.gov Copyright -- Princeton University -- PPPL 2 Introduction SuperLU is a collection of 3 ANSI C subroutine libraries for solving sparse linear systems of equations AX = B, where A is a square, non- singular, n x n sparse matrix, and X and B are dense n x n_rhs matrices, where n_rhs is the number of right-hand sides and solution vectors. (ref: User Guide: see http://crd.lbl.gov/~xiaoye/SuperLU ... ). SuperLU comes in a serial flavor and two parallel flavors: Posix- threaded and MPI-distributed memory. At present the Fortran interface SupraLU is restricted to the serial SuperLU library, and, ==> n_rhs = 1 <== is assumed, i.e. the context is solutions for sparse A of linear systems Ax = b, where x and b are vectors of length n. Future Enhancements to SupraLU will depend on indications of user interest. Two data type/precisions are supported: 64 bit real (real*8) 128 bit complex (complex*16) The Fortran-90 program making use of SupraLU will use a public module which defines two data types: sparseR8_obj -- real*8 sparse matrix data type sparseC16_obj -- complex*16 sparse matrix data type instances of which can be declared in the programs using the module. Except for a debugging control (memory leak detection, described below), the module contains no data: just the data type definitions, and the methods for manipulating instances of these data types. These methods allow the following manipulations of sparse matrix objects: -- initialization, allocation, deallocation. -- insertion/update of data. -- copying; basic comparison of shapes & sparsity patterns. -- solution of Ax=b, where A is an n x n sparse matrix and x and b are n-vectors. -- calculation of det(A) -- (sparseC16_obj only): iterative eigenvector solver. -- save object in a file (NetCDF format). -- load object from a file (NetCDF format). -- error reporting. Typical method calls have the form: for real*8 sparse matrix data types: call sparseR8_(MR8, [action-dependent arguments...], ier) where: "MR8" is an instance of sparseR8_obj -- a real*8 sparse matrix, "[action-dependent arguments...]" specify details of the action-- e.g. a filename, for a save or load operation. "ier" is a status return code: 0=OK. For non-zero values, call sparseR8_error(MR8, 6, ier) writes the appropriate error message on fortran unit 6. for complex*16 sparse matrix data types: call sparseC16_(MC16, [action-dependent arguments...], ier) where: "MC16" is an instance of sparseR8_obj -- a complex*16 sparse matrix, "[action-dependent arguments...]" specify details of the action-- e.g. a filename, for a save or load operation. "ier" is a status return code: 0=OK. For non-zero values, call sparseR8_error(MC16, 6, ier) writes the appropriate error message on fortran unit 6. Each sparse matrix object contains: matrix elements: row indices, in ascending order, grouped column by column in ascending order of column index, along with the corresponding matrix element values. rank (n) of matrix. size of sparse matrix -- i.e. number of non-zero elements. base -- =1 for fortran-style indices (1:n) =0 for C-style indices (0:n-1) state information: (a) whether the matrix object been initialized; (b) whether the matrix object size and rank have been set and data arrays allocated; (c) whether corresponding C-based SuperLU matrix factorization object exists; address of C object; (d) whether SuperLU column permutation has been calculated; (e) whether matrix LU factorization has been calculated. SuperLU control: column permutation option. 2 debugging_method To generate a trace marking allocation and deallocation of SupraLU fortran sparse matrix objects and corresponding SuperLU solution objects: call supralu_memtrace(LUN) integer, intent(in) :: lun ! Fortran LUN of open file on which to write ! trace messages, or, 0 to disable feature. Over the course of execution of a program: for each allocation there should be a deallocation; if not, a memory leak may be present. 2 C_layer Supralu contains a fortran-callable C layer positioned between the fortran methods and the SuperLU C code itself. These routines are not documented here, but are present in the module source code: dsupralu_meth.c -- R8 C layer zsupralu_meth.c -- C16 C layer The C codes are compiled with a cpp macro F77NAME, so that the C routine names are exported to ld with the "decoration" -- usually a single trailing underscore -- wanted by the fortran compiler. F77NAME comes from a suite of NTCC portability tools and is supplied in the NTCC source code distribution-- look for the f77name.h header file. 2 Dependencies: SuperLU -- http://crd.lbl.gov/~xiaoye/SuperLU (makes use of blas linear algebra library) EzCDF -- http://w3.pppl.gov/NTCC (EzCDF module in the catalog). NetCDF -- http://www.unidata.ucar.edu/packages/netcdf 2 R8_methods Summary: use supralu_mod ! define sparseR8_obj and sparseC16_obj & methods... type(sparseR8_obj) :: MR8, MR8b ! ier is a status return code (integer): 0=OK ! MR8, MR8b are instances of SparseR8_obj matrices. call sparseR8_error(MR8,lun,ier) ! write error message ! to fortran unit (lun). call sparseR8_init(MR8) ! initialize object as unallocated. call sparseR8_alloc(MR8, rank, size, base, ier) ! allocate object ! rank = matrix rank, size = no. of non- ! zero elements, base=0 or 1 to indicate ! C- or Fortran-style element indexing. call sparseR8_get_permc(MR8, ipermc, ier) ! fetch permutation option call sparseR8_set_permc(MR8, ipermc, ier) ! set permutation option call sparseR8_set_next(MR8, irow, newcol_flag, value, ier) ! define matrix element, one element at a time call sparseR8_set_next_col(MR8, nrow, irow_vec, value_vec, ier) ! define next column of matrix elements ! irow_vec must be in ascending order call sparseR8_update_element(MR8, irow, jcol, value, ier) ! update element or define for first time. If ! the latter, blocks of calls must be given in ! order of ascending (jcol) values, with (irow) ! in strict ascending order within each block. call sparseR8_update_column(MR8, nrow, irow_vec, value_vec, jcol, ier) ! update a column of elements or define for ! the first time. If the latter, calls must ! be made in ascending order of (jcol). ! irow_vec must be in ascending order call sparseR8_get_size(MR8, nrows, ncols, ier) ! return no. of rows & no. of columns ! defined so far. call sparseR8_copy(MR8, MR8b, ier) ! copy MR8 to MR8b ! MR8 should contain data; MR8b should be an ! initialized object. C-object factorization ! is *not* copied. call sparseR8_dmark(MR8, ier) ! mark that element values of MR8 have ! changed (matrix needs to be re-factored). ! used if MR8 element values are directly ! manipulated by user code. call sparseR8_compare_shapes(MR8, MR8b, ier) ! compare sparsity patterns ! set ier if shapes / sparsity patterns differ. call sparseR8_save(MR8, filename, ier) ! save MR8 to NetCDF file. call sparseR8_load(MR8, filename, ier) ! restore MR8 from NetCDF file. !------------------------------ ! computational methods: call sparseR8_solve(MR8, rhs, ier) ! **solve** MR8*x = b ! rhs is replaced with the solution on output call sparseR8_A_dot_x(MR8, x, res, ier) ! matrix-vector dot product ! MR8.x -> res ! MR8: n x n sparse matrix; x & res: n-vectors. call sparseR8_det(MR8, res_mantissa, res_exponent, ier) ! determinant ! det(MR8) = res_mantissa * 2**res_exponent ! beware of overflow potential !------------------------------ call sparseR8_free(MR8, ier) ! deallocate object, free memory. 3 remarks A. The reason that sparseR8_init and sparseR8_alloc are separate calls is due to a quirk that distinguishes Fortran-90 from "true object oriented languages" like C++: there is NO automatic constructor; the state of a declared sparseR8_obj object in a program is initially UNDEFINED! sparseR8_init puts the object in a defined but unallocated state. SparseR8_alloc requires the object state to be defined so that it can take appropriate action. For example, if SparseR8_alloc is called on an object that is already allocated, its old memory is freed via a SparseR8_free call, before new memory is allocated. BUT if sparseR8_init is called on an already-allocated object, the associated memory is lost (a memory leak occurs)! Rules of thumb: 1. sparseR8_init ... an object only once, before first use. If there is no prior sparseR8_init call, any supraLU call will post an error code. 2. sparseR8_alloc ... as often as desired. A sparseR8_free call of any prior allocation will be generated. 3. sparseR8_free ... after last use of object, or if the object data is no longer needed. If an object is local to a subroutine, sparseR8_free it before exiting! This is not automatic! Take care to avoid memory leaks: there are no automatic destructors in Fortran-90! 4. sparseR8_free ... also frees associated C factorization data. Even if Fortran-2000 someday introduces automatic destructors for Fortran objects, the fact that sparseR8_obj is a *mixed language* object means sparseR8_free calls will still be needed. 5. after sparseR8_free, an object is still in a known state, so, for any object instance, extra sparseR8_init calls after the initial sparseR8_init call are never required, and should be avoided due to the memory leak hazard. B. Copy and save/load operations only copy the fortran object contents; if the object copied from or saved contained an associated C object with factorization data, this information is *not* passed on to the object copied to or loaded from disk. This has performance implications. Overcoming this limitation is a possible future upgrade for this software. C. Factorization state does survive within individual objects. Thus if matrix A is factored due to a sparseR8_solve or sparseR8_det call, the factorization information is retained. If multiple A*x = b problems are to be solved with no change to A, the computationally expensive LU factorization of A is only evaluated once; subsequent solutions of A*x = b are much faster than the initial solution. D. If the rank or sparsity pattern (shape) of a matrix changes, it should generally be re-allocated and reconstructed in its entirety. If the shape does not change, but matrix element values do change, the matrix will need to be re-factored, but, a costly column permutation calculation will not need to be redone. When sparseR8_update_column or sparseR8_update_element methods are applied to modify elements in an existing matrix, the object state is reset to indicate LU factorization pending but with column permutation still intact. E. Although normally only non-zero elements are specified when building a sparse matrix, it is allowed for some elements to be zero as long as the whole matrix is not rendered singular. This allows "place-holders" to be inserted in a sparse matrix structure, so that a later update allows select new non-zero elements to be introduced. However, having a large number of zero-value place holders could degrade the efficiency of the LU factorization calculation, and is not recommended. 3 Code_example The source code for the supralu_mod modules (supralu_mod.f90) contains its own test methods, sparseR8_test and sparseC16_test; more extensive code examples can be found by examining this code. A shortened example is given here... program foo use supralu_mod implicit NONE type(sparseR8_obj) :: MR8 integer :: ier integer rank, size, base integer i integer, parameter :: r8 = selected_real_kind(12,100) real(r8), dimension(:), allocatable :: x, rhs, res call sparseR8_init(MR8) rank = 5 ! n=5, n x n matrix size = 11 ! no. of nominal non-zero elements in sparse matrix base = 0 ! C-like indexing (0:n-1) for row/column indices call sparseR8_alloc(MR8, rank, size, base, ier) if(ier.ne.0) call sparseR8_error(MR8, 6, ier) ! matrix to store: MR8 (rhs) ! [ 0 2 x 7 x ] [x(1)] [1] ! [ 1 3 x x x ] MR8*[x(2)] = [2] ! [ x x 4 x x ] [x(3)] [3] ! [ x x 5 8 0 ] [x(4)] [4] ! [ x x 6 x 9 ] [x(5)] [5] ... will be solved. ! call sparseR8_update_column(MR8, nr, irow(1:nr), val(1:nr), jcol, ier) ! new matrix: columns must be presented in left to right order! call sparseR8_update_column(MR8, 2, (/0, 1/), (/0.0_r8, 1.0_r8/), 0, ier) call sparseR8_update_column(MR8, 2, (/0, 1/), (/2.0_r8, 3.0_r8/), 1, ier) call sparseR8_update_column(MR8, 2, (/2, 3, 4/), (/4.0_r8, 5.0_r8, 6.0_r8/),& & 2, ier) call sparseR8_update_column(MR8, 2, (/0, 3/), (/7.0_r8, 8.0_r8/), 3, ier) call sparseR8_update_column(MR8, 2, (/3, 4/), (/0.0_r8, 9.0_r8/), 4, ier) ! if base=1 had been used, the integer values in the irow(:) and jcol ! arguments would have had fortran-style index values (1:5) instead of ! (0:4)... i.e. adding +1 to each value. ! now form, solve, and check results of Ax=b problem... allocate(x(rank), rhs(rank), res(rank)) rhs = (/ (i, i=1,rank) /) x = rhs call sparseR8_solve(MR8, x, ier) if(ier.ne.0) call sparseR8_error(MR8, 6, ier) call sparseR8_A_dot_x(MR8, x, res, ier) write(6,*) ' measure of error in solution: ', & sqrt(abs(sum((res - rhs)**2)/rank)) call sparseR8_free(MR8, ier) deallocate(x, rhs, res) end program foo 3 Element_indexing:base When sparse matrix objects are allocated, a "base" argument is specified with value zero (0) or one (1). => 0 means: element indexing C-style -- i.e. indices run from 0 to n-1 for a rank n matrix. => 1 means: element indexing fortran-style -- i.e. indices run from 1 to n for a rank n matrix. 3 SparseR8_init subroutine sparseR8_init(this) ! most basic initialization of sparseR8_obj -- put pointers in ! defined NULL state. implicit none type(sparseR8_obj) :: this 3 SparseR8_alloc Note: rank & size are actually upper bounds; after allocation, smaller matrices may be defined. subroutine sparseR8_alloc(this, rank, size, base, ier) ! allocate memory for sparse matrix object. Object must have ! been initialized. If memory was already allocated, it is ! freed first. implicit none type(sparseR8_obj) :: this integer, intent(in) :: rank ! rank (n) of sparse matrix integer, intent(in) :: size ! no. of non-zero elements << n**2 integer, intent(in) :: base ! =0: C-indexing; =1: Fortran-indexing integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_get_permc column permutation option defined for SuperLU. Default should be used by non-experts. subroutine sparseR8_get_permc(this, ipermc, ier) implicit NONE type(sparseR8_obj) :: this integer, intent(out) :: ipermc ! permutation control (returned) integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_set_permc column permutation option defined for SuperLU. Default should be used by non-experts. subroutine sparseR8_set_permc(this, ipermc, ier) implicit NONE type(sparseR8_obj) :: this integer, intent(in) :: ipermc ! permutation control (set) integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_set_next see note on element indexing. subroutine sparseR8_set_next(this, row, newcol_flag, value, ier) ! Set next value. Must iterate down each column ! running from left to right. implicit none type(sparseR8_obj) :: this integer, intent(in) :: row ! element row index logical, intent(in) :: newcol_flag ! .TRUE. if new column, else .FALSE. real(r8), intent(in) :: value ! element value integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_set_next_col see note on element indexing. subroutine sparseR8_set_next_col(this, nrow, rowvec, valvec, ier) ! Set next column of values. Must iterate over columns ! running from left to right. implicit none type(sparseR8_obj) :: this integer, intent(in) :: nrow ! no. of non-zero row elements integer, intent(in) :: rowvec(nrow) ! element indices (ascending order) real(r8), intent(in) :: valvec(nrow) ! element values integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_update_element see note on element indexing. subroutine sparseR8_update_element(this, irow, jcol, value, ier) ! update specified element of sparse matrix ! if matrix has been factored, only an existing element can be ! updated; if the matrix is new, new elements may be added, but ! *only* in increasing order of row index within increasing order ! of column index. implicit none type(sparseR8_obj) :: this integer, intent(in) :: irow ! element row index integer, intent(in) :: jcol ! element column index real(r8), intent(in) :: value ! element value integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_update_column see note on element indexing. subroutine sparseR8_update_column(this, nrow, irows, vals, jcol, ier) ! update elements in a column. ! if this is an update to elements of an existing column in a matrix ! that has already been defined, then, the rows specified must match ! exactly the sparsity pattern in the matrix as it was originally ! defined. ! if the matrix is new, then, columns must be presented in ascending ! order, i.e. moving from left to right. implicit none type(sparseR8_obj) :: this integer, intent(in) :: nrow ! no. of rows integer, intent(in) :: irows(nrow) ! row indices real(r8), intent(in) :: vals(nrow) ! corresponding element values integer, intent(in) :: jcol ! column index integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_get_size subroutine sparseR8_get_size(this, nrows, ncols, ier) ! Return no of rows and columns implicit none type(sparseR8_obj) :: this integer, intent(out) :: nrows, ncols integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_copy subroutine sparseR8_copy(this, that, ier) ! Copy this into that. this should contain data; ! that should be initialized. implicit none type(sparseR8_obj) :: this, that integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_dmark Note: this routine is called automatically if sparseR8_update_element or sparseR8_update_column is called. An explicit call to this routine is needed if user code manipulates object data members directly, without going through method calls. subroutine sparseR8_dmark(this, ier) ! mark that data has changed-- LU factorization needs to be ! recomputed implicit none type(sparseR8_obj) :: this integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_compare_shapes subroutine sparseR8_compare_shapes(this, that, ier) ! Compare two sparse matrices ! status code is set if shapes do not match, or if there is ! some other error, e.g. "this" or "that" not initialized. implicit none type(sparseR8_obj) :: this, that integer, intent(out) :: ier ! status code; 0 = SHAPES MATCH 3 SparseR8_save subroutine sparseR8_save(this, filename, ier) ! Save object data in netCDF file 'filename' use ezcdf implicit none type(sparseR8_obj) :: this character(*), intent(in) :: filename integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_load subroutine sparseR8_load(this, filename, ier) ! Load data from netCDF file 'filename' ! Memory associated with existing data is deallocated first. use ezcdf implicit none type(sparseR8_obj) :: this character(*), intent(in) :: filename integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_solve subroutine sparseR8_solve(this, rhs, ier) ! Solve matrix system A*x=b ! "this" is A, "rhs" is b on input, x on output. implicit none type(sparseR8_obj) :: this real(r8), intent(inout) :: rhs(:) ! right-hand side vector; ! solution vector upon return integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_A_dot_x subroutine sparseR8_A_dot_x(this, x, res, ier) ! matrix . x => res implicit none type(sparseR8_obj) :: this real(r8), intent(in) :: x(:) ! vector real(r8), intent(out) :: res(:) ! result integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_det Caution: determinant value returned in mantissa-exponent form due to overflow possibility for large matrices! subroutine sparseR8_det(this, res_mantissa, res_exponent, ier) ! Return the determinant of A in the form of ! res_mantissa * 2**res_exponent implicit none type(sparseR8_obj) :: this real(r8), intent(out) :: res_mantissa integer, intent(out) :: res_exponent integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_free ! Destructor -- deallocate memory associated with object ! this includes associated C object memory (if any). implicit none type(sparseR8_obj) :: this integer, intent(out) :: ier ! status code; 0=OK 3 SparseR8_error subroutine sparseR8_error(this, nunit, ier) ! Error handler -- write message implicit none type(sparseR8_obj) :: this integer, intent(in) :: nunit ! output channel (Fortran LUN) integer, intent(in) :: ier ! status code **INPUT** 2 C16_methods See the R8_methods: For every sparse_R8_ method, there is a precisely analogous sparseC16_ method, with identical arguments except that the sparseR8_obj arguments are sparseC16_obj arguments instead. There is one exception. Because even R8 matrices can have complex eigenvector solutions, only a C16 version of the following routine is provided: subroutine sparseC16_eigen(Amat, Bmat, x, lambda, tol, nmax, ier) ! Solve A * x = lambda B * x by inverse iterations type(sparseC16_obj) :: Amat, Bmat ! Must have the same sparsity pattern! complex(r8), intent(inout) :: lambda ! initial guess/computed eigenvalue complex(r8), intent(inout) :: x(:) ! initial egenvector guess/computed eigenvector real(r8), intent(inout) :: tol ! tolerance integer, intent(inout) :: nmax ! max number of iterations integer, intent(out) :: ier