Docs: FAQ

General

  • How can I subscribe to the PETSc mailing lists?
  • What kind of parallel computers or clusters are needed to use PETSc?
  • What kind of license is PETSc released under?
  • Why is PETSc programmed in C, instead of Fortran or C++?
  • Does all the PETSc error checking and logging reduce PETSc's efficiency?
  • How does such a small group of people manage to write and maintain such a large and marvelous package as PETSc?
  • How do I know the amount of time spent on each level of the multigrid solver/preconditioner?
  • How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?
  • How do I collect all the values from a parallel PETSc vector into a vector on the zeroth (or any particular) processor?
Installation

  • How do I begin using PETSc if the software has already been completely built and installed by someone else?
  • The PETSc distribution is SO large. How can I reduce my disk space usage?
  • I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?
  • Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?
  • Why do you use MPI?
  • What do I do if my MPI compiler wrappers are invalid?
Usage

Execution

  • PETSc executables are SO big and take SO long to link.
  • PETSc has so many options for my program that it is hard to keep them straight.
  • PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?
  • Assembling large sparse matrices takes a long time. What can I do to make this process faster?
  • How can I generate performance summaries with PETSc?
  • Why do I get different answers on different numbers of processors?
Debugging

  • How do I turn off PETSc signal handling so I can use the -C option on xlF?
  • How do I debug on the Cray T3D/T3E?
  • How do I debug if -start_in_debugger does not work on my machine?
  • How can I see where my code is hanging?
  • How can I inspect Vec and Mat values when in the debugger?
Shared Libraries

  • Can I install PETSc libraries as shared libraries?
  • Why should I use shared libraries?
  • How do I link to the PETSc shared libraries?
  • What if I want to link to the regular .a library files?
  • What do I do if I want to move my executable to a different machine?
  • What is the deal with dynamic libraries (and how do they differ from shared libraries)?

    General

    How can I subscribe to the PETSc mailing lists?

    See miscellaneous/mailing-lists.html

    What kind of parallel computers or clusters are needed to use PETSc?

    PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent performance one needs a fast, low-latency interconnect between the nodes and good per-node memory bandwidth.

    What kind of license is PETSc released under?

    See the copyright notice. 

    Why is PETSc programmed in C, instead of Fortran or C++?

    C enables us to build data structures for storing sparse matrices, solver information, etc. in ways that Fortran simply does not allow. ANSI C is a complete standard that all modern C compilers support. The language is identical on all machines. C++ is still evolving and compilers on different machines are not identical. Using C function pointers to provide data encapsulation and polymorphism allows us to get many of the advantages of C++ without using such a large and more complicated language. It would be natural and reasonable to have coded PETSc in C++; we opted to use C instead.

    Does all the PETSc error checking and logging reduce PETSc's efficiency?

    No; the time spent in error checking and logging is a negligible fraction of the total run time.

    How does such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

    a) We work very efficiently.

    1. We use Emacs for all editing; the etags feature makes navigating and changing our source code very easy.
    2. Our manual pages are generated automatically from formatted comments in the code, thus alleviating the need for creating and maintaining manual pages.
    3. We employ automatic nightly tests of PETSc on several different machine architectures. This process helps us to discover problems the day after we have introduced them rather than weeks or months later.

    b) We are very careful in our design (and are constantly revising our design) to make the package easy to use, write, and maintain.

    c) We are willing to do the grunt work of going through all the code regularly to make sure that all code conforms to our interface design. We will never keep in a bad design decision simply because changing it will require a lot of editing; we do a lot of editing.

    d) We constantly seek out and experiment with new design ideas; we retain the useful ones and discard the rest. All of these decisions are based on practicality.

    e) Function and variable names are chosen to be very consistent throughout the software. Even the rules about capitalization are designed to make it easy to figure out the name of a particular object or routine. Our memories are terrible, so careful consistent naming puts less stress on our limited human RAM.

    f) The PETSc directory tree is carefully designed to make it easy to move throughout the entire package.

    g) Our bug reporting system, based on email to petsc-maint@mcs.anl.gov, makes it very simple to keep track of what bugs have been found and fixed. In addition, the bug report system retains an archive of all reported problems and fixes, so it is easy to refind fixes to previously discovered problems.

    h) We contain the complexity of PETSc by using object-oriented programming techniques including data encapsulation (this is why your program cannot, for example, look directly at what is inside the object Mat) and polymorphism (you call MatMult() regardless of whether your matrix is dense, sparse, parallel or sequential; you don't call a different routine for each format).

    i) We try to provide the functionality requested by our users.

    j) We never sleep.

    How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

    Run with -log_summary and -pc_mg_log

    How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?

  • Create a scatter context (and the sequential destination vector) with VecScatterCreateToAll(), as sketched below
  • Actually do the communication with VecScatterBegin/End(); this can be done repeatedly as needed
  • Remember to free the scatter context with VecScatterDestroy() when it is no longer needed
  • Note that this simply concatenates in the parallel ordering of the vector. If you are using a vector from DACreateGlobalVector() you likely want to first call DAGlobalToNaturalBegin/End() to scatter the original vector into the natural ordering in a new global vector before calling VecScatterBegin/End() to scatter the natural vector onto all processes.
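
    A minimal sketch (error checking omitted; here vpar is the user's parallel vector, and the exact calling sequences of the scatter routines vary slightly between PETSc versions, so check the manual pages for your release):

      Vec        vseq;   /* sequential vector holding a full copy of vpar on every process */
      VecScatter ctx;

      VecScatterCreateToAll(vpar,&ctx,&vseq);                        /* create context and destination vector */
      VecScatterBegin(ctx,vpar,vseq,INSERT_VALUES,SCATTER_FORWARD);  /* do the communication */
      VecScatterEnd(ctx,vpar,vseq,INSERT_VALUES,SCATTER_FORWARD);
      /* ... use the values in vseq ... */
      VecScatterDestroy(ctx);                                        /* free when no longer needed */
      VecDestroy(vseq);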

    How do I collect all the values from a parallel PETSc vector into a vector on the zeroth processor?

    The procedure is the same as in the previous question, but using VecScatterCreateToZero(); see the sketch below. Note that this simply concatenates in the parallel ordering of the vector. If you are using a vector from DACreateGlobalVector() you likely want to first call DAGlobalToNaturalBegin/End() to scatter the original vector into the natural ordering in a new global vector before calling VecScatterBegin/End() to scatter the natural vector onto process 0.
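
    VecScatterCreateToZero() creates a sequential vector that has the full length on process 0 and length zero on all other processes. A minimal sketch, under the same assumptions as in the previous question:

      Vec        vzero;  /* full length on process 0, length zero elsewhere */
      VecScatter ctx;

      VecScatterCreateToZero(vpar,&ctx,&vzero);
      VecScatterBegin(ctx,vpar,vzero,INSERT_VALUES,SCATTER_FORWARD);
      VecScatterEnd(ctx,vpar,vzero,INSERT_VALUES,SCATTER_FORWARD);
      /* ... use vzero on process 0 ... */
      VecScatterDestroy(ctx);
      VecDestroy(vzero);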

     


    Installation

    How do I begin using PETSc if the software has already been completely built and installed by someone else?

    Assuming that the PETSc libraries have been successfully built for a particular architecture and level of optimization, a new user must merely:

    a) Set the environment variable PETSC_DIR to the full path of the PETSc home directory (for example, /home/username/petsc).

    b) Set the environment variable PETSC_ARCH, which indicates the configuration on which PETSc will be used. Note that PETSC_ARCH is simply a name the installer used when installing the libraries. There may be several on a single system, such as mylinux-g for the debug version of the library and mylinux-O for the optimized version, or petscdebug for the debug version and petscopt for the optimized version. (An example of setting these variables is given after this list.)

    c) Begin by copying one of the many PETSc examples (in, for example, petsc/src/ksp/examples/tutorials) and its corresponding makefile.

    d) See the introductory section of the PETSc users manual for tips on documentation.
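
    For steps (a) and (b), for example (bash/sh syntax; use setenv under csh/tcsh, and substitute the path and name the installer actually chose):

      export PETSC_DIR=/home/username/petsc
      export PETSC_ARCH=mylinux-g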

    The PETSc distribution is SO large. How can I reduce my disk space usage?

    a) The directory ${PETSC_DIR}/docs contains a set of HTML manual pages for use with a browser. You can delete these pages to save about 0.8 Mbyte of space.

    b) The PETSc users manual is provided in PDF in ${PETSC_DIR}/docs/manual.pdf. You can delete this. 

    c) The PETSc test suite contains sample output for many of the examples. These are contained in the PETSc directories ${PETSC_DIR}/src/*/examples/tutorials/output and ${PETSC_DIR}/src/*/examples/tests/output. Once you have run the test examples, you may remove all of these directories to save about 300 Kbytes of disk space.

    d) The debugging versions of the libraries are larger than the optimized versions. In a pinch you can work with the optimized version, although we do not recommend it generally because finding bugs is much easier with the debug version.

    I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

    No, run config/configure.py with the option --with-mpi=0

    Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?

    Yes. Run config/configure.py with the additional flag --with-x=0

    Why do you use MPI?

    MPI is the message-passing standard. Because it is a standard, it will not change over time; thus, we do not have to change PETSc every time the provider of the message-passing system decides to make an interface change. MPI was carefully designed by experts from industry, academia, and government labs to provide the highest quality performance and capability. For example, the careful design of communicators in MPI allows the easy nesting of different libraries; no other message-passing system provides this support. All of the major parallel computer vendors were involved in the design of MPI and have committed to providing quality implementations. In addition, since MPI is a standard, several different groups have already provided complete free implementations. Thus, one does not have to rely on the technical skills of one particular group to provide the message-passing libraries. Today, MPI is the only practical, portable approach to writing efficient parallel numerical software.

    What do I do if my MPI compiler wrappers are invalid?

    Most MPI implementations provide compiler wrappers (such as mpicc) which give the include and link options necessary to use that version of MPI with the underlying compilers. If these wrappers are absent or broken in the MPI pointed to by --with-mpi-dir, you can rerun configure with the additional option --with-mpi-compilers=0, which will try to auto-detect working compilers; however, those compilers may be incompatible with the particular MPI build. If this does not work, run with --with-cc=c_compiler where you know c_compiler works with this particular MPI, and likewise for C++ and Fortran.

     


    Usage

    I want to use hypre BoomerAMG without GMRES, but when I run with -pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly I do not get a very accurate answer!

    You should run with -ksp_type richardson so that PETSc applies several V or W cycles; -ksp_type preonly causes BoomerAMG to apply only a single V/W cycle. You can control how many cycles are used in a single application of the BoomerAMG preconditioner with -pc_hypre_boomeramg_max_iter (the default is 1), and you can control the tolerance BoomerAMG uses to decide whether to stop before max_iter with -pc_hypre_boomeramg_tol (the default is 1.e-7). Run with -ksp_view to see all the hypre options used, and with -help | grep boomeramg to see all the available command line options.
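
    For example (the executable name and process count are placeholders):

      mpiexec -n 4 ./myprog -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg \
          -pc_hypre_boomeramg_max_iter 2 -ksp_view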

    You have AIJ and BAIJ matrix formats, and SBAIJ for symmetric storage; why is there no SAIJ?

    Just for historical reasons; the SBAIJ format with block size one is just as efficient as an SAIJ format would be.

     How do I use PETSc for Domain Decomposition?

    PETSc includes Additive Schwarz methods in the suite of preconditioners. These may be activated with the runtime option
    -pc_type asm
    Various other options may be set, including the degree of overlap
    -pc_asm_overlap <number>
    and the type of restriction/extension
    -pc_asm_type [basic,restrict,interpolate,none]
    You may see all of the available ASM options by using
    -pc_type asm -help
    Also see the procedural interfaces in the manual pages, with names PCASMxxx(), and check the index of the users manual for PCASMxxx().
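
    The same choices can also be made from the code; a minimal sketch (error checking omitted), assuming ksp is the KSP solver being set up:

      PC pc;
      KSPGetPC(ksp,&pc);
      PCSetType(pc,PCASM);
      PCASMSetOverlap(pc,2);             /* overlap of 2 between subdomains */
      PCASMSetType(pc,PC_ASM_RESTRICT);  /* restricted additive Schwarz */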

    Note that Paulo Goldfeld contributed a preconditioner "nn", a version of the balancing Neumann-Neumann preconditioner; this may be activated via
    -pc_type nn
    The program petsc/src/contrib/oberman/laplacian_ql contains an example of its use.


     Can I create BAIJ matrices with different size blocks for different block rows?

    Sorry, this is not possible; the BAIJ format supports only a single fixed block size for the entire matrix. However, the AIJ format automatically searches for matching rows and thus still takes advantage of the natural blocks in your matrix to obtain good performance. Unfortunately, you cannot use MatSetValuesBlocked() in this case.

    Execution

    PETSc executables are SO big and take SO long to link.

    We find this annoying as well. On most machines PETSc can use shared libraries, so the executables should be much smaller; run config/configure.py with the additional option --with-shared. Also, if you have room, compiling and linking PETSc on your machine's /tmp disk or a similar local disk, rather than over the network, will be much faster.

    PETSc has so many options for my program that it is hard to keep them straight.

    Running the PETSc program with the option -help will print many of the options. To see which options were set, employ -optionsleft, which prints all the options used by the program as well as any options that the user specified but that were never actually used; this is helpful for detecting typos.

    PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

    You can use the option -info to get more details about the solution process. The option -log_summary provides details about the distribution of time spent in the various phases of the solution process. You can use ${PETSC_DIR}/bin/petscview, which is a Tk/Tcl utility that provides high-level visualization of the computations within a PETSc program. This tool illustrates the changing relationships among objects during program execution in the form of a dynamic icon tree.

    Assembling large sparse matrices takes a long time. What can I do to make this process faster?

    See the Performance chapter of the users manual for many tips on this.

    a) Preallocate enough space for the sparse matrix. For example, rather than calling MatCreateSeqAIJ(comm,n,n,0,PETSC_NULL,&mat); call MatCreateSeqAIJ(comm,n,n,rowmax,PETSC_NULL,&mat); where rowmax is the maximum number of nonzeros expected per row. Or, if you know the number of nonzeros per row, you can pass this information in instead of the PETSC_NULL argument (see the sketch after this list). See the manual pages for each of the MatCreateXXX() routines.

    b) Insert blocks of values into the matrix, rather than individual components. 
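
    A minimal preallocation sketch (error checking omitted; the matrix size and per-row nonzero counts are made up for illustration):

      PetscInt n = 100, i, *nnz;
      Mat      A;

      PetscMalloc(n*sizeof(PetscInt),&nnz);
      for (i=0; i<n; i++) nnz[i] = 3;                /* e.g. a tridiagonal matrix: 3 nonzeros per row */
      MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,0,nnz,&A); /* the nz argument is ignored when nnz is given */
      PetscFree(nnz);
      /* ... insert values with MatSetValues(), then MatAssemblyBegin/End() ... */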

    How can I generate performance summaries with PETSc?

    Use the runtime option -log_summary. See the Performance chapter of the users manual for information on interpreting the summary data. If using the PETSc (non)linear solvers, one can also specify -snes_view or -ksp_view for a printout of solver info. Only the highest level PETSc object used needs to specify the view option.

    Why do I get different answers on different numbers of processors?

    Most commonly, you are using a preconditioner whose behavior depends on the number of processors, such as block Jacobi, which is the PETSc default. In addition, since computations are reordered in parallel, small roundoff differences will be present even with mathematically identical formulations. If you set a tighter linear solver tolerance (using -ksp_rtol), the differences will decrease.


    Debugging

    How do I turn off PETSc signal handling so I can use the -C option on xlF?

    Immediately after calling PetscInitialize() call PetscPopSignalHandler()

    Some Fortran compilers, including IBM's xlf and xlF, have a compile option (-C for IBM's) that causes all Fortran array accesses to be checked to ensure they are in bounds. This is a great feature, but it does require that the array dimensions be set explicitly, not with a *.

    How do I debug on the Cray T3D/T3E?

    Use TotalView. First, link your program with the additional option -Xn, where n is the number of processors to use when debugging. Then run totalview programname -a <your arguments>. The -a is used to distinguish between TotalView's arguments and yours.

    How do I debug if -start_in_debugger does not work on my machine?

    For a uniprocessor job, just try the debugger directly, for example: gdb ex1

    How do I see where my code is hanging?

    You can use the -start_in_debugger option to start all processes in the debugger (each will come up in its own xterm). Then use cont (for continue) in each xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then use 'where' to print a stack trace for each process.
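
    For example, in each xterm the gdb session would look roughly like this (^C denotes typing control-c):

      (gdb) cont
      ^C
      (gdb) where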

    How can I inspect Vec and Mat values when in the debugger?

    I will illustrate this with gdb, but it should be similar on other debuggers. You can look at local Vec values directly by obtaining the array. For a Vec v, we can print all local values using

    (gdb) p ((Vec_Seq*) v->data)->array[0]@v->n
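
    (The gdb @ operator used here prints v->n consecutive elements of the array, starting at array[0].)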

    However, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a Vec v and a Mat m, this would be

    (gdb) call VecView(v, 0)

    (gdb) call MatView(m, 0)

    or with a communicator other than MPI_COMM_WORLD,

    (gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))


    Shared Libraries

    Can I install PETSc libraries as shared libraries?

    Yes. Use the config/configure.py option --with-shared

    Why should I use shared libraries?

    When you link to shared libraries, the function symbols from the shared libraries are not copied into the executable. This way the executable is considerably smaller than when using regular (static) libraries. This helps in a couple of ways:
        1) it saves disk space when more than one executable is created, and
        2) it improves the link time immensely, because a much smaller file (the executable) has to be written to disk.

    How do I link to the PETSc shared libraries?

    By default, the compiler should pick up the shared libraries instead of the regular ones. Nothing special should be done for this.

    What if I want to link to the regular .a library files?

    You must run config/configure.py without the option --with-shared (you can use a different PETSC_ARCH for this build so you can easily switch between the two).

    What do I do if I want to move my executable to a different machine?

    You would also need to have access to the shared libraries on the new machine. The alternative is to build the executable without shared libraries, by first deleting the shared libraries and then relinking the executable.

    What is the deal with dynamic libraries (and how do they differ from shared libraries)?

    PETSc libraries are installed as dynamic libraries when the config/configure.py flag --with-dynamic is used. The difference from shared libraries is the way the libraries are used: the program loads the library at run time using dlopen() and looks up functions using dlsym(). This moves the resolution of function names from link time to run time, i.e., to when dlopen()/dlsym() are called.

    When using dynamic libraries, the PETSc libraries cannot be moved to a different location after they are built.