Tracing MPI Jobs
- Find out where jobs are running:
- qstat -n1 | grep <user>
shows the PBS job id (pbsid) and the nodes it runs on
- or -
- qstat <pbsid> -f
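A minimal sketch for turning the node list reported by qstat into plain host names. It assumes the usual Torque form node/cpu+node/cpu (for example node01/0+node01/1+node02/0); the function name unique_nodes is hypothetical and not part of PBS.

```shell
# unique_nodes: print each distinct node name from a Torque-style
# host list, one per line, in order of first appearance.
# Assumption: entries are separated by '+' and each entry looks like
# <node>/<cpu>, as in the node column of "qstat -n1".
unique_nodes() {
    printf '%s\n' "$1" |
        tr '+' '\n' |        # one entry per line
        sed 's|/.*||' |      # drop the /<cpu> suffix
        awk '!seen[$0]++'    # keep the first occurrence only
}
```

Usage: unique_nodes 'node01/0+node01/1+node02/0' prints node01 and node02 on separate lines; each name can then be used as <node> in the ssh commands on this page.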
- See what is running:
ssh <node> ps -efw --forest |grep <user>
- $WORKDIR
WORKDIR is /l/<node>/<user>/work/<TOK>
nubeam files are in: /l/<node>/<user>/work/<TOK>/<runid>_fi/
- PBS log
- log in to the node of rank 0 (serial job)
ssh -l <user> <node>
- more /var/spool/torque/spool/<pbsid>.bennu.OU
- killing a hung job
and retrieving files
- ssh <node> ps -ef |grep EXE
to get pid
- ssh <node> kill <pid>
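A minimal sketch of the "get pid" step, assuming the standard ps -ef column layout (UID PID PPID C STIME TTY TIME CMD). The function name pids_for is hypothetical; it reads ps -ef output on stdin and prints only the pids of processes owned by the given user whose command contains the given string.

```shell
# pids_for <user> <pattern>: filter "ps -ef" output read from stdin.
# Assumption: field 1 is the owner, field 2 the pid, and the command
# starts at field 8 (standard "ps -ef" layout).
pids_for() {
    awk -v user="$1" -v pat="$2" '
        $1 == user {
            cmd = ""
            for (i = 8; i <= NF; i++) cmd = cmd " " $i   # rebuild CMD
            if (index(cmd, pat) > 0) print $2            # print the pid
        }'
}
```

Usage: ssh <node> ps -ef | pids_for <user> EXE lists the candidate pids; check them before passing one to ssh <node> kill <pid>.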
Last modified: Tue Jun 23 14:22:26 EDT 2009