MG 3/28/03
I've been following discussions of the TRANSP/MDSplus
implementation for JET and thought I’d set down a few thoughts and notes.
The issues are not completely straightforward because there
are almost always multiple TRANSP runs per tokamak shot. Sites need to adopt a scheme to manage
these runs within the MDSplus context.
It is also useful to have a well-defined approach for eliminating
erroneous, faulty, or otherwise uninteresting runs that are invariably created.
a. Separate trees for each run. The MDSplus shot number is the tokamak
shot number; trees are named TRANSP01, TRANSP02, etc.
advantages: real shot number is maintained; simple scenario for
removing bad runs.
disadvantages: need environment variables for TRANSP01-TRANSP##.
b. Use a single tree for all TRANSP runs; add sub-nodes for each run.
advantages: only one environment variable needed.
disadvantages: with many runs, the tree may get very large. Removing
bad runs requires continual hacking on an otherwise good tree. Still
requires external information to identify runs.
c. Separate trees for each run. Use the name TRANSP for all trees; MDSplus
shot numbers are arbitrary IDs. The tokamak shot number is data
contained in the tree.
advantages: only one environment variable needed; simple scenario for
removing bad runs.
disadvantages: tokamak shot number is not immediately apparent; a
mechanism for keeping track of runs is essential.
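
To make the comparison concrete, here is a minimal Python sketch of how a
single run would be addressed under each scheme. It does not call MDSplus;
it only computes names. The Run fields, the RUNnn sub-node name in scheme b
and the example numbers are hypothetical, and the last function assumes the
usual MDSplus convention of one <tree>_path environment variable per tree
name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    tokamak_shot: int  # real tokamak shot number
    run_index: int     # index of this run for that shot (schemes a and b)
    run_id: int        # arbitrary sequential ID (scheme c)


def address_a(run):
    """Scheme a: one tree per run (TRANSP01, TRANSP02, ...), real shot number."""
    return (f"TRANSP{run.run_index:02d}", run.tokamak_shot)


def address_b(run):
    """Scheme b: one TRANSP tree, each run under its own sub-node.

    The sub-node name RUNnn is a hypothetical choice for illustration.
    """
    return ("TRANSP", run.tokamak_shot, f"RUN{run.run_index:02d}")


def address_c(run):
    """Scheme c: every tree is named TRANSP; the MDSplus shot is an arbitrary ID."""
    return ("TRANSP", run.run_id)


def tree_path_variables(tree_names):
    """Environment variables needed, assuming one <tree>_path per tree name."""
    return sorted({name.lower() + "_path" for name in tree_names})
```

With this sketch, ten runs under scheme a need ten distinct path variables
(transp01_path, transp02_path, ...), while schemes b and c need only
transp_path, at the cost of a sub-node or an external lookup to find the
run.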
How we do things at C-Mod:
We adopted method a. for the TRANSP archives, but use method c. for
"working" runs. Here's how it works.
1. Using the IDL widget PRETRANSP, a new run is begun. This creates a
tree with the name TRANSP and a sequential integer for the MDSplus shot
number. (We call this the RUNID, a somewhat ambiguous designation with
respect to recent email.) At the same time, this information and the
actual C-Mod shot number (and much more) are entered into a relational
database table. Since all our TRANSP tools use the relational database to
find and open runs, the use of an arbitrary RUNID is more or less
transparent to users.
2. Once a run has completed successfully, it can be examined
with another IDL widget called MG (for multigraph - modeled on an older PPPL
tool).
3. After a short period of time (usually only hours or days)
the user decides if the run is worth keeping.
The criteria are not strict. If
the user thinks there is a reasonable chance that they will do physics with the
run or use it as a start for future runs, it should be kept.
4. A third IDL widget (POSTRANP) manages runs. It is used for archiving good runs and
deleting bad ones, etc.
4a. When good runs are archived, the trees are renamed. The shot number is changed from the RUNID
to the C-Mod shot number and the tree name is changed from TRANSP to TRANSP##
(where ## is a sequential integer, starting at 00 for each different C-Mod
shot.) These changes are reflected in modifications to the relational database
table.
4b. When bad runs are removed, the tree files are deleted
and the corresponding entry in the runs database is marked void (but not
deleted).
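
The bookkeeping side of steps 1 through 4b can be sketched roughly as
follows, using Python and an in-memory SQLite table. This is only an
illustration under stated assumptions: the table and column names are
hypothetical (they are not the actual C-Mod schema), the example shot
number is illustrative, and the sketch updates database rows only. It does
not create, rename, or delete any MDSplus tree files.

```python
import sqlite3


def open_db():
    """Create an in-memory runs table (hypothetical columns, not the real schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE runs (
               run_id    INTEGER PRIMARY KEY AUTOINCREMENT,
               cmod_shot INTEGER NOT NULL,
               tree      TEXT NOT NULL DEFAULT 'TRANSP',
               mds_shot  INTEGER,
               status    TEXT NOT NULL DEFAULT 'working')"""
    )
    return db


def begin_run(db, cmod_shot):
    """Step 1: new working run; tree is TRANSP, MDSplus shot is the RUNID."""
    run_id = db.execute(
        "INSERT INTO runs (cmod_shot) VALUES (?)", (cmod_shot,)
    ).lastrowid
    db.execute("UPDATE runs SET mds_shot = ? WHERE run_id = ?", (run_id, run_id))
    return run_id


def archive_run(db, run_id):
    """Step 4a: record the rename to TRANSP## with the real shot number."""
    (cmod_shot,) = db.execute(
        "SELECT cmod_shot FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    # ## is sequential per shot, starting at 00: count runs already archived.
    (n_archived,) = db.execute(
        "SELECT COUNT(*) FROM runs WHERE cmod_shot = ? AND status = 'archived'",
        (cmod_shot,),
    ).fetchone()
    tree = f"TRANSP{n_archived:02d}"
    db.execute(
        "UPDATE runs SET tree = ?, mds_shot = ?, status = 'archived' WHERE run_id = ?",
        (tree, cmod_shot, run_id),
    )
    return tree, cmod_shot


def void_run(db, run_id):
    """Step 4b: mark the entry void; the row itself is kept."""
    db.execute("UPDATE runs SET status = 'void' WHERE run_id = ?", (run_id,))
```

For example, three calls to begin_run for the same shot give RUNIDs 1, 2
and 3; archiving the first and third and voiding the second leaves trees
TRANSP00 and TRANSP01 recorded against the real shot number, with the
voided row still present for reference.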
(However, if I had it to do over again, I would probably
adopt method c. for all runs. While method c. requires an
external mechanism for finding runs, some sort of run management
scheme is highly desirable (perhaps essential) in any case - it quickly
becomes difficult to remember which runs were done for which purpose.
For all the schemes above, one needs a means to decide whether the
1st, 12th, or 53rd run is the one you wish to open. Also, if several people are
doing runs, you can often avoid duplication if you can find out what other
people have done.)
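
As a rough illustration of the kind of lookup meant here, the Python
sketch below lists the non-void runs recorded for a shot together with who
made them and why. The record fields (user, purpose) and the sample values
in the usage note are hypothetical; a real run selector would query the
relational database rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    run_id: int
    shot: int
    user: str
    purpose: str
    void: bool = False


def runs_for_shot(records, shot):
    """Return the runs for one shot that have not been marked void."""
    return [r for r in records if r.shot == shot and not r.void]


def describe(records):
    """Format one line per run so a user can pick the one to open."""
    return [f"{r.run_id:>5}  {r.user:<8}  {r.purpose}" for r in records]
```

Called on a list holding, say, a baseline run, a voided run and a
sensitivity scan for the same shot, runs_for_shot returns the two live
runs, and describe gives one line for each, which is enough to see what a
colleague has already done before starting a duplicate.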
The C-Mod TRANSP implementation has used a relational
database to track runs for almost 10 years now. The IDL/widget tools built for preparing data, visualizing
results, and managing TRANSP runs use a common interface to the database for
selecting from among the code runs. Recognizing that the problem of code run
management was generic, we designed a schema which could be extended to other
codes. This has been adopted by GA, who
built a new run selector to access this database. (A slightly out-of-date description of the database schema is
attached at the end of this document.)
Sharing this schema has allowed powerful tools to be shared across
applications and institutions. Further
adoption will make it more useful.
Model Tree initialization - Skeleton Trees
Because the list of variables output by TRANSP is
determined by the namelist settings at runtime, Doug felt strongly that the model tree should only be a skeleton
with output nodes added on the fly when the code completes. We have debated whether input nodes should
be treated the same way. This is a more
controversial issue since tools developed at MIT and GA for data preparation
need information that has been stored into the model. (This includes a list of default nodes that need to be loaded and
processed and default specifications for the processing.) My feeling is that the best way to handle
this is to have a minimalist model maintained by the TRANSP development team
and a set of site-specific TCL (or IDL) scripts that modify that tree to create
models suitable for each site. So far,
this issue has not been fully resolved.
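
The layering I have in mind can be sketched in Python as below. This is
not MDSplus code: the model is just a dictionary standing in for a tree,
and the branch names (INPUTS, OUTPUTS), the node names and the processing
labels in the usage note are all hypothetical. The point is only the
division of labor: a shared skeleton, a site script that adds input nodes
with default processing, and output nodes added when the code completes.

```python
def make_skeleton():
    """Minimal shared model: two empty branches (hypothetical names)."""
    return {"INPUTS": {}, "OUTPUTS": {}}


def apply_site_script(model, site_inputs):
    """Return a site-specific copy with default input nodes added.

    site_inputs maps a node name to its default processing specification.
    The shared skeleton passed in is left unchanged.
    """
    site_model = {branch: dict(nodes) for branch, nodes in model.items()}
    for name, processing in site_inputs.items():
        site_model["INPUTS"][name] = {"processing": processing}
    return site_model


def add_output_nodes(model, output_names):
    """Add output nodes on the fly once the list of outputs is known."""
    for name in output_names:
        model["OUTPUTS"].setdefault(name, {})
    return model
```

A site would call apply_site_script on the shared skeleton with its own
table of inputs, for example {"TE": "smooth"}, and add_output_nodes would
run after the code finishes with whatever variable list that run's
namelist settings produced.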
A common set of
modern tools for preparing data for TRANSP runs, submitting the code,
visualizing the results and managing runs is highly desirable. Starting with the set of IDL tools created
by Jeff Schachter, MIT and GA are in the process of creating such a set. The run preparation tool (PRETRANSP) and
the visualizer (MG) are in an advanced state of development. These are based on the assumption that
each run is a separate tree and that the common database schema is used for
managing runs.