I've been following discussions of the TRANSP/MDSplus implementation for JET and thought I’d set down a few thoughts and notes.
The issues are not completely straightforward because there are almost always multiple TRANSP runs per tokamak shot. Sites need to adopt a scheme to manage these runs within the MDSplus context. It is also useful to have a well-defined approach for eliminating the erroneous, faulty, or otherwise uninteresting runs that are invariably created. I see three options:
a. Separate trees for each run, shot-number uses tokamak shot-number, trees are named TRANSP01, TRANSP02, etc.
advantages: real shot number is maintained, simple scenario for removing bad runs
disadvantages: need environment variables for TRANSP01-TRANSP##
b. Use a single tree for all TRANSP runs, add sub-nodes for each run
advantages: only one environment variable needed
disadvantages: With many runs, the tree may get very large. Removing bad runs requires continual hacking on an otherwise good tree. Still requires external information to identify runs.
c. Separate trees for each run. Use TRANSP name for all trees, MDSplus shot numbers are arbitrary IDs. Tokamak shot number is data contained in the tree.
advantages: Only one environment variable needed, simple scenario for removing bad runs.
disadvantages: Tokamak shot number is not immediately apparent, a mechanism for keeping track of runs is essential.
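To make the comparison concrete, here is a minimal Python sketch of how one run would be addressed under each scheme. It does not touch MDSplus at all. The function names, the RUN## sub-node naming for scheme b, and the example numbers are my own inventions for illustration. The one thing taken from MDSplus is my understanding of its convention that a tree named X is located through an environment variable x_path, which is why scheme a needs one variable per tree name.

```python
def locate_run(scheme, tokamak_shot, run_index=None, run_id=None):
    """Return (tree_name, mdsplus_shot, sub_node) for one TRANSP run.

    sub_node is only used by scheme b; its RUN## naming is hypothetical.
    """
    if scheme in ("a", "b") and run_index is None:
        raise ValueError("schemes a and b need a run index")
    if scheme == "a":
        # One tree name per run; the MDSplus shot is the real shot number.
        return (f"TRANSP{run_index:02d}", tokamak_shot, None)
    if scheme == "b":
        # One tree per shot; each run lives under its own sub-node.
        return ("TRANSP", tokamak_shot, f"RUN{run_index:02d}")
    if scheme == "c":
        # One tree name; the MDSplus shot is an arbitrary run ID that
        # has to come from some external catalog.
        if run_id is None:
            raise ValueError("scheme c needs a run ID from a catalog")
        return ("TRANSP", run_id, None)
    raise ValueError(f"unknown scheme: {scheme!r}")


def path_variables(tree_names):
    """Environment variables needed to locate the given tree names."""
    return sorted({name.lower() + "_path" for name in tree_names})
```

For example, three archived runs under scheme a need transp00_path, transp01_path and transp02_path, while any number of runs under scheme c need only transp_path.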
How we do things at C-Mod:
We adopted method a. for the TRANSP archives, but use method c. for "working" runs. Here's how it works.
1. Using the IDL widget PRETRANSP, a new run is begun. This creates a tree with the name TRANSP and a sequential integer for the MDSplus shot number (we call this the RUNID, a somewhat ambiguous designation with respect to recent email). At the same time, this information and the actual C-Mod shot number (and much more) are entered into a relational database table. Since all our TRANSP tools use the relational database to find and open runs, the use of an arbitrary RUNID is more or less transparent to users.
2. Once a run has completed successfully, it can be examined with another IDL widget called MG (for multigraph - modeled on an older PPPL tool).
3. After a short period of time (usually only hours or days) the user decides if the run is worth keeping. The criteria are not strict. If the user thinks there is a reasonable chance that they will do physics with the run or use it as a start for future runs, it should be kept.
4. A third IDL widget (POSTRANP) manages runs. It is used for archiving good runs and deleting bad ones, etc.
4a. When good runs are archived, the trees are renamed. The shot number is changed from the RUNID to the C-Mod shot number and the tree name is changed from TRANSP to TRANSP## (where ## is a sequential integer, starting at 00 for each different C-Mod shot). These changes are reflected in modifications to the relational database table.
4b. When bad runs are removed, the tree files are deleted and the corresponding entry in the runs database is marked void (but not deleted).
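The bookkeeping in steps 1 through 4b can be sketched roughly as follows. This is only an illustration using Python's sqlite3 module with an in-memory database. The table layout, column names, and function names are assumptions of mine, not the actual C-Mod schema or the logic inside PRETRANSP and the run manager, and the renaming or deleting of the tree files themselves is left out.

```python
import sqlite3


def open_catalog():
    """Create an in-memory stand-in for the runs table (hypothetical columns)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE runs (
               run_id       INTEGER PRIMARY KEY,  -- sequential RUNID
               tokamak_shot INTEGER NOT NULL,
               tree_name    TEXT NOT NULL,
               mds_shot     INTEGER,
               status       TEXT NOT NULL         -- working, archived, void
           )"""
    )
    return db


def begin_run(db, tokamak_shot):
    """Step 1: register a working run; tree TRANSP, MDSplus shot = RUNID."""
    cur = db.execute(
        "INSERT INTO runs (tokamak_shot, tree_name, status) "
        "VALUES (?, 'TRANSP', 'working')",
        (tokamak_shot,),
    )
    run_id = cur.lastrowid
    db.execute("UPDATE runs SET mds_shot = ? WHERE run_id = ?", (run_id, run_id))
    return run_id


def archive_run(db, run_id):
    """Step 4a: rename to TRANSP## and switch to the tokamak shot number."""
    row = db.execute(
        "SELECT tokamak_shot FROM runs WHERE run_id = ? AND status = 'working'",
        (run_id,),
    ).fetchone()
    if row is None:
        raise ValueError(f"run {run_id} is not a working run")
    shot = row[0]
    # Next index for this shot, starting at 00: count runs already renamed.
    (n,) = db.execute(
        "SELECT COUNT(*) FROM runs "
        "WHERE tokamak_shot = ? AND tree_name <> 'TRANSP'",
        (shot,),
    ).fetchone()
    tree_name = f"TRANSP{n:02d}"
    db.execute(
        "UPDATE runs SET tree_name = ?, mds_shot = ?, status = 'archived' "
        "WHERE run_id = ?",
        (tree_name, shot, run_id),
    )
    return tree_name, shot


def void_run(db, run_id):
    """Step 4b: keep the database row but mark it void."""
    db.execute("UPDATE runs SET status = 'void' WHERE run_id = ?", (run_id,))
```

Keeping the voided row rather than deleting it means the RUNID is never reused and there is still a record that the run once existed.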
(However, if I had it to do over again, I would probably adopt method c. for all runs. While an external mechanism for finding runs is required, some sort of run management scheme is highly desirable (perhaps essential) - it quickly becomes difficult to remember what runs were done for what purpose. For all the schemes above, one needs a means to decide whether the 1st, 12th, or 53rd run is the one you wish to open. Also, if several people are doing runs, you can often avoid duplication if you can find out what other people have done.)
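As a rough illustration of the kind of lookup a run selector needs, the sketch below lists the non-void runs for one shot together with who made them and why. It again uses Python's sqlite3 module in memory. The username and purpose columns, the void flag, and the sample rows are made up for the example and are not taken from the real database.

```python
import sqlite3


def make_catalog(rows):
    """Build an in-memory runs table (hypothetical columns) from sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE runs (
               run_id       INTEGER PRIMARY KEY,
               tokamak_shot INTEGER,
               tree_name    TEXT,
               mds_shot     INTEGER,
               username     TEXT,
               purpose      TEXT,
               void         INTEGER DEFAULT 0
           )"""
    )
    db.executemany(
        "INSERT INTO runs "
        "(tokamak_shot, tree_name, mds_shot, username, purpose, void) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    return db


def list_runs(db, tokamak_shot):
    """Return (tree, MDSplus shot, user, purpose) for usable runs of a shot."""
    return db.execute(
        "SELECT tree_name, mds_shot, username, purpose FROM runs "
        "WHERE tokamak_shot = ? AND void = 0 ORDER BY run_id",
        (tokamak_shot,),
    ).fetchall()
```

The caller then opens whichever (tree, MDSplus shot) pair the user picks, so the same selector works whether a run is stored under scheme a or scheme c.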
The C-Mod TRANSP implementation has used a relational database to track runs for almost 10 years now. The IDL/widget tools built for preparing data, visualizing results, and managing TRANSP runs use a common interface to the database for selecting from among the code runs. Recognizing that the problem of code run management was generic, we designed a schema which could be extended to other codes. This has been adopted by GA, which built a new run selector to access this database. (A slightly out of date description of the database schema is attached at the end of this document.) Sharing this schema has allowed powerful tools to be shared across applications and institutions. Further adoption will make it more useful.
Model Tree initialization - Skeleton Trees
Because the list of variables output by TRANSP is determined by the namelist settings at runtime, Doug felt strongly that the model tree should only be a skeleton, with output nodes added on the fly when the code completes. We have debated whether input nodes should be treated the same way. This is a more controversial issue since tools developed at MIT and GA for data preparation need information that has been stored in the model. (This includes a list of default nodes that need to be loaded and processed and default specifications for the processing.) My feeling is that the best way to handle this is to have a minimalist model maintained by the TRANSP development team and a set of site-specific TCL (or IDL) scripts that modify that tree to create models suitable for each site. So far, this issue has not been fully resolved.
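The layering I have in mind can be sketched as below, with a plain Python dictionary standing in for a tree (node path mapped to node usage). This is not MDSplus code, and the node paths are hypothetical. A real version would do the same steps with TCL or IDL scripts editing the model tree.

```python
# Minimal model owned by the TRANSP developers (hypothetical node paths).
SKELETON = {"INPUTS": "STRUCTURE", "OUTPUTS": "STRUCTURE"}


def make_site_model(skeleton, site_nodes):
    """Return a site model: the shared skeleton plus site-specific nodes."""
    model = dict(skeleton)  # copy, so the shared skeleton is never modified
    for path, usage in site_nodes.items():
        if path in model:
            raise ValueError(f"site script redefines {path}")
        model[path] = usage
    return model


def add_outputs(tree, names):
    """Add one signal node per variable the run actually wrote.

    Returns the paths that were newly created, in the order given.
    """
    added = []
    for name in names:
        path = f"OUTPUTS.{name}"
        if path not in tree:
            tree[path] = "SIGNAL"
            added.append(path)
    return added
```

The point of the split is that the skeleton stays identical everywhere, each site's additions live in one script, and the output nodes depend only on what a given run produced.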
A common set of modern tools for preparing data for TRANSP runs, submitting the code, visualizing the results, and managing runs is highly desirable. Starting with the set of IDL tools created by Jeff Schachter, MIT and GA are in the process of creating such a set. The run preparation tool (PRETRANSP) and the visualizer (MG) are in an advanced state of development. These are based on the assumption that each run is a separate tree and that the common database schema is used for managing runs.