Use Python scripts to automate development and job card submission
Simulations typically come with large amounts of output, but often also with a great variety of input files. Both sides (input and output) need to be handled automatically, and usually in dependence on one another. This page presents a collection of tools which can be found in SVN under workingCode/testRun/JobCardManagement.
Why Python?
On UNIX-like machines the natural choice for automating file handling would be bash scripts, and such efforts have already been undertaken for GiBUU.
For more complex applications, however, such scripts become practically unmaintainable: the lack of language functionality (e.g. string handling) has to be compensated by chains of external programs with varying syntax and error handling. This results in lines like
free | tr -s ' ' | sed '/^Mem/!d' | cut -d" " -f2-4 >> mem.stats
Python, by contrast, is a mature interpreted language: not only its high-level (object-oriented) design patterns but also its interactive debugging possibilities are very promising.
A standard development cycle
Here are some solutions to the daily GiBUU routine, using bash scripts where the problems are very straightforward and Python for the rest. All these scripts are bundled in the directory workingCode/testRun/JobCardManagement (general info in README_ODIC). The [custom] tag means that prior to executing such a script you should check its header for custom PATHs.
Write code and debug
Development versions of GiBUU should be compiled and executed on powerful machines before they are committed to SVN. One possibility is to use one of the tp workstations via ssh, in this fashion:
1) Create a debug folder in workingCode
2) Execute debug.sh [custom] to sync it to a folder on the nucleus file system
- This makes use of the script compile+test.sh, which compiles the code and submits all the jobs to the local queue
- Submission is managed by submit_job.py [custom]; the sketch below illustrates the whole cycle
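The following Python sketch only illustrates the sync-compile-submit cycle that these scripts automate; all paths, the make call and the remote host are placeholders, not the actual contents of debug.sh or compile+test.sh:

import glob
import subprocess

# 1) sync the local debug folder to the nucleus file system (placeholder paths)
subprocess.check_call(["rsync", "-a", "workingCode/debug/", "nucleus:scratch/debug/"])
# 2) compile the synced code remotely (the step compile+test.sh automates; placeholder make call)
subprocess.check_call(["ssh", "nucleus", "make -C scratch/debug"])
# 3) submit every job card to the local queue via submit_job.py
for card in glob.glob("*.job"):
    subprocess.check_call(["./submit_job.py", card])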
Generate job cards
Once your code runs with the job cards from debug, you might want to modify them, for instance to increase statistics or to study further effects. Useful tools are:
replace_strings.py
To increase statistics in all job cards, for example, set the numEnsembles variable to a higher number:
replace_strings.py --pattern="numEnsembles=",500 *.job
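Internally, such a replacement amounts to rewriting each file in place. A minimal sketch of the idea (not the actual implementation of replace_strings.py, and with the pattern and value hard-coded) could look like this:

import fileinput
import re
import sys

# Simplified stand-in for replace_strings.py: overwrite the token following
# the pattern "numEnsembles=" with a new value in every file on the command line.
pattern, value = "numEnsembles=", "500"
for line in fileinput.input(sys.argv[1:], inplace=True):
    # replace everything after the pattern up to the next comma or whitespace
    sys.stdout.write(re.sub(re.escape(pattern) + r"[^,\s]*", pattern + value, line))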
create_values.py
If you want to create a set of job cards where only some parameters vary, e.g. energy_li, try it in the following fashion:
create_values.py --var=energy_li= --low=1. --up=2. --steps=11 *.job
Other similar features, like "fixed q" or "transversal analysis", are also implemented.
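Conceptually, the script interpolates the values linearly between --low and --up and writes one job card per value. A minimal sketch with hard-coded arguments (not the actual implementation of create_values.py):

import re
import sys

# Simplified stand-in for create_values.py: write one job card per value of
# the variable, linearly spaced between low and up (here 1.0 .. 2.0 in 11 steps).
var, low, up, steps = "energy_li=", 1.0, 2.0, 11
template = open(sys.argv[1]).read()
for i in range(steps):
    value = low + i * (up - low) / (steps - 1)
    card = re.sub(re.escape(var) + r"[^,\s]*", "%s%g" % (var, value), template)
    open("energy_li_%02d.job" % i, "w").write(card)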
Submit all job cards
Once you have a set of job cards ready, you will want to have them computed on a cluster. One possibility is to copy them by hand, write a submit script for each, submit each one, and collect the randomly numbered outputs separately. Another is to let send_jobs.py [custom] do all of this in a fashion like
send_jobs.py --machine=skylla --queue=serial --project=p1 *.job
This means that all jobs will be copied to the skylla cluster, where they will be submitted to the serial queue; the links to the processes and the results will be collected in the folder p1.
send_jobs.py *.job
will send all jobs to hadron and create a meaningful project title by default.
This routine mostly relies on submit_job.py [custom] and manage_jobs.py [custom], which submit, link and manage the jobs on the cluster side (requires Python >= 2.5).
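To illustrate what such a submission step boils down to, here is a conceptual Python sketch; the script body and the qsub call are placeholders, and the real commands depend on the batch system of the cluster:

import subprocess
import sys

# Conceptual sketch of a submission step: write a one-off submit script for a
# job card and hand it to the batch system. Both the script body and the
# "qsub" call are placeholders; the real commands depend on the cluster.
card = sys.argv[1]
script = card + ".sh"
open(script, "w").write("#!/bin/sh\n./GiBUU.x < %s\n" % card)
subprocess.check_call(["qsub", script])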
Manage the output
Folder structure
If everything worked properly, your project folder will contain only the folders done and jobcards. Within done you will find the results of the different job cards in folders named after them. These folders will NOT contain redundant files like 'GiBUU.x'. In addition, if you specified a --target, special files which can be used for direct plotting will be extracted from the result directories.
Most of this sorting is done by sort_output.py and harvest_output.py.
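As an illustration of what this sorting involves, here is a simplified sketch (not the actual code of sort_output.py; the directory layout is assumed from the description above):

import glob
import os
import shutil

# Sketch of the sorting step: move each job's results into done/<jobcard name>
# and drop redundant files; a simplified stand-in for sort_output.py with
# hypothetical directory names.
REDUNDANT = ("GiBUU.x",)
for card in glob.glob("jobcards/*.job"):
    name = os.path.basename(card)[:-4]
    target = os.path.join("done", name)
    if not os.path.isdir(target):
        os.makedirs(target)
    for result in glob.glob(os.path.join(name, "*")):
        if os.path.basename(result) in REDUNDANT:
            os.remove(result)      # no redundant copies of the executable
        else:
            shutil.move(result, target)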
Data file manipulation
Of course there are many UNIX tools, like awk, for handling data files. Useful shorthands, however, are:
collect_data.py
Collect data from different output files into one.
extract_data.py
Extract columns from CSV-like data files into new files (see the sketch after this list).
plot_dat.py
Plot multiple files into one graph.
rename_files.py
Rename files according to strings contained within them.
Developing the scripts
Since most modules have a doctest routine, you can check that your contributions did not break anything by running
module.py --doctest
or doctest_all.sh.
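Such a --doctest switch typically wraps the standard doctest idiom shown below; the toy function is only there to give the module something to test:

import doctest
import sys

def double(x):
    """Toy function carrying a doctest.

    >>> double(3)
    6
    """
    return 2 * x

if __name__ == "__main__":
    # run all doctests in this module when called with --doctest
    if "--doctest" in sys.argv:
        failures, _ = doctest.testmod()
        sys.exit(1 if failures else 0)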
You can also have a look at the other files in the directory, which may not have a lot to do with GiBUU or are still in an alpha phase and thus not documented here. Do not hesitate to contact the author (Ivan.Lappo-Danilevski@theo.physik.uni-giessen.de).
Here you can see an example of the entire workflow in action.