== Use Python scripts to automate development and job card submission ==

Simulations typically produce large amounts of output, and sometimes they also require a great variety of input files. Both sides (input and output) need to be handled automatically, and usually in a coordinated way. This page presents a collection of tools for that purpose, which can be found in SVN under ''workingCode/testRun/JobCardManagement''.

=== Why Python? ===

On UNIX-like machines, a natural choice for automating file handling would be bash scripts. Such efforts have already been undertaken for GiBUU:
* [http://gibuu.physik.uni-giessen.de/internalFiles More internal information (only available within the local 134.176.18.* network)]
For more complex applications, however, such scripts become practically unmaintainable: the missing language features (e.g. string handling) have to be compensated for by chains of external programs with varying syntax and error handling. This results in lines like
{{{
free | tr -s ' ' | sed '/^Mem/!d' | cut -d" " -f2-4 >> mem.stats
}}}
Python, in contrast, is a mature interpreted language that offers not only high-level (object-oriented) design patterns but also interactive debugging. (A Python version of the line above is sketched at the end of this page.)

== A standard development cycle ==

Below are solutions to the daily GiBUU routine, using bash scripts where the problems are straightforward and Python for the rest. All these scripts are bundled in the directory ''workingCode/testRun/JobCardManagement'' (general information in '''README_ODIC'''). The [custom] tags mean that prior to executing such a script you should check its header for custom paths.

=== Write code and debug ===

Compilation and execution of development versions of GiBUU (prior to committing to SVN) should be done on powerful machines. One possibility is to use one of the ''tp'' workstations via ssh, in this fashion:
 1. Create a ''debug'' folder in ''workingCode''.
 2. Execute '''debug.sh''' [custom] to sync it to a folder on the nucleus file system.
   * This makes use of the script '''compile+test.sh''', which compiles the code and submits all the jobs to the local queue.
   * Submission is managed by '''submit_job.py''' [custom].

=== Generate job cards ===

Once your code runs with the job cards from ''debug'', you might want to modify them, e.g. to increase statistics or to study further effects. Useful tools are:

==== replace_strings.py ====
To increase statistics in all job cards, you can set the ''numEnsembles'' variable to a higher number:
{{{
replace_strings.py --pattern="numEnsembles=",500 *.job
}}}
(The substitution idea behind this tool is sketched at the end of this page.)

==== create_values.py ====
If you want to create a set of job cards where only some parameters vary, e.g. ''energy_li'', try it in the following fashion:
{{{
create_values.py --var=energy_li= --low=1. --up=2. --steps=11 *.job
}}}
Other similar features, like "fixed q" or "transversal analysis", are also implemented. (A sketch of such a parameter scan appears at the end of this page.)

=== Submit all job cards ===

Once you have a set of job cards ready, you will want to have them computed on a cluster. One possibility is to copy them by hand, write a submit script for each, submit each one, and collect the results by their random output numbers separately. Another is to let '''send_jobs.py''' [custom] do all of this in a fashion like
{{{
send_jobs.py --machine=skylla --queue=serial --project=p1 *.job
}}}
This means that all jobs are copied to the ''skylla'' cluster, submitted there to the ''serial'' queue, and the links to the processes and the results are collected in the folder ''p1''.
A plain
{{{
send_jobs.py *.job
}}}
will send all jobs to ''hadron'' and create a meaningful project title by default. Internally this routine relies on '''submit_job.py''' [custom] and '''manage_jobs.py''' [custom], which submit, link and manage the jobs on the cluster side (requires Python >= 2.5). (The copy-and-submit loop is sketched at the end of this page.)

=== Manage the output ===

==== Folder structure ====
If everything worked properly, your project folder will contain only the folders ''done'' and ''jobcards''. Within the folder ''done'' you will find the results of the different job cards, in folders named after them. These folders will NOT contain redundant files like 'GiBUU.x'. In addition, if you specified a ''--target'', special files that can be used for direct plotting are extracted from the result directories. Most of this sorting is done by '''sort_output.py''' and '''harvest_output.py'''.

==== Data file manipulation ====
Of course there are many UNIX tools like ''awk'' to handle data files. Nevertheless, some useful shorthands are:

==== collect_data.py ====
Collects data from different output files into one.

==== extract_data.py ====
Extracts columns from CSV-like data files into new files. (The column extraction is sketched at the end of this page.)

==== plot_dat.py ====
Plots multiple files into one graph.

==== rename_files.py ====
Renames files according to strings contained within them.

== Developing the scripts ==

As most modules have a ''doctest'' routine, you can check that your contributions did not break anything by using
{{{
module.py --doctest
}}}
or '''doctest_all.sh'''. (The pattern behind this switch is sketched at the end of this page.) You can also have a look at the other files in the directory, which may not have a lot to do with GiBUU or are still in an alpha phase and thus not documented here. Do not hesitate to contact the author (Ivan.Lappo-Danilevski@theo.physik.uni-giessen.de).

Here you can see an example of the [wiki:JobCardManagmentExample entire formalism in action].
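== Illustrative sketches ==

The following sketches illustrate the ideas referenced above. They are minimal, hedged examples, '''not''' the actual implementations of the tools; all file names, host names and helper scripts appearing in them are placeholders.

As an illustration of the "Why Python?" point, here is a rough Python (2.7+/3) equivalent of the shell pipeline shown there, appending the total/used/free memory fields to ''mem.stats'':
{{{
# Rough Python counterpart of:
#   free | tr -s ' ' | sed '/^Mem/!d' | cut -d" " -f2-4 >> mem.stats
import subprocess

output = subprocess.check_output(["free"]).decode()
for line in output.splitlines():
    fields = line.split()
    # keep only the line starting with "Mem" and its fields 2-4
    if fields and fields[0].startswith("Mem"):
        with open("mem.stats", "a") as stats:
            stats.write(" ".join(fields[1:4]) + "\n")
}}}
Every step is explicit and can be inspected in an interactive session, which is exactly the maintainability argument made above.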
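The substitution performed by '''replace_strings.py''' amounts to a regular-expression replace over all job cards. A minimal sketch of that idea (the regex and the in-place rewrite are assumptions about the job card format, not the actual implementation):
{{{
# Set numEnsembles to 500 in every .job file of the current directory.
import glob
import re

for path in glob.glob("*.job"):
    with open(path) as f:
        text = f.read()
    # assumes an integer value, e.g. "numEnsembles=100"
    text = re.sub(r"numEnsembles\s*=\s*\d+", "numEnsembles=500", text)
    with open(path, "w") as f:
        f.write(text)
}}}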
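'''create_values.py''' generates a series of job cards from one template while a single variable runs over a linear range. A minimal sketch of such a parameter scan (the template file name, the output naming and the value pattern are assumptions):
{{{
# Create 11 job cards with energy_li running from 1.0 to 2.0.
import re

low, up, steps = 1.0, 2.0, 11
with open("template.job") as f:
    template = f.read()

for i in range(steps):
    value = low + i * (up - low) / (steps - 1)
    # assumes a plain decimal value after "energy_li="
    card = re.sub(r"energy_li\s*=\s*[0-9.]+",
                  "energy_li=%.3f" % value, template)
    with open("energy_li_%.3f.job" % value, "w") as f:
        f.write(card)
}}}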
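Conceptually, '''send_jobs.py''' automates the manual copy-and-submit loop. The sketch below shows that loop only; the host name, the remote directory, the queue and the wrapper script ''run_gibuu.sh'' are placeholders, and the real script additionally links the processes and collects the results into the project folder:
{{{
# Copy each job card to the cluster and submit it to a queue.
import glob
import subprocess

machine = "skylla"
for card in glob.glob("*.job"):
    subprocess.check_call(["scp", card, "%s:jobs/" % machine])
    subprocess.check_call(["ssh", machine,
                           "cd jobs && qsub -q serial run_gibuu.sh " + card])
}}}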
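The column extraction of '''extract_data.py''' boils down to splitting each line and writing the selected fields to a new file. A minimal sketch with placeholder file names and hard-coded columns:
{{{
# Write columns 1 and 3 (0-based) of a whitespace-separated file
# to a new file, skipping lines that are too short.
columns = (1, 3)
with open("input.dat") as src, open("output.dat", "w") as dst:
    for line in src:
        fields = line.split()
        if len(fields) > max(columns):
            dst.write(" ".join(fields[c] for c in columns) + "\n")
}}}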
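Finally, the ''--doctest'' switch mentioned under "Developing the scripts" follows the standard-library ''doctest'' pattern: examples embedded in the docstrings are executed and compared against the expected output. A minimal sketch (the option handling is an assumption, ''double'' is a dummy function):
{{{
import doctest
import sys

def double(x):
    """Return twice x.

    >>> double(21)
    42
    """
    return 2 * x

if __name__ == "__main__":
    # "module.py --doctest" runs all docstring examples
    if "--doctest" in sys.argv:
        doctest.testmod()
}}}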