      Metajob Plugin Configuration and User Manual

      This document details the configuration and use of the 3G Bridge Metajob feature.

      User Manual

      The Metajob feature of the 3G Bridge enables the user to submit a batch of jobs exactly as they would a single job. Because of this transparency, the Metajob feature can be used even when jobs are submitted to the Bridge through the gLite infrastructure.

      Formally, a Metajob is a 3G Bridge job with an extra input file, called the Metajob file, whose name starts with the prefix '_3gb-metajob'. This extra input file contains the definitions of the sub-jobs of the Metajob.

      The Metajob itself is used as a template for its sub-jobs. The Metajob file contains instructions that modify the current template, and instructions to create sub-jobs from the current template.

      Submitting a Metajob

      To submit a Metajob, first create the Metajob definition, then submit it to the 3G Bridge. Because the information specified in the submission serves as the initial template for the sub-jobs, the example submission is presented first:

      wsclient -m add -e '' \
        -g test -n app \
        -i _3gb-metajob-example= \
        -i alpha.txt= \
        -i beta.txt= \
        -a '--p=1 --in1=alpha.txt --in2=beta.txt' \
        -o result.txt \
        -o stats.txt

      Notice that, aside from the extra input file '_3gb-metajob-example', this is an ordinary job submission. All attributes, including the target queue (grid and algorithm name), are the same as if you were submitting a single job. This submission, excluding the Metajob definition file, is used as the initial template. The current state of the template can be changed in the Metajob definition:

      Arguments=--p=2 --in1=alpha.txt --in2=beta.txt

      These commands can be given multiple times; each occurrence changes the current template, overwriting its previous state. When the current template describes a sub-job we want to submit, one sub-job, or several identical sub-jobs, can be instantiated with the Queue command:

      Queue


      or, for example,

      Queue 10

      Note that:

      1. For the input files, only their location can be redefined. The 'Input=alpha.txt' command sets the location of alpha.txt in the current template (the same applies to beta.txt). No new input files can be defined, and none can be removed from the template. This implies that:
        1. All sub-jobs have to have the same logical set of input files.
        2. Only remote files can be used as input files.
      2. The output file set cannot be changed either.
      3. Remote files can be specified with the BOINC syntax (with MD5 and size).
      4. The Arguments command does not need parentheses.
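
      The template-and-Queue mechanism described above can be illustrated with a small sketch. This is not the Bridge's own parser; it is a hypothetical Python model written only from the rules in this section. The function name expand_metajob, the dictionary layout of the template, and the 'Input=<name>=<location>' line form are assumptions made for illustration, as is the reading that a bare Queue creates exactly one sub-job.

```python
from copy import deepcopy


def expand_metajob(template, definition):
    """Expand a Metajob definition into a list of sub-job descriptions.

    template   -- hypothetical layout: {"args": str, "inputs": {name: location}}
    definition -- text of the Metajob file
    """
    current = deepcopy(template)  # the "current template" starts as the submission
    subjobs = []
    for raw in definition.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank lines and %Minimum/%Maximum are ignored in this sketch
        if line.startswith("Arguments="):
            # Overwrites the previous arguments of the current template.
            current["args"] = line[len("Arguments="):]
        elif line.startswith("Input="):
            # Assumed form: Input=<logical name>=<location>
            name, _, location = line[len("Input="):].partition("=")
            if name not in current["inputs"]:
                # The input file set is fixed; only locations may change.
                raise ValueError(f"cannot add new input file: {name}")
            current["inputs"][name] = location
        elif line.split()[0] == "Queue":
            parts = line.split()
            # Assumption: a bare Queue instantiates one sub-job.
            count = int(parts[1]) if len(parts) > 1 else 1
            subjobs.extend(deepcopy(current) for _ in range(count))
        else:
            raise ValueError(f"unknown command: {line}")
    return subjobs


if __name__ == "__main__":
    template = {
        "args": "--p=1 --in1=alpha.txt --in2=beta.txt",
        "inputs": {"alpha.txt": "location-a", "beta.txt": "location-b"},
    }
    definition = "Queue\nArguments=--p=2 --in1=alpha.txt --in2=beta.txt\nQueue 10\n"
    jobs = expand_metajob(template, definition)
    print(len(jobs), jobs[0]["args"], jobs[-1]["args"])
```

      In this model each Queue takes a snapshot of the current template, so later Arguments or Input commands affect only the sub-jobs queued after them.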

      Controlling the execution of the batch

      As the Metajob is itself a 3G Bridge job, it has a status attribute, which must be determined from the states of its sub-jobs. The trivial case is when all sub-jobs have finished successfully; the status of the Metajob can then be FINISHED. The case where all sub-jobs have failed is equally trivial. But what happens when some of the sub-jobs have finished successfully while others have failed? The 3G Bridge lets the user control its behaviour in these intermediate cases. In the Metajob file, the user can specify the minimum and maximum number of sub-jobs that need to finish successfully.

      The lower limit tells the Bridge that the whole Metajob is to be considered failed, and no output is produced, if fewer than this number of sub-jobs have finished successfully. If the number of failed sub-jobs reaches the point where the lower limit can no longer be reached, the Bridge cancels all pending sub-jobs early, and the Metajob fails immediately. This is useful when only the result of the whole batch is of interest and partial results are not needed. If this limit is set to 1, any successful sub-result will be available after the Metajob has finished.

      The upper limit tells the Bridge that no more than this number of sub-results is needed. If the number of successfully finished sub-jobs reaches this number, the Bridge cancels all pending sub-jobs, and the Metajob finishes immediately. This is useful for introducing redundancy at the user level when the results are interchangeable (for example, in Monte Carlo simulations).
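
      The interplay of the two limits can be summarised in a short decision function. This is a sketch of the rules as described here, not the Bridge's actual implementation. The function name metajob_decision and the returned labels are hypothetical (only FINISHED is a status named in this manual), and the limits are assumed to be already resolved to absolute numbers with minimum not greater than maximum.

```python
def metajob_decision(total, finished, failed, minimum, maximum):
    """Decide the fate of a Metajob from the counts of its sub-jobs.

    total    -- number of sub-jobs defined in the Metajob file
    finished -- sub-jobs that finished successfully so far
    failed   -- sub-jobs that failed so far
    minimum, maximum -- limits as absolute sub-job counts

    Returns one of the placeholder labels "finished", "failed", "running".
    """
    pending = total - finished - failed
    if finished >= maximum:
        # Upper limit reached: pending sub-jobs would be cancelled.
        return "finished"
    if finished + pending < minimum:
        # Even if every pending sub-job succeeded, the lower limit
        # could not be reached: pending sub-jobs would be cancelled.
        return "failed"
    if pending == 0:
        # Nothing left to run, and the lower limit has been met.
        return "finished"
    return "running"


if __name__ == "__main__":
    # With both limits at the default (All = 10), one failure is fatal.
    print(metajob_decision(10, 0, 1, 10, 10))
    # With a lower limit of 5, two failures out of 10 are tolerated.
    print(metajob_decision(10, 3, 2, 5, 10))
```

      With both limits left at All, this reduces to the trivial cases: the Metajob finishes only if every sub-job succeeds and fails as soon as one sub-job fails.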

      These limits are specified in the Metajob file; %Minimum and %Maximum can each be specified once. The lines below show the forms a limit can take: an absolute number, a percentage, or the term All. All and percentages refer to the number of sub-jobs defined in the Metajob file.

      The default value for both limits is All.

      %Minimum 5
      %Minimum 20%
      %Minimum All
      %Maximum 50
      %Maximum 80%
      %Maximum All
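
      To make the three forms concrete, the following sketch turns a limit value into an absolute sub-job count. It is a hypothetical helper, not part of the 3G Bridge. The names resolve_limit and parse_limits are invented for illustration, and rounding percentages up is an assumption, since this manual does not state how fractional results are handled.

```python
import math


def resolve_limit(value, total):
    """Convert a limit value ("5", "20%" or "All") to a sub-job count.

    total -- number of sub-jobs defined in the Metajob file
    """
    value = value.strip()
    if value == "All":
        return total
    if value.endswith("%"):
        # Assumption: fractional results are rounded up.
        return math.ceil(total * float(value[:-1]) / 100)
    return int(value)


def parse_limits(definition, total):
    """Return (minimum, maximum) found in the Metajob file text."""
    # The default value for both limits is All.
    limits = {"%Minimum": total, "%Maximum": total}
    for line in definition.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in limits:
            limits[parts[0]] = resolve_limit(parts[1], total)
    return limits["%Minimum"], limits["%Maximum"]


if __name__ == "__main__":
    text = "%Minimum 20%\n%Maximum 50\nQueue 100\n"
    print(parse_limits(text, 100))
```

      Because All and percentages depend on the number of sub-jobs defined, the total has to be known before the limits can be resolved, which is why it is passed in explicitly here.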

      Example Metajob submission

      The following example shows a correct Metajob submission. The first Queue command creates 100 instances of the initial template, which matches the submission information. The next 100 sub-jobs have the source of alpha.txt changed, with all other attributes unchanged, and so on.

      wsclient -m add -e '' \
        -g test -n app \
        -i _3gb-metajob-example= \
        -i alpha.txt= \
        -i beta.txt= \
        -a '--p=1 --in1=alpha.txt --in2=beta.txt' \
        -o result.txt \
        -o stats.txt
      manual/3gbplugin-metajob.txt · Last modified: 2013/01/18 09:22 by a.visegradi