.. _chapter_benchmarks:

**********
Benchmarks
**********

Performance, and specifically improved application performance, is a main
objective for the existence of RADICAL-Pilot.  To enable users to understand
performance of both RADICAL-Pilot itself and of the applications executed with
RADICAL-Pilot, we provide some utilities for benchmarking and performance
analysis.

.. note::

    Performance profiling is enabled by setting `RADICAL_PILOT_PROFILE` in the
    application environment.  If profiling is enabled, the application can
    request any number of cores on the resource `local.localhost`.

During operation, RADICAL-Pilot stores time stamps of different events and
activities in MongoDB, under the ID of the `radical.pilot.Session`.  That
information can be used for post mortem performance analysis.  To do so, one
needs to specify the session ID to be examined -- you can print the session ID
when running your application, via

.. code-block:: python

    print "session id: %s" % session.uid

With that session ID, you can use the tool `radicalpilot-stats` to print some
statistics, and to plot some performance graphs:

  
.. code-block:: bash

    $ radicalpilot-stats -m plot -s 53b5bbd174df926f4a4d3318

This command will, in the `plot` mode shown above, produce
a `53b5bbd174df926f4a4d3318.png` and a  `53b5bbd174df926f4a4d3318.pdf` plot
(where `53b5bbd174df926f4a4d3318` is the session ID as mentioned.  The same
command has other modi for inspecting sessions -- you can see a help message via

.. code-block:: bash

      $ ./bin/radicalpilot-stats -m help

      usage   : ./bin/radicalpilot-stats -m mode [-d dburl] [-s session]
      example : ./bin/radicalpilot-stats -m stats -d mongodb://localhost/radicalpilot -s 536afe101d41c83696ea0135

      modes :

        help  : show this message
        list  : show  a  list   of sessions in the database
        tree  : show  a  tree   of session objects
        dump  : show  a  tree   of session objects, with full details
        sort  : show  a  list   of session objects, sorted by type
        hist  : show timeline   of session history
        stat  : show statistics of session history (not implemented)
        plot  : save gnuplot representing session history

      The default command is 'list'.  If no session ID is specified, operations
      which apply to a single session will choose the last session in the given
      DB.  The default MongoDB is 'mongodb://ec2-184-72-89-141.compute-1.amazonaws.com:27017/radicalpilot/'
      
An exemplar performance plot is included below.  It represents a number of
events and metrics, represented over a time axis.  In particular, it shows (at
the bottom) the utilization of the various compute cores managed by the pilots
in the session -- if that utilization is showing no major gaps, your
application should make efficient use of the allocated resources.

.. image:: images/rp.benchmark.png
    :width: 600pt
    :align: center

Note that the plotting capability needs an up-to-date installation of gnuoplot
with the cairo-png backend.  For Linux, that can be installed from the usual
package repositories.  For MacOS, the following should take care of the
installation:

.. code-block:: bash
      # Install and configure brew: http://brew.sh/
      # Install xquartz. Download the dmg package from http://xquartz.macosforge.org/landing/
      # From a terminal issue the following commands:
      
      $ brew install cairo
      $ brew install -v gnuplot --pdf --cairo --latex --with-x --wx


********************
Details on Profiling
********************

.. note::

    This section is for developers, and should be disregarded for production
    runs and 'normal' users in general.

RADICAL-Pilot allows to tweak the pilot process behavior in many details, and
specifically allows to artificially increase the load on individual components,
for the purpose of more detailed profiling, and identification of bottlenecks.
With that background, a pilot description supports an additional attribute
`_config`, which accepts a dict of the following structure:

.. code-block:: python

        pdesc = rp.ComputePilotDescription()
        pdesc.resource = "local.localhost"
        pdesc.runtime  = 5 # minutes
        pdesc.cores    = 8
        pdesc.cleanup  = False
        pdesc._config  = {'number_of_workers' : {'StageinWorker'   :  1,
                                                 'ExecWorker'      :  2,
                                                 'StageoutWorker'  :  1,
                                                 'UpdateWorker'    :  1},
                          'blowup_factor'     : {'Agent'           :  1,
                                                 'stagein_queue'   :  1,
                                                 'StageinWorker'   :  1,
                                                 'schedule_queue'  :  1,
                                                 'Scheduler'       :  1,
                                                 'execution_queue' : 10,
                                                 'ExecWorker'      :  1,
                                                 'watch_queue'     :  1,
                                                 'Watcher'         :  1,
                                                 'stageout_queue'  :  1,
                                                 'StageoutWorker'  :  1,
                                                 'update_queue'    :  1,
                                                 'UpdateWorker'    :  1},
                          'drop_clones'       : {'Agent'           :  1,
                                                 'stagein_queue'   :  1,
                                                 'StageinWorker'   :  1,
                                                 'schedule_queue'  :  1,
                                                 'Scheduler'       :  1,
                                                 'execution_queue' :  1,
                                                 'ExecWorker'      :  0,
                                                 'watch_queue'     :  0,
                                                 'Watcher'         :  0,
                                                 'stageout_queue'  :  1,
                                                 'StageoutWorker'  :  1,
                                                 'update_queue'    :  1,
                                                 'UpdateWorker'    :  1}}

That configuration tunes the concurrency of some of the pilot components (here
we use two `ExecWorker` instances to spawn units.  Further, we request that the
number of compute units handled by the `ExecWorker` is 'blown up' (multiplied)
by 10.  This will created 9 near-identical units for every unit which enters
that component, and thus the load increases on that specific component, but not
on any of the previous ones.  Finally, we instruct all components but the
`ExecWorker`, `watch_queue` and `Watcher` to drop the clones again, so that
later components won't see those clones eiter.  We thus strain only a specific
part of the pilot.

Setting these parameters requires some understanding of the pilot architecture.
While in general the application semantics remains unaltered, these parameters
do significantly alter resource consumption.  Also, there do exist invalid
combinations which will cause the agent to fail, specifically it will usually be
invalid to push updates of cloned units to the client module (via MongoDB).

The pilot profiling (as stored in `agent.prof` in the pilot sandbox) will
contain timings for the cloned units.  The unit IDs will be based upon the
original unit IDs, but have an appendix `.clone.0001` etc., depending on the
value of the respective blowup factor.  In general, only one of the
blowup-factors should be larger than one (otherwise the number of units will
grow exponentially, which is probably not what you want).